亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Secure training, while protecting the confidentiality of both data and model weights, typically incurs significant training overhead. Traditional Fully Homomorphic Encryption (FHE)-based non-inter-active training models are heavily burdened by computationally demanding bootstrapping. To develop an efficient secure training system, we established a foundational framework, CryptoTrain-B, utilizing a hybrid cryptographic protocol that merges FHE with Oblivious Transfer (OT) for handling linear and non-linear operations, respectively. This integration eliminates the need for costly bootstrapping. Although CryptoTrain-B sets a new baseline in performance, reducing its training overhead remains essential. We found that ciphertext-ciphertext multiplication (CCMul) is a critical bottleneck in operations involving encrypted inputs and models. Our solution, the CCMul-Precompute technique, involves precomputing CCMul offline and resorting to the less resource-intensive ciphertext-plaintext multiplication (CPMul) during private training. Furthermore, conventional polynomial convolution in FHE systems tends to encode irrelevant and redundant values into polynomial slots, necessitating additional polynomials and ciphertexts for input representation and leading to extra multiplications. Addressing this, we introduce correlated polynomial convolution, which encodes only related input values into polynomials, thus drastically reducing the number of computations and overheads. By integrating CCMul-Precompute and correlated polynomial convolution into CryptoTrain-B, we facilitate a rapid and efficient secure training framework, CryptoTrain. Extensive experiments demonstrate that CryptoTrain achieves a ~5.3X training time reduction compared to prior methods.

相關內容

在(zai)數(shu)(shu)(shu)(shu)學(xue)(特別是(shi)功能分(fen)析(xi))中,卷積是(shi)對兩個(ge)函(han)(han)(han)數(shu)(shu)(shu)(shu)(f和g)的(de)數(shu)(shu)(shu)(shu)學(xue)運算,產(chan)生(sheng)三個(ge)函(han)(han)(han)數(shu)(shu)(shu)(shu),表示第一(yi)個(ge)函(han)(han)(han)數(shu)(shu)(shu)(shu)的(de)形狀如何被(bei)另一(yi)個(ge)函(han)(han)(han)數(shu)(shu)(shu)(shu)修改。 卷積一(yi)詞既指結果函(han)(han)(han)數(shu)(shu)(shu)(shu),又指計(ji)算結果的(de)過程。 它定義(yi)為(wei)兩個(ge)函(han)(han)(han)數(shu)(shu)(shu)(shu)的(de)乘積在(zai)一(yi)個(ge)函(han)(han)(han)數(shu)(shu)(shu)(shu)反轉和移位(wei)后(hou)的(de)積分(fen)。 并針(zhen)對所有shift值(zhi)評估(gu)積分(fen),從而生(sheng)成(cheng)卷積函(han)(han)(han)數(shu)(shu)(shu)(shu)。

Retrieval augmented generation (RAG) models, which integrate large-scale pre-trained generative models with external retrieval mechanisms, have shown significant success in various natural language processing (NLP) tasks. However, applying RAG models in Persian language as a low-resource language, poses distinct challenges. These challenges primarily involve the preprocessing, embedding, retrieval, prompt construction, language modeling, and response evaluation of the system. In this paper, we address the challenges towards implementing a real-world RAG system for Persian language called PersianRAG. We propose novel solutions to overcome these obstacles and evaluate our approach using several Persian benchmark datasets. Our experimental results demonstrate the capability of the PersianRAG framework to enhance question answering task in Persian.

As recommender systems become increasingly prevalent, the environmental impact and energy efficiency of training large-scale models have come under scrutiny. This paper investigates the potential for energy-efficient algorithm performance by optimizing dataset sizes through downsampling techniques in the context of Green Recommender Systems. We conducted experiments on the MovieLens 100K, 1M, 10M, and Amazon Toys and Games datasets, analyzing the performance of various recommender algorithms under different portions of dataset size. Our results indicate that while more training data generally leads to higher algorithm performance, certain algorithms, such as FunkSVD and BiasedMF, particularly with unbalanced and sparse datasets like Amazon Toys and Games, maintain high-quality recommendations with up to a 50% reduction in training data, achieving nDCG@10 scores within approximately 13% of full dataset performance. These findings suggest that strategic dataset reduction can decrease computational and environmental costs without substantially compromising recommendation quality. This study advances sustainable and green recommender systems by providing insights for reducing energy consumption while maintaining effectiveness.

Class incremental learning (CIL) trains a network on sequential tasks with separated categories in each task but suffers from catastrophic forgetting, where models quickly lose previously learned knowledge when acquiring new tasks. The generalized CIL (GCIL) aims to address the CIL problem in a more real-world scenario, where incoming data have mixed data categories and unknown sample size distribution. Existing attempts for the GCIL either have poor performance or invade data privacy by saving exemplars. In this paper, we propose a new exemplar-free GCIL technique named generalized analytic continual learning (GACL). The GACL adopts analytic learning (a gradient-free training technique) and delivers an analytical (i.e., closed-form) solution to the GCIL scenario. This solution is derived via decomposing the incoming data into exposed and unexposed classes, thereby attaining a weight-invariant property, a rare yet valuable property supporting an equivalence between incremental learning and its joint training. Such an equivalence is crucial in GCIL settings as data distributions among different tasks no longer pose challenges to adopting our GACL. Theoretically, this equivalence property is validated through matrix analysis tools. Empirically, we conduct extensive experiments where, compared with existing GCIL methods, our GACL exhibits a consistently leading performance across various datasets and GCIL settings. Source code is available at //github.com/CHEN-YIZHU/GACL.

Diffusion models excel in generating images that closely resemble their training data but are also susceptible to data memorization, raising privacy, ethical, and legal concerns, particularly in sensitive domains such as medical imaging. We hypothesize that this memorization stems from the overparameterization of deep models and propose that regularizing model capacity during fine-tuning can mitigate this issue. Firstly, we empirically show that regulating the model capacity via Parameter-efficient fine-tuning (PEFT) mitigates memorization to some extent, however, it further requires the identification of the exact parameter subsets to be fine-tuned for high-quality generation. To identify these subsets, we introduce a bi-level optimization framework, MemControl, that automates parameter selection using memorization and generation quality metrics as rewards during fine-tuning. The parameter subsets discovered through MemControl achieve a superior tradeoff between generation quality and memorization. For the task of medical image generation, our approach outperforms existing state-of-the-art memorization mitigation strategies by fine-tuning as few as 0.019% of model parameters. Moreover, we demonstrate that the discovered parameter subsets are transferable to non-medical domains. Our framework is scalable to large datasets, agnostic to reward functions, and can be integrated with existing approaches for further memorization mitigation. To the best of our knowledge, this is the first study to empirically evaluate memorization in medical images and propose a targeted yet universal mitigation strategy. The code is available at //github.com/Raman1121/Diffusion_Memorization_HPO

Synthetic data generation has become an increasingly popular way of training models without the need for large, manually labeled datasets. For tasks like text embedding, synthetic data offers diverse and scalable training examples, significantly reducing the cost of human annotation. However, most current approaches rely heavily on proprietary models like GPT-4, which are expensive and inefficient for generating large-scale embedding data. In this paper, we introduce SPEED, a framework that aligns open-source small models (8B) to efficiently generate large-scale synthetic embedding data. Through supervised fine-tuning, preference optimization, and self-improvement, SPEED enables small open-source models to produce high-quality data. Remarkably, SPEED uses only less than 1/10 of the GPT API calls, outperforming the state-of-the-art embedding model E5_mistral when both are trained solely on their synthetic data. Using this efficient generator, we conduct a comprehensive study on how various factors within the alignment pipeline impact data quality and reveal the scaling law for synthetic embedding data.

Optimizing the reaction to network events, which is critical in tasks such as clock synchronization, multicast, and routing, becomes increasingly challenging as networks grow larger. To improve the reaction time compared to centralized solutions, the theory community has made significant progress in the design of message-passing algorithms that leverage all nodes for distributed computation, and the advent of programmable switches makes it now possible to materialize them. We propose FRANCIS, a framework and associated libraries for running message-passing algorithms on programmable switches. It features primitives that allow easy integration of such algorithms for quickly reacting to network events while optimizing resource consumption. We use FRANCIS to implement event reaction solutions that improve clock synchronization, source-routed multicast, and routing and demonstrate up to 18x reduction in reaction time.

Despite extensive research on adversarial training strategies to improve robustness, the decisions of even the most robust deep learning models can still be quite sensitive to imperceptible perturbations, creating serious risks when deploying them for high-stakes real-world applications. While detecting such cases may be critical, evaluating a model's vulnerability at a per-instance level using adversarial attacks is computationally too intensive and unsuitable for real-time deployment scenarios. The input space margin is the exact score to detect non-robust samples and is intractable for deep neural networks. This paper introduces the concept of margin consistency -- a property that links the input space margins and the logit margins in robust models -- for efficient detection of vulnerable samples. First, we establish that margin consistency is a necessary and sufficient condition to use a model's logit margin as a score for identifying non-robust samples. Next, through comprehensive empirical analysis of various robustly trained models on CIFAR10 and CIFAR100 datasets, we show that they indicate high margin consistency with a strong correlation between their input space margins and the logit margins. Then, we show that we can effectively and confidently use the logit margin to detect brittle decisions with such models. Finally, we address cases where the model is not sufficiently margin-consistent by learning a pseudo-margin from the feature representation. Our findings highlight the potential of leveraging deep representations to assess adversarial vulnerability in deployment scenarios efficiently.

Generative models, as an important family of statistical modeling, target learning the observed data distribution via generating new instances. Along with the rise of neural networks, deep generative models, such as variational autoencoders (VAEs) and generative adversarial network (GANs), have made tremendous progress in 2D image synthesis. Recently, researchers switch their attentions from the 2D space to the 3D space considering that 3D data better aligns with our physical world and hence enjoys great potential in practice. However, unlike a 2D image, which owns an efficient representation (i.e., pixel grid) by nature, representing 3D data could face far more challenges. Concretely, we would expect an ideal 3D representation to be capable enough to model shapes and appearances in details, and to be highly efficient so as to model high-resolution data with fast speed and low memory cost. However, existing 3D representations, such as point clouds, meshes, and recent neural fields, usually fail to meet the above requirements simultaneously. In this survey, we make a thorough review of the development of 3D generation, including 3D shape generation and 3D-aware image synthesis, from the perspectives of both algorithms and more importantly representations. We hope that our discussion could help the community track the evolution of this field and further spark some innovative ideas to advance this challenging task.

Language model pre-training, such as BERT, has significantly improved the performances of many natural language processing tasks. However, pre-trained language models are usually computationally expensive and memory intensive, so it is difficult to effectively execute them on some resource-restricted devices. To accelerate inference and reduce model size while maintaining accuracy, we firstly propose a novel transformer distillation method that is a specially designed knowledge distillation (KD) method for transformer-based models. By leveraging this new KD method, the plenty of knowledge encoded in a large teacher BERT can be well transferred to a small student TinyBERT. Moreover, we introduce a new two-stage learning framework for TinyBERT, which performs transformer distillation at both the pre-training and task-specific learning stages. This framework ensures that TinyBERT can capture both the general-domain and task-specific knowledge of the teacher BERT. TinyBERT is empirically effective and achieves comparable results with BERT in GLUE datasets, while being 7.5x smaller and 9.4x faster on inference. TinyBERT is also significantly better than state-of-the-art baselines, even with only about 28% parameters and 31% inference time of baselines.

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, XLNet outperforms BERT on 20 tasks, often by a large margin, and achieves state-of-the-art results on 18 tasks including question answering, natural language inference, sentiment analysis, and document ranking.

北京阿比特科技有限公司