
In a real federated learning (FL) system, the communication overhead of passing model parameters between the clients and the parameter server (PS) is often a bottleneck. Hierarchical federated learning (HFL), which places multiple edge servers (ESs) between the clients and the PS, can partially alleviate this communication pressure but still requires the PS to aggregate model parameters from multiple ESs. To further reduce communication overhead, we bring sequential FL (SFL) into HFL for the first time. This removes the central PS and allows model training to be completed solely by passing the global model between two adjacent ESs in each iteration. We propose a novel algorithm adapted to this combined framework, referred to as Fed-CHS. Convergence results are derived for strongly convex and non-convex loss functions under various data heterogeneity setups, and show convergence performance comparable to that of algorithms for HFL or SFL alone. Experimental results provide evidence of the superiority of the proposed Fed-CHS over baseline methods in both communication overhead savings and test accuracy.
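To make the idea of passing one global model between adjacent edge servers concrete, here is a minimal sketch. It is not the Fed-CHS algorithm itself: the ring visiting order, the scalar toy loss, the step size, and all function names (`local_gradient`, `cluster_round`, `sequential_hierarchical_training`) are assumptions made only for illustration.

```python
def local_gradient(w, data):
    # Gradient of the toy loss 0.5 * mean((w - x)^2) over one client's samples.
    return sum(w - x for x in data) / len(data)


def cluster_round(w, clients, lr=0.1, local_steps=5):
    # One edge server averages its clients' gradients and updates the model
    # for a few steps. No central parameter server is involved.
    for _ in range(local_steps):
        grad = sum(local_gradient(w, c) for c in clients) / len(clients)
        w = w - lr * grad
    return w


def sequential_hierarchical_training(clusters, rounds=200, lr=0.1, local_steps=5):
    # The single global model is handed from one edge server to the next.
    # Visiting the servers in a fixed ring order is an assumption of this sketch.
    w = 0.0
    for t in range(rounds):
        edge_server_clients = clusters[t % len(clusters)]
        w = cluster_round(w, edge_server_clients, lr, local_steps)
    return w


# Three hypothetical edge servers, each with its own clients (lists of samples).
clusters = [
    [[1.0, 2.0], [3.0]],
    [[4.0, 5.0]],
    [[6.0], [7.0, 8.0]],
]
w_final = sequential_hierarchical_training(clusters)
print(w_final)
```

With a constant step size and heterogeneous clusters, the iterate in this toy setup keeps moving within a neighborhood of the per-cluster optima rather than settling on a single point, which is the kind of effect a convergence analysis under data heterogeneity has to account for.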

Related content

Federated Learning


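Federated learning is an emerging foundational artificial intelligence technology. It was first proposed by Google in 2016, originally to solve the problem of updating models locally for end users of Android phones. Its design goal is to carry out efficient machine learning among multiple participants or computing nodes while ensuring information security during big-data exchange, protecting terminal data and personal data privacy, and ensuring legal compliance. The machine learning algorithms that federated learning can use are not limited to neural networks; they also include important algorithms such as random forests. Federated learning is expected to become the foundation of the next generation of collaborative AI algorithms and collaborative networks.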

Tensor · Analysis · Matrix Theory · Interpretability · Machine Learning

September 28, 2024

Topological Eigenvalue Theorems for Tensor Analysis in Multi-Modal Data Fusion

Ronald Katende

This paper presents a novel framework for tensor eigenvalue analysis in the context of multi-modal data fusion, leveraging topological invariants such as Betti numbers. Traditional approaches to tensor eigenvalue analysis often extend matrix theory, whereas this work introduces a topological perspective to enhance the understanding of tensor structures. By establishing new theorems that link eigenvalues to topological features, the proposed framework provides deeper insights into the latent structure of data, improving both interpretability and robustness. Applications in data fusion demonstrate the theoretical and practical significance of this approach, with potential for broad impact in machine learning and data science.

MoDELS · Learning · Inference · Machine Learning · Performer

September 26, 2024

Exponential Quantum Communication Advantage in Distributed Inference and Learning

Dar Gilboa,Hagay Michaeli,Daniel Soudry,Jarrod R. McClean

Training and inference with large machine learning models that far exceed the memory capacity of individual devices necessitate the design of distributed architectures, forcing one to contend with communication constraints. We present a framework for distributed computation over a quantum network in which data is encoded into specialized quantum states. We prove that for models within this framework, inference and training using gradient descent can be performed with exponentially less communication compared to their classical analogs, and with relatively modest overhead relative to standard gradient-based methods. We show that certain graph neural networks are particularly amenable to implementation within this framework, and moreover present empirical evidence that they perform well on standard benchmarks. To our knowledge, this is the first example of exponential quantum advantage for a generic class of machine learning problems that holds regardless of the data encoding cost. Moreover, we show that models in this class can encode highly nonlinear features of their inputs, and their expressivity increases exponentially with model depth. We also delineate the space of models for which exponential communication advantages hold by showing that they cannot hold for linear classification. Our results can be combined with natural privacy advantages in the communicated quantum states that limit the amount of information that can be extracted from them about the data and model parameters. Taken as a whole, these findings form a promising foundation for distributed machine learning over quantum networks.

contrastive · MoDELS · Language Modeling · Knowledge · Learning

September 26, 2024

Contrastive Learning for Knowledge-Based Question Generation in Large Language Models

Zhenhong Zhang,Jiajing Chen,Weiyan Shi,Lingjie Yi,Chihang Wang,Qian Yu

from arxiv, 5 pages, 2 figures

With the rapid development of artificial intelligence technology, especially the increasingly widespread application of question-and-answer systems, high-quality question generation has become a key component in supporting the development of these systems. This article focuses on knowledge-based question generation technology, which aims to enable computers to simulate the human questioning process based on understanding specific texts or knowledge bases. In light of the issues of hallucination and knowledge gaps present in large-scale language models when applied to knowledge-intensive tasks, this paper proposes an enhanced question generation method that incorporates contrastive learning. This method utilizes multiple models to jointly mine domain knowledge and uses contrastive learning to guide the model in reducing noise and hallucinations in generation. Experimental results show that by designing prompts containing contrasting examples, the model's performance in question generation improves considerably, particularly when contrasting instructions and examples are used simultaneously, leading to the highest quality of generated questions and improved accuracy. These results demonstrate that the method proposed in this study, which combines contrasting context and chain-of-thought prompts, can effectively improve both the quality and the practicality of question generation.

Learning · Transformation · MoDELS · Perceptron · Context

September 26, 2024

MLPs Learn In-Context on Regression and Classification Tasks

William L. Tong,Cengiz Pehlevan

from arxiv, 30 pages, 10 figures, code available at //github.com/wtong98/mlp-icl

In-context learning (ICL), the remarkable ability to solve a task from only input exemplars, is often assumed to be a unique hallmark of Transformer models. By examining commonly employed synthetic ICL tasks, we demonstrate that multi-layer perceptrons (MLPs) can also learn in-context. Moreover, MLPs, and the closely related MLP-Mixer models, learn in-context competitively with Transformers given the same compute budget in this setting. We further show that MLPs outperform Transformers on a series of classical tasks from psychology designed to test relational reasoning, which are closely related to in-context classification. These results underscore a need for studying in-context learning beyond attention-based architectures, while also challenging strong prior arguments about MLPs' limited ability to solve relational tasks. Altogether, our results highlight the unexpected competence of MLPs, and support the growing interest in all-MLP alternatives to task-specific architectures.

Layer · Learning · Reviewer · MoDELS · Federated Learning

September 26, 2024

Exploring Selective Layer Fine-Tuning in Federated Learning

Yuchang Sun,Yuexiang Xie,Bolin Ding,Yaliang Li,Jun Zhang

Federated learning (FL) has emerged as a promising paradigm for fine-tuning foundation models using distributed data in a privacy-preserving manner. Under limited computational resources, clients often find it more practical to fine-tune a selected subset of layers, rather than the entire model, based on their task-specific data. In this study, we provide a thorough theoretical exploration of selective layer fine-tuning in FL, emphasizing a flexible approach that allows the clients to adjust their selected layers according to their local data and resources. We theoretically demonstrate that the layer selection strategy has a significant impact on model convergence in two critical aspects: the importance of selected layers and the heterogeneous choices across clients. Drawing from these insights, we further propose a strategic layer selection method that utilizes local gradients and regulates layer selections across clients. The extensive experiments on both image and text datasets demonstrate the effectiveness of the proposed strategy compared with several baselines, highlighting its advances in identifying critical layers that adapt to the client heterogeneity and training dynamics in FL.
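As a rough illustration of what aggregation can look like when clients fine-tune different subsets of layers, the sketch below averages each layer only over the clients that selected it. This per-layer averaging rule, the dictionary-based model layout, and the name `aggregate_selected_layers` are assumptions for illustration; the abstract does not specify the paper's aggregation or selection rule.

```python
def aggregate_selected_layers(global_model, client_updates):
    """Average each layer over the clients that fine-tuned it.

    global_model:   dict mapping layer name -> list of floats
    client_updates: list of dicts with the same layout, each containing
                    only the layers that client chose to fine-tune
    """
    new_model = {}
    for name, weights in global_model.items():
        contributions = [u[name] for u in client_updates if name in u]
        if not contributions:
            # No client selected this layer, so it stays unchanged.
            new_model[name] = list(weights)
            continue
        new_model[name] = [
            sum(values) / len(contributions) for values in zip(*contributions)
        ]
    return new_model


global_model = {"embed": [0.0, 0.0], "block1": [1.0, 1.0], "head": [2.0]}
client_updates = [
    {"head": [3.0]},                         # client 0 tunes only the head
    {"block1": [2.0, 0.0], "head": [5.0]},   # client 1 tunes two layers
]
updated = aggregate_selected_layers(global_model, client_updates)
print(updated)
```

In this toy setup a layer chosen by few clients is updated from fewer contributions, which hints at why heterogeneous layer choices across clients can matter for convergence.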

Networking · MoDELS · Learning · DirectShow · Performer

September 25, 2024

A Hierarchical Gradient Tracking Algorithm for Mitigating Subnet-Drift in Fog Learning Networks

Evan Chen,Shiqiang Wang,Christopher G. Brinton

from arxiv, This paper is under review in IEEE/ACM Transactions on Networking

Federated learning (FL) encounters scalability challenges when implemented over fog networks that do not follow FL's conventional star topology architecture. Semi-decentralized FL (SD-FL) has proposed a solution for device-to-device (D2D) enabled networks that divides model cooperation into two stages: at the lower stage, D2D communication is employed for local model aggregations within subnetworks (subnets), while the upper stage handles device-server (DS) communications for global model aggregations. However, existing SD-FL schemes are based on gradient diversity assumptions that become performance bottlenecks as data distributions become more heterogeneous. In this work, we develop semi-decentralized gradient tracking (SD-GT), the first SD-FL methodology that removes the need for such assumptions by incorporating tracking terms into device updates for each communication layer. Our analytical characterization of SD-GT reveals upper bounds on convergence for non-convex, convex, and strongly-convex problems. We show how the bounds enable the development of an optimization algorithm that navigates the performance-efficiency trade-off by tuning subnet sampling rate and D2D rounds for each global training interval. Our subsequent numerical evaluations demonstrate that SD-GT obtains substantial improvements in trained model quality and communication cost relative to baselines in SD-FL and gradient tracking on several datasets.
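For readers unfamiliar with tracking terms, the following sketch shows classical single-level decentralized gradient tracking on a toy problem. It is not SD-GT: there is no subnet/server hierarchy, and the quadratic losses, the mixing matrix `W`, the step size, and the function names are assumptions chosen for illustration.

```python
def mix(W, values):
    # One round of neighbor averaging with a doubly stochastic matrix W.
    n = len(values)
    return [sum(W[i][j] * values[j] for j in range(n)) for i in range(n)]


def gradient_tracking(targets, W, lr=0.1, steps=300):
    # Device i holds the toy loss f_i(x) = 0.5 * (x - targets[i])^2.
    n = len(targets)

    def grad(i, x):
        return x - targets[i]

    x = [0.0] * n
    g = [grad(i, x[i]) for i in range(n)]
    y = list(g)  # tracking variable, initialized to the local gradient
    for _ in range(steps):
        x_mixed = mix(W, x)
        x_new = [x_mixed[i] - lr * y[i] for i in range(n)]
        g_new = [grad(i, x_new[i]) for i in range(n)]
        y_mixed = mix(W, y)
        # The correction g_new - g lets y follow the network-wide average
        # gradient even though each device only sees its own data.
        y = [y_mixed[i] + g_new[i] - g[i] for i in range(n)]
        x, g = x_new, g_new
    return x


W = [
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]
targets = [1.0, 4.0, 10.0]  # heterogeneous local optima
x_final = gradient_tracking(targets, W)
print(x_final)
```

In this example every device approaches the minimizer of the average loss, which is the mean of the targets, even though the local optima differ widely.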

Generalization Theory · Performer · MoDELS · Learning · Extensibility

September 25, 2024

Benchmarking Domain Generalization Algorithms in Computational Pathology

Neda Zamanitajeddin,Mostafa Jahanifar,Kesi Xu,Fouzia Siraj,Nasir Rajpoot

Deep learning models have shown immense promise in computational pathology (CPath) tasks, but their performance often suffers when applied to unseen data due to domain shifts. Addressing this requires domain generalization (DG) algorithms. However, a systematic evaluation of DG algorithms in the CPath context is lacking. This study aims to benchmark the effectiveness of 30 DG algorithms on 3 CPath tasks of varying difficulty through 7,560 cross-validation runs. We evaluate these algorithms using a unified and robust platform, incorporating modality-specific techniques and recent advances like pretrained foundation models. Our extensive cross-validation experiments provide insights into the relative performance of various DG strategies. We observe that self-supervised learning and stain augmentation consistently outperform other methods, highlighting the potential of pretrained models and data augmentation. Furthermore, we introduce a new pan-cancer tumor detection dataset (HISTOPANTUM) as a benchmark for future research. This study offers valuable guidance to researchers in selecting appropriate DG approaches for CPath tasks.

Language Modeling · Large Language Models · MoDELS · Integration · Model Evaluation

April 17, 2024

A Survey on Retrieval-Augmented Text Generation for Large Language Models

Yizheng Huang,Jimmy Huang

from arxiv, Ongoing work

Retrieval-Augmented Generation (RAG) merges retrieval methods with deep learning advancements to address the static limitations of large language models (LLMs) by enabling the dynamic integration of up-to-date external information. This methodology, focusing primarily on the text domain, provides a cost-effective solution to the generation of plausible but incorrect responses by LLMs, thereby enhancing the accuracy and reliability of their outputs through the use of real-world data. As RAG grows in complexity and incorporates multiple concepts that can influence its performance, this paper organizes the RAG paradigm into four categories: pre-retrieval, retrieval, post-retrieval, and generation, offering a detailed perspective from the retrieval viewpoint. It outlines RAG's evolution and discusses the field's progression through the analysis of significant studies. Additionally, the paper introduces evaluation methods for RAG, addressing the challenges faced and proposing future research directions. By offering an organized framework and categorization, the study aims to consolidate existing research on RAG, clarify its technological underpinnings, and highlight its potential to broaden the adaptability and applications of LLMs.
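The four-stage organization named in the abstract (pre-retrieval, retrieval, post-retrieval, generation) can be sketched as a toy pipeline. Everything below is an assumption for illustration: the keyword-overlap scoring, the stop-word list, the sample corpus, and the template that stands in for an actual LLM call.

```python
import re

STOP_WORDS = {"what", "is", "a", "an", "the", "of"}


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def pre_retrieval(query):
    # Query preparation: normalize and drop uninformative words.
    return [t for t in tokenize(query) if t not in STOP_WORDS]


def retrieval(terms, corpus, k=2):
    # Score each passage by how many query terms it contains.
    scored = [(sum(t in set(tokenize(doc)) for t in terms), doc) for doc in corpus]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def post_retrieval(passages, max_chars=200):
    # Deduplicate while keeping order, then truncate long passages.
    unique = list(dict.fromkeys(passages))
    return [p[:max_chars] for p in unique]


def generation(query, passages):
    # Stand-in for an LLM call: only builds the augmented prompt.
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


corpus = [
    "Federated learning trains models across clients without sharing raw data.",
    "Retrieval-augmented generation adds external documents to the prompt.",
    "Stain augmentation is used in computational pathology.",
]
query = "What is retrieval-augmented generation?"
passages = post_retrieval(retrieval(pre_retrieval(query), corpus))
prompt = generation(query, passages)
print(prompt)
```

A real system would replace the overlap score with a sparse or dense retriever and send the prompt to a language model; the stage boundaries are the point of the sketch.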

Data Augmentation · Taxonomy · Text Classification · Machine Learning · Training Data

July 7, 2021

A Survey on Data Augmentation for Text Classification

Markus Bayer,Marc-André Kaufhold,Christian Reuter

from arxiv, 35 pages, 6 figures, 8 tables

Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing the generalization capabilities of a model, it can also address many other challenges and problems, from overcoming a limited amount of training data, through regularizing the objective, to limiting the amount of data used to protect privacy. Based on a precise description of the goals and applications of data augmentation (C1) and a taxonomy for existing works (C2), this survey is concerned with data augmentation methods for textual classification and aims to achieve a concise and comprehensive overview for researchers and practitioners (C3). Derived from the taxonomy, we divided more than 100 methods into 12 different groupings and provide state-of-the-art references expounding which methods are highly promising (C4). Finally, research perspectives that may constitute a building block for future work are given (C5).
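As one small example of creating training data by transformations, the sketch below applies two simple token-level operations, random swap and random deletion, to a labeled sentence. These two operations, the deletion probability, and the function names are assumptions picked for illustration; the abstract does not list the specific methods the survey covers.

```python
import random


def random_swap(tokens, rng):
    # Exchange two randomly chosen token positions.
    tokens = list(tokens)
    if len(tokens) >= 2:
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, rng, p=0.2):
    # Drop each token with probability p, but never return an empty sentence.
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() > p]
    return kept or [rng.choice(tokens)]


def augment(text, label, n_copies=3, seed=0):
    # Produce n_copies transformed variants that keep the original label.
    rng = random.Random(seed)
    tokens = text.split()
    augmented = []
    for _ in range(n_copies):
        transform = rng.choice([random_swap, random_deletion])
        augmented.append((" ".join(transform(tokens, rng)), label))
    return augmented


original = "the service was quick and friendly"
samples = augment(original, "positive")
for sentence, label in samples:
    print(label, "|", sentence)
```

Both operations assume that the label survives the transformation, which is a judgment call that depends on the task.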

Distillation · MoDELS · Federated Learning · Learned · Inductive Bias

June 9, 2021

Data-Free Knowledge Distillation for Heterogeneous Federated Learning

Zhuangdi Zhu,Junyuan Hong,Jiayu Zhou

Federated Learning (FL) is a decentralized machine-learning paradigm, in which a global server iteratively averages the model parameters of local users without accessing their data. User heterogeneity has imposed significant challenges to FL, which can incur drifted global models that are slow to converge. Knowledge Distillation has recently emerged to tackle this issue, by refining the server model using aggregated knowledge from heterogeneous users, rather than directly averaging their model parameters. This approach, however, depends on a proxy dataset, making it impractical unless such a prerequisite is satisfied. Moreover, the ensemble knowledge is not fully utilized to guide local model learning, which may in turn affect the quality of the aggregated model. Inspired by the prior art, we propose a data-free knowledge distillation approach to address heterogeneous FL, where the server learns a lightweight generator to ensemble user information in a data-free manner, which is then broadcast to users, regulating local training using the learned knowledge as an inductive bias. Empirical studies powered by theoretical implications show that our approach facilitates FL with better generalization performance using fewer communication rounds, compared with the state-of-the-art.
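The baseline step the abstract starts from, a server averaging the parameters of local users, can be sketched in a few lines. Weighting each user by its number of samples follows the common FedAvg convention and is an assumption here, as are the function name `fedavg` and the flat-list parameter layout. The sketch does not cover the proposed generator-based distillation.

```python
def fedavg(client_params, client_sizes):
    # Sample-size-weighted average of the clients' parameter vectors.
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[k] * n for params, n in zip(client_params, client_sizes)) / total
        for k in range(dim)
    ]


# Two hypothetical users with different amounts of local data.
client_params = [[1.0, 2.0], [3.0, 6.0]]
client_sizes = [1, 3]
global_params = fedavg(client_params, client_sizes)
print(global_params)
```

When the users' data distributions differ, their parameter vectors can drift far apart, and a plain average like this one may land on a model that suits none of them well, which is the problem the distillation-based approaches target.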