Federated Learning is a distributed machine learning environment, which ensures that clients complete collaborative training without sharing private data, only by exchanging parameters. However, the data does not satisfy the same distribution and the computing resources of clients are different, which brings challenges to the related research. To better solve the above heterogeneous problems, we designed a novel federated learning method. The local model consists of the pre-trained model as the backbone and fully connected layers as the head. The backbone can extract features for the head, and the embedding vector of classes is shared between clients to optimize the head so that the local model can perform better. By sharing the embedding vector of classes, instead of parameters based on gradient space, clients can better adapt to private data, and it is more efficient in the communication between the server and clients. To better protect privacy, we proposed a privacy-preserving hybrid method to add noise to the embedding vector of classes, which has less impact on the local model performance under the premise of satisfying differential privacy. We conduct a comprehensive evaluation with other federated learning methods on the self-built vehicle dataset under non-independent identically distributed(Non-IID)
Over-the-air federated edge learning (Air-FEEL) is a communication-efficient framework for distributed machine learning using training data distributed at edge devices. This framework enables all edge devices to transmit model updates simultaneously over the entire available bandwidth, allowing for over-the-air aggregation. A one-bit digital over-the-air aggregation (OBDA) scheme has been recently proposed, featuring one-bit gradient quantization at edge devices and majority-voting based decoding at the edge server. However, the low-resolution one-bit gradient quantization slows down the model convergence and leads to performance degradation. On the other hand, the aggregation errors caused by fading channels in Air-FEEL is still remained to be solved. To address these issues, we propose the error-feedback one-bit broadband digital aggregation (EFOBDA) and an optimized power control policy. To this end, we first provide a theoretical analysis to evaluate the impact of error feedback on the convergence of FL with EFOBDA. The analytical results show that, by setting an appropriate feedback strength, EFOBDA is comparable to the Air-FEEL without quantization, thus enhancing the performance of OBDA. Then, we further introduce a power control policy by maximizing the convergence rate under instantaneous power constraints. The convergence analysis and optimized power control policy are verified by the experiments, which show that the proposed scheme achieves significantly faster convergence and higher test accuracy in image classification tasks compared with the one-bit quantization scheme without error feedback or optimized power control policy.
Federated Learning (FL) enables machine learning model training on distributed edge devices by aggregating local model updates rather than local data. However, privacy concerns arise as the FL server's access to local model updates can potentially reveal sensitive personal information by performing attacks like gradient inversion recovery. To address these concerns, privacy-preserving methods, such as Homomorphic Encryption (HE)-based approaches, have been proposed. Despite HE's post-quantum security advantages, its applications suffer from impractical overheads. In this paper, we present FedML-HE, the first practical system for efficient HE-based secure federated aggregation that provides a user/device-friendly deployment platform. FL-HE utilizes a novel universal overhead optimization scheme, significantly reducing both computation and communication overheads during deployment while providing customizable privacy guarantees. Our optimized system demonstrates considerable overhead reduction, particularly for large models (e.g., ~10x reduction for HE-federated training of ResNet-50 and ~40x reduction for BERT), demonstrating the potential for scalable HE-based FL deployment.
The traditional framework of federated learning (FL) requires each client to re-train their models in every iteration, making it infeasible for resource-constrained mobile devices to train deep-learning (DL) models. Split learning (SL) provides an alternative by using a centralized server to offload the computation of activations and gradients for a subset of the model but suffers from problems of slow convergence and lower accuracy. In this paper, we implement PFSL, a new framework of distributed split learning where a large number of thin clients perform transfer learning in parallel, starting with a pre-trained DL model without sharing their data or labels with a central server. We implement a lightweight step of personalization of client models to provide high performance for their respective data distributions. Furthermore, we evaluate performance fairness amongst clients under a work fairness constraint for various scenarios of non-i.i.d. data distributions and unequal sample sizes. Our accuracy far exceeds that of current SL algorithms and is very close to that of centralized learning on several real-life benchmarks. It has a very low computation cost compared to FL variants and promises to deliver the full benefits of DL to extremely thin, resource-constrained clients.
We propose Byzantine-robust federated learning protocols with nearly optimal statistical rates. In contrast to prior work, our proposed protocols improve the dimension dependence and achieve a tight statistical rate in terms of all the parameters for strongly convex losses. We benchmark against competing protocols and show the empirical superiority of the proposed protocols. Finally, we remark that our protocols with bucketing can be naturally combined with privacy-guaranteeing procedures to introduce security against a semi-honest server. The code for evaluation is provided in //github.com/wanglun1996/secure-robust-federated-learning.
Both Byzantine resilience and communication efficiency have attracted tremendous attention recently for their significance in edge federated learning. However, most existing algorithms may fail when dealing with real-world irregular data that behaves in a heavy-tailed manner. To address this issue, we study the stochastic convex and non-convex optimization problem for federated learning at edge and show how to handle heavy-tailed data while retaining the Byzantine resilience, communication efficiency and the optimal statistical error rates simultaneously. Specifically, we first present a Byzantine-resilient distributed gradient descent algorithm that can handle the heavy-tailed data and meanwhile converge under the standard assumptions. To reduce the communication overhead, we further propose another algorithm that incorporates gradient compression techniques to save communication costs during the learning process. Theoretical analysis shows that our algorithms achieve order-optimal statistical error rate in presence of Byzantine devices. Finally, we conduct extensive experiments on both synthetic and real-world datasets to verify the efficacy of our algorithms.
Vertical federated learning (VFL) is a distributed learning paradigm, where computing clients collectively train a model based on the partial features of the same set of samples they possess. Current research on VFL focuses on the case when samples are independent, but it rarely addresses an emerging scenario when samples are interrelated through a graph. For graph-structured data, graph neural networks (GNNs) are competitive machine learning models, but a naive implementation in the VFL setting causes a significant communication overhead. Moreover, the analysis of the training is faced with a challenge caused by the biased stochastic gradients. In this paper, we propose a model splitting method that splits a backbone GNN across the clients and the server and a communication-efficient algorithm, GLASU, to train such a model. GLASU adopts lazy aggregation and stale updates to skip aggregation when evaluating the model and skip feature exchanges during training, greatly reducing communication. We offer a theoretical analysis and conduct extensive numerical experiments on real-world datasets, showing that the proposed algorithm effectively trains a GNN model, whose performance matches that of the backbone GNN when trained in a centralized manner.
Federated Learning is a distributed machine-learning environment that allows clients to learn collaboratively without sharing private data. This is accomplished by exchanging parameters. However, the differences in data distributions and computing resources among clients make related studies difficult. To address these heterogeneous problems, we propose a novel Federated Learning method. Our method utilizes a pre-trained model as the backbone of the local model, with fully connected layers comprising the head. The backbone extracts features for the head, and the embedding vector of classes is shared between clients to improve the head and enhance the performance of the local model. By sharing the embedding vector of classes instead of gradient-based parameters, clients can better adapt to private data, and communication between the server and clients is more effective. To protect privacy, we propose a privacy-preserving hybrid method that adds noise to the embedding vector of classes. This method has a minimal effect on the performance of the local model when differential privacy is met. We conduct a comprehensive evaluation of our approach on a self-built vehicle dataset, comparing it with other Federated Learning methods under non-independent identically distributed(Non-IID).
Federated learning (FL) has been proposed to protect data privacy and virtually assemble the isolated data silos by cooperatively training models among organizations without breaching privacy and security. However, FL faces heterogeneity from various aspects, including data space, statistical, and system heterogeneity. For example, collaborative organizations without conflict of interest often come from different areas and have heterogeneous data from different feature spaces. Participants may also want to train heterogeneous personalized local models due to non-IID and imbalanced data distribution and various resource-constrained devices. Therefore, heterogeneous FL is proposed to address the problem of heterogeneity in FL. In this survey, we comprehensively investigate the domain of heterogeneous FL in terms of data space, statistical, system, and model heterogeneity. We first give an overview of FL, including its definition and categorization. Then, We propose a precise taxonomy of heterogeneous FL settings for each type of heterogeneity according to the problem setting and learning objective. We also investigate the transfer learning methodologies to tackle the heterogeneity in FL. We further present the applications of heterogeneous FL. Finally, we highlight the challenges and opportunities and envision promising future research directions toward new framework design and trustworthy approaches.
Federated Learning (FL) is a decentralized machine-learning paradigm, in which a global server iteratively averages the model parameters of local users without accessing their data. User heterogeneity has imposed significant challenges to FL, which can incur drifted global models that are slow to converge. Knowledge Distillation has recently emerged to tackle this issue, by refining the server model using aggregated knowledge from heterogeneous users, other than directly averaging their model parameters. This approach, however, depends on a proxy dataset, making it impractical unless such a prerequisite is satisfied. Moreover, the ensemble knowledge is not fully utilized to guide local model learning, which may in turn affect the quality of the aggregated model. Inspired by the prior art, we propose a data-free knowledge distillation} approach to address heterogeneous FL, where the server learns a lightweight generator to ensemble user information in a data-free manner, which is then broadcasted to users, regulating local training using the learned knowledge as an inductive bias. Empirical studies powered by theoretical implications show that, our approach facilitates FL with better generalization performance using fewer communication rounds, compared with the state-of-the-art.
Federated learning enables multiple parties to collaboratively train a machine learning model without communicating their local data. A key challenge in federated learning is to handle the heterogeneity of local data distribution across parties. Although many studies have been proposed to address this challenge, we find that they fail to achieve high performance in image datasets with deep learning models. In this paper, we propose MOON: model-contrastive federated learning. MOON is a simple and effective federated learning framework. The key idea of MOON is to utilize the similarity between model representations to correct the local training of individual parties, i.e., conducting contrastive learning in model-level. Our extensive experiments show that MOON significantly outperforms the other state-of-the-art federated learning algorithms on various image classification tasks.