
Privacy regulations, such as GDPR, impose transparency and security as design pillars for data processing algorithms. In this context, federated learning is one of the most influential frameworks for privacy-preserving distributed machine learning, achieving astounding results in many natural language processing and computer vision tasks. Several federated learning frameworks employ differential privacy to prevent private data leakage to unauthorized parties and malicious attackers. Many studies, however, highlight the vulnerabilities of standard federated learning to poisoning and inference attacks, raising concerns about potential risks for sensitive data. To address this issue, we present SGDE, a generative data exchange protocol that improves user security and machine learning performance in a cross-silo federation. The core of SGDE is to share data generators, trained on private data with strong differential privacy guarantees, instead of communicating explicit gradient information. These generators synthesize an arbitrarily large amount of data that retain the distinctive features of the private samples but differ substantially from them. In this work, SGDE is tested in a cross-silo federated network on image and tabular datasets, exploiting beta-variational autoencoders as data generators. The results show that incorporating SGDE improves task accuracy and fairness, as well as resilience to the most influential attacks on federated learning.
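
A minimal sketch of the exchange step described above, assuming each silo trains a beta-variational autoencoder on its private data and then publishes only the decoder, from which other silos sample synthetic data. The class names, dimensions, and the omission of the DP-SGD training loop are illustrative assumptions, not the paper's exact implementation.

```python
# Hypothetical SGDE-style exchange: share a generator, never gradients or raw data.
import torch
import torch.nn as nn

class BetaVAE(nn.Module):
    def __init__(self, x_dim, z_dim=16, beta=4.0):
        super().__init__()
        self.beta = beta
        self.enc = nn.Sequential(nn.Linear(x_dim, 128), nn.ReLU(), nn.Linear(128, 2 * z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim, 128), nn.ReLU(), nn.Linear(128, x_dim))

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()      # reparameterization
        x_hat = self.dec(z)
        recon = nn.functional.mse_loss(x_hat, x, reduction="mean")
        kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return recon + self.beta * kld                             # beta-VAE objective

def share_generator(vae):
    """A silo publishes only the decoder weights (trained with DP, not shown here)."""
    return {k: v.detach().clone() for k, v in vae.dec.state_dict().items()}

def sample_synthetic(decoder_state, x_dim, z_dim=16, n=1024):
    """A receiving silo rebuilds the decoder (same architecture) and samples data."""
    dec = nn.Sequential(nn.Linear(z_dim, 128), nn.ReLU(), nn.Linear(128, x_dim))
    dec.load_state_dict(decoder_state)
    with torch.no_grad():
        return dec(torch.randn(n, z_dim))
```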

Related Content

Since out-of-distribution generalization is a generally ill-posed problem, various proxy targets (e.g., calibration, adversarial robustness, algorithmic corruptions, invariance across shifts) were studied across different research programs, resulting in different recommendations. While sharing the same aspirational goal, these approaches have never been tested under the same experimental conditions on real data. In this paper, we take a unified view of previous work, highlighting message discrepancies that we address empirically, and providing recommendations on how to measure the robustness of a model and how to improve it. To this end, we collect 172 publicly available dataset pairs for training and out-of-distribution evaluation of accuracy, calibration error, adversarial attacks, environment invariance, and synthetic corruptions. We fine-tune over 31k networks from nine different architectures in the many- and few-shot settings. Our findings confirm that in- and out-of-distribution accuracies tend to increase jointly, but show that their relation is largely dataset-dependent, and in general more nuanced and more complex than posited by previous, smaller-scale studies.
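
As a concrete illustration of one of the metrics evaluated on each in-/out-of-distribution dataset pair, the snippet below computes the standard expected calibration error (ECE). The binning scheme and array layout are assumptions for illustration; the study's exact evaluation pipeline may differ.

```python
# Expected calibration error (ECE) with equal-width confidence bins.
import numpy as np

def expected_calibration_error(probs, labels, n_bins=15):
    """probs: (N, C) predicted probabilities; labels: (N,) integer targets."""
    conf = probs.max(axis=1)
    pred = probs.argmax(axis=1)
    correct = (pred == labels).astype(float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            # Weighted gap between average accuracy and average confidence in the bin.
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return ece
```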

Generative data-free quantization has emerged as a practical compression approach that quantizes deep neural networks to low bit-widths without accessing the real data. It generates synthetic data by exploiting the batch normalization (BN) statistics of the full-precision network and uses them to quantize the network. In practice, however, it often suffers from serious accuracy degradation. We first give a theoretical analysis showing that the diversity of synthetic samples is crucial for data-free quantization, whereas in existing approaches the synthetic data, fully constrained by the BN statistics, experimentally exhibit severe homogenization at both the distribution and sample levels. This paper presents a generic Diverse Sample Generation (DSG) scheme for generative data-free quantization that mitigates this detrimental homogenization. We first slacken the statistics alignment for features in the BN layers to relax the distribution constraint. Then, we strengthen the loss impact of specific BN layers for different samples and inhibit the correlation among samples during generation, diversifying samples from the statistical and spatial perspectives, respectively. Comprehensive experiments show that, on large-scale image classification tasks, DSG consistently improves quantization performance across different neural architectures, especially under ultra-low bit-widths. Moreover, the data diversification introduced by DSG yields a general gain for various quantization-aware training and post-training quantization approaches, demonstrating its generality and effectiveness.
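
A hedged sketch of the "slackened" BN-statistics alignment mentioned above: instead of forcing the statistics of synthetic features to match the stored running statistics exactly, deviations inside a margin are not penalized, leaving room for sample diversity. The hinge-style margin `delta` and the per-layer handling are illustrative assumptions, not the paper's exact formulation.

```python
# Slackened BN alignment loss for one BatchNorm2d layer of a frozen FP32 network.
import torch

def slack_bn_alignment_loss(feat, bn_layer, delta=0.1):
    """feat: (N, C, H, W) features entering `bn_layer` when the synthetic batch is forwarded."""
    mu = feat.mean(dim=(0, 2, 3))
    var = feat.var(dim=(0, 2, 3), unbiased=False)
    mean_gap = (mu - bn_layer.running_mean).abs()
    var_gap = (var - bn_layer.running_var).abs()
    # Only gaps larger than the slack margin contribute to the penalty.
    return (torch.relu(mean_gap - delta) + torch.relu(var_gap - delta)).sum()
```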

The increasing data privacy concerns in recommendation systems have made federated recommendations (FedRecs) attract more and more attention. Existing FedRecs mainly focus on how to effectively and securely learn personal interests and preferences from their on-device interaction data. Still, none of them considers how to efficiently erase a user's contribution to the federated training process. We argue that such a dual setting is necessary. First, from the privacy protection perspective, ``the right to be forgotten'' requires that users have the right to withdraw their data contributions. Without the reversible ability, FedRecs risk breaking data protection regulations. On the other hand, enabling a FedRec to forget specific users can improve its robustness and resistance to malicious clients' attacks. To support user unlearning in FedRecs, we propose an efficient unlearning method FRU (Federated Recommendation Unlearning), inspired by the log-based rollback mechanism of transactions in database management systems. It removes a user's contribution by rolling back and calibrating the historical parameter updates and then uses these updates to speed up federated recommender reconstruction. However, storing all historical parameter updates on resource-constrained personal devices is challenging and even infeasible. In light of this challenge, we propose a small-sized negative sampling method to reduce the number of item embedding updates and an importance-based update selection mechanism to store only important model updates. To evaluate the effectiveness of FRU, we propose an attack method to disturb FedRecs via a group of compromised users and use FRU to recover recommenders by eliminating these users' influence. Finally, we conduct experiments on two real-world recommendation datasets with two widely used FedRecs to show the efficiency and effectiveness of our proposed approaches.
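
A rough sketch of the log-and-rollback idea behind FRU: each client keeps only its most important historical item-embedding updates, and unlearning subtracts those logged contributions before a calibrated reconstruction. The importance score (update norm), the top-k selection, and the data layout are illustrative assumptions rather than the paper's exact mechanism.

```python
# Importance-based update logging and rollback for federated recommendation unlearning.
import numpy as np

class UpdateLog:
    def __init__(self, keep_top_k=10):
        self.keep_top_k = keep_top_k
        self.rounds = []                      # one dict {item_id: update vector} per round

    def record(self, round_updates):
        """Store only the top-k item-embedding updates of this round, ranked by L2 norm."""
        ranked = sorted(round_updates.items(), key=lambda kv: -np.linalg.norm(kv[1]))
        self.rounds.append(dict(ranked[: self.keep_top_k]))

    def rollback(self, item_embeddings):
        """Remove this client's logged contributions from the global item-embedding table."""
        for round_updates in self.rounds:
            for item_id, delta in round_updates.items():
                item_embeddings[item_id] -= delta
        return item_embeddings
```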

Deep learning-based image synthesis techniques have been applied in healthcare research to generate medical images that support open research and augment medical datasets. Training generative adversarial networks (GANs) usually requires large amounts of training data. Federated learning (FL) provides a way of training a central model on distributed data while keeping the raw data local. However, because the FL server cannot access the raw data, it is vulnerable to backdoor attacks, an adversarial attack that poisons the training data. Most backdoor attack strategies focus on classification models and centralized domains. It remains an open question whether existing backdoor attacks can affect GAN training and, if so, how to defend against them in the FL setting. In this work, we investigate the overlooked issue of backdoor attacks in federated GANs (FedGANs). We find that the attack succeeds because some local discriminators overfit the poisoned data and corrupt the local GAN equilibrium, which then further contaminates other clients when the generator parameters are averaged and yields a high generator loss. Therefore, we propose FedDetect, an efficient and effective defense against backdoor attacks in the FL setting, which allows the server to detect clients' adversarial behavior based on their losses and block the malicious clients. Our extensive experiments on two medical datasets with different modalities demonstrate that the backdoor attack on FedGANs can result in synthetic images with low fidelity. After detecting and suppressing the malicious clients using the proposed defense strategy, we show that FedGANs can synthesize high-quality medical datasets (with labels) for data augmentation, improving classification models' performance.
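
A simplified sketch of a loss-based detection rule in the spirit of FedDetect: the server treats clients whose reported losses are extreme outliers (here, beyond a median-absolute-deviation threshold) as suspicious and excludes them from generator averaging. The statistic, threshold `k`, and aggregation rule are assumptions; the paper's exact detection criterion may differ.

```python
# Loss-based outlier detection followed by FedAvg over the remaining generators.
import numpy as np

def detect_malicious(client_losses, k=3.0):
    """client_losses: dict {client_id: loss}; returns the set of ids flagged as malicious."""
    values = np.array(list(client_losses.values()))
    med = np.median(values)
    mad = np.median(np.abs(values - med)) + 1e-8
    return {cid for cid, loss in client_losses.items() if abs(loss - med) / mad > k}

def aggregate_generators(client_states, client_losses):
    """Average generator weights only from clients that passed the detection step."""
    blocked = detect_malicious(client_losses)
    kept = [state for cid, state in client_states.items() if cid not in blocked]
    return {name: sum(s[name] for s in kept) / len(kept) for name in kept[0]}
```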

Gradient inversion attacks enable the recovery of training samples from model updates in federated learning (FL) and constitute a serious threat to data privacy. To mitigate this vulnerability, prior work has proposed both principled defenses based on differential privacy and heuristic defenses based on gradient compression as countermeasures. These defenses have so far appeared very effective, in particular those based on gradient compression, which allow the model to maintain high accuracy while greatly reducing the attack's effectiveness. In this work, we argue that such findings do not accurately reflect the privacy risk in FL, and show that existing defenses can be broken by a simple adaptive attack that trains a model on auxiliary data to learn how to invert gradients, on both vision and language tasks.
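
For context, the sketch below shows the classical optimization-based gradient inversion that the adaptive attack above improves on: the attacker searches for an input whose gradient matches the observed client update. The cosine-distance objective, fixed label, and step count are simplifying assumptions, and this is not the paper's learned inversion model.

```python
# Optimization-based gradient inversion baseline (dummy-input matching).
import torch
import torch.nn.functional as F

def invert_gradients(model, observed_grads, label, x_shape, steps=200, lr=0.1):
    """Reconstruct an input whose gradient on `model` matches `observed_grads`."""
    x = torch.randn(1, *x_shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(model(x), label)
        grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
        # Cosine distance between the dummy gradient and the observed one, per tensor.
        match = sum(1 - F.cosine_similarity(g.flatten(), o.flatten(), dim=0)
                    for g, o in zip(grads, observed_grads))
        match.backward()
        opt.step()
    return x.detach()
```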

Recently, unsupervised adversarial training (AT) has been extensively studied as a way to attain robustness with models trained on unlabeled data. To this end, previous studies have applied existing supervised adversarial training techniques to self-supervised learning (SSL) frameworks. However, all of them have resorted to untargeted adversarial learning, since it is unclear how to obtain targeted adversarial examples in the SSL setting, which lacks label information. In this paper, we propose a novel targeted adversarial training method for SSL frameworks. Specifically, we propose a target selection algorithm for adversarial SSL frameworks; it is designed to select the most confusing sample for each given instance based on similarity and entropy, and to perturb the given instance toward the selected target sample. Our method significantly enhances the robustness of an SSL model without requiring large batches of images or additional models, unlike existing works aimed at achieving the same goal. Moreover, our method is readily applicable to general SSL frameworks that use only positive pairs. We validate our method on benchmark datasets, on which it obtains superior robust accuracies, outperforming existing unsupervised adversarial training methods.
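
A hedged sketch of the target-selection idea: for each anchor embedding, pick the candidate from the same batch that is most "confusing" (scored here by cosine similarity only, whereas the paper also uses entropy), then perturb the input toward that target with a few sign-gradient steps. Scoring rule, step sizes, and the PGD-like inner loop are illustrative assumptions.

```python
# Targeted perturbation toward a selected confusing sample in an SSL embedding space.
import torch
import torch.nn.functional as F

def select_targets(anchor_z, candidate_z):
    """Pick, for each anchor, the most similar candidate; candidates are the same batch."""
    sim = F.cosine_similarity(anchor_z.unsqueeze(1), candidate_z.unsqueeze(0), dim=-1)
    sim.fill_diagonal_(-1.0)                       # exclude the instance itself
    return sim.argmax(dim=1)

def perturb_toward_target(encoder, x, target_z, eps=8/255, alpha=2/255, steps=5):
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        z = encoder(x + delta)
        loss = -F.cosine_similarity(z, target_z, dim=-1).mean()   # move toward the target
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)
        delta.grad.zero_()
    return (x + delta).detach()
```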

Federated Learning (FL) is a decentralized machine-learning paradigm in which a global server iteratively averages the model parameters of local users without accessing their data. User heterogeneity has imposed significant challenges on FL, as it can produce drifted global models that are slow to converge. Knowledge distillation has recently emerged to tackle this issue by refining the server model using aggregated knowledge from heterogeneous users rather than directly averaging their model parameters. This approach, however, depends on a proxy dataset, making it impractical unless such a prerequisite is satisfied. Moreover, the ensemble knowledge is not fully utilized to guide local model learning, which may in turn affect the quality of the aggregated model. Inspired by the prior art, we propose a data-free knowledge distillation approach to address heterogeneous FL, where the server learns a lightweight generator to ensemble user information in a data-free manner; the generator is then broadcast to users, regulating local training by using the learned knowledge as an inductive bias. Empirical studies, supported by theoretical implications, show that our approach facilitates FL with better generalization performance using fewer communication rounds than the state of the art.
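
A compact sketch of what the server-side generator step might look like: a lightweight conditional generator is trained so that its synthetic features are classified correctly by the ensemble of uploaded user classifier heads, and is then broadcast to clients as an inductive bias. The architecture, the feature-level generation, and the simple logit-averaging ensemble are assumptions for illustration, not the paper's exact design.

```python
# Server-side training of a lightweight label-conditional feature generator.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Generator(nn.Module):
    def __init__(self, num_classes, noise_dim=32, feat_dim=64):
        super().__init__()
        self.embed = nn.Embedding(num_classes, noise_dim)
        self.net = nn.Sequential(nn.Linear(2 * noise_dim, 128), nn.ReLU(),
                                 nn.Linear(128, feat_dim))

    def forward(self, y):
        eps = torch.randn(y.size(0), self.embed.embedding_dim)
        return self.net(torch.cat([eps, self.embed(y)], dim=-1))

def train_generator_step(gen, user_heads, num_classes, opt, batch=64):
    """user_heads: list of uploaded client classifier heads mapping features to logits."""
    y = torch.randint(0, num_classes, (batch,))
    z = gen(y)
    logits = torch.stack([head(z) for head in user_heads]).mean(dim=0)  # ensemble of users
    loss = F.cross_entropy(logits, y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```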

The vast amounts of data generated by networks of sensors, wearables, and Internet of Things (IoT) devices underscore the need for advanced modeling techniques that leverage the spatio-temporal structure of decentralized data, both because of edge-computation requirements and because of licensing (data access) issues. While federated learning (FL) has emerged as a framework for model training without requiring direct data sharing and exchange, effectively modeling the complex spatio-temporal dependencies to improve forecasting capabilities remains an open problem. On the other hand, state-of-the-art spatio-temporal forecasting models assume unfettered access to the data, neglecting constraints on data sharing. To bridge this gap, we propose a federated spatio-temporal model, the Cross-Node Federated Graph Neural Network (CNFGNN), which explicitly encodes the underlying graph structure using a graph neural network (GNN)-based architecture under the constraint of cross-node federated learning, which requires that the data in a network of nodes be generated locally on each node and remain decentralized. CNFGNN operates by disentangling temporal dynamics modeling on the devices from spatial dynamics modeling on the server, using alternating optimization to reduce the communication cost and facilitate computation on the edge devices. Experiments on the traffic flow forecasting task show that CNFGNN achieves the best forecasting performance in both transductive and inductive learning settings with no extra computation cost on edge devices, while incurring a modest communication cost.
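
A high-level sketch of the device/server split described above: each node encodes its own time series locally (temporal dynamics stay on the device), only the resulting node embeddings travel to the server, and the server models spatial dependencies with a graph convolution over those embeddings. Dimensions, the GRU encoder, and the simple mean-aggregation GCN layer are assumptions, not the paper's exact architecture.

```python
# Minimal on-device temporal encoder plus server-side spatial GNN.
import torch
import torch.nn as nn

class NodeTemporalEncoder(nn.Module):
    """Runs on each device; raw observations never leave the node."""
    def __init__(self, in_dim=1, hidden=32):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden, batch_first=True)

    def forward(self, x):                      # x: (1, T, in_dim) local time series
        _, h = self.gru(x)
        return h[-1]                           # (1, hidden) embedding sent to the server

class ServerSpatialGNN(nn.Module):
    """Runs on the server over the graph of node embeddings."""
    def __init__(self, hidden=32, horizon=12):
        super().__init__()
        self.w = nn.Linear(hidden, hidden)
        self.head = nn.Linear(hidden, horizon)

    def forward(self, node_emb, adj):          # node_emb: (N, hidden), adj: (N, N)
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        h = torch.relu(self.w((adj @ node_emb) / deg))   # mean-aggregation GCN step
        return self.head(h)                    # per-node forecast over the horizon
```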

Non-IID data present a tough challenge for federated learning. In this paper, we explore a novel idea of facilitating pairwise collaborations between clients with similar data. We propose FedAMP, a new method employing federated attentive message passing to facilitate similar clients to collaborate more. We establish the convergence of FedAMP for both convex and non-convex models, and propose a heuristic method to further improve the performance of FedAMP when clients adopt deep neural networks as personalized models. Our extensive experiments on benchmark data sets demonstrate the superior performance of the proposed methods.
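
A minimal sketch of attention-weighted aggregation in the spirit of FedAMP: each client receives a personalized aggregate in which similar clients, scored here by a softmax over negative squared parameter distances, receive larger weights. The kernel, temperature, and the use of flattened parameter vectors are illustrative assumptions, not the paper's exact message-passing rule.

```python
# Personalized, similarity-weighted aggregation of client models.
import torch

def attentive_aggregate(client_params, temperature=1.0):
    """client_params: list of flattened 1-D parameter tensors; returns one aggregate per client."""
    stacked = torch.stack(client_params)                   # (K, D)
    dist2 = torch.cdist(stacked, stacked).pow(2)           # pairwise squared distances
    attn = torch.softmax(-dist2 / temperature, dim=1)      # similar clients weigh more
    return attn @ stacked                                  # (K, D) personalized aggregates
```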

As data are increasingly stored in different silos and societies become more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models is facing efficiency and privacy challenges. Recently, federated learning (FL) has emerged as an alternative solution and continues to thrive in this new reality. Existing FL protocol designs have been shown to be vulnerable to adversaries within or outside of the system, compromising data privacy and system robustness. Besides training powerful global models, it is of paramount importance to design FL systems that have privacy guarantees and are resistant to different types of adversaries. In this paper, we conduct the first comprehensive survey on this topic. Through a concise introduction to the concept of FL and a unique taxonomy covering: 1) threat models; 2) poisoning attacks and defenses against robustness; 3) inference attacks and defenses against privacy, we provide an accessible review of this important topic. We highlight the intuitions, key techniques, and fundamental assumptions adopted by various attacks and defenses. Finally, we discuss promising future research directions towards robust and privacy-preserving federated learning.
