
Due to the communication bottleneck in distributed and federated learning applications, algorithms using communication compression have attracted significant attention and are widely used in practice. Moreover, the huge number, high heterogeneity and limited availability of clients result in high client-variance. This paper addresses these two issues together by proposing compressed and client-variance reduced methods COFIG and FRECON. We prove an $O(\frac{(1+\omega)^{3/2}\sqrt{N}}{S\epsilon^2}+\frac{(1+\omega)N^{2/3}}{S\epsilon^2})$ bound on the number of communication rounds of COFIG in the nonconvex setting, where $N$ is the total number of clients, $S$ is the number of clients participating in each round, $\epsilon$ is the convergence error, and $\omega$ is the variance parameter associated with the compression operator. In case of FRECON, we prove an $O(\frac{(1+\omega)\sqrt{N}}{S\epsilon^2})$ bound on the number of communication rounds. In the convex setting, COFIG converges within $O(\frac{(1+\omega)\sqrt{N}}{S\epsilon})$ communication rounds, which is also the first convergence result for compression schemes that do not communicate with all the clients in each round. We stress that neither COFIG nor FRECON needs to communicate with all the clients, and they enjoy the first or faster convergence results for convex and nonconvex federated learning in the regimes considered. Experimental results point to an empirical superiority of COFIG and FRECON over existing baselines.
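To make the variance parameter $\omega$ concrete, the sketch below uses random-k sparsification as a stand-in compressor. It assumes the common convention that an unbiased compressor $C$ satisfies $E[C(x)] = x$ and $E\|C(x) - x\|^2 \le \omega \|x\|^2$; the abstract does not state the definition, and this is not claimed to be the compressor used by COFIG or FRECON. For random-k on a $d$-dimensional vector, $\omega = d/k - 1$. The function names are hypothetical.

```python
import random

def rand_k(x, k, rng):
    """Unbiased random-k sparsification: keep k random coordinates, scale by d/k."""
    d = len(x)
    keep = set(rng.sample(range(d), k))
    scale = d / k
    return [scale * v if i in keep else 0.0 for i, v in enumerate(x)]

def empirical_omega(x, k, trials=5000, seed=0):
    """Monte Carlo estimate of E||C(x) - x||^2 / ||x||^2 for rand_k."""
    rng = random.Random(seed)
    norm_sq = sum(v * v for v in x)
    total = 0.0
    for _ in range(trials):
        c = rand_k(x, k, rng)
        total += sum((ci - xi) ** 2 for ci, xi in zip(c, x))
    return total / trials / norm_sq

if __name__ == "__main__":
    x = [1.0, -2.0, 0.5, 3.0, 0.25, 1.5, -1.0, 2.0]
    # With d = 8 and k = 2 the theoretical value is d/k - 1 = 3.
    print(empirical_omega(x, k=2))
```

A smaller k sends fewer coordinates per round but raises $\omega$, which is the trade-off the $(1+\omega)$ factors in the bounds above account for.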

Related content

Federated Learning


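Federated learning is an emerging foundational artificial intelligence technology, first proposed by Google in 2016, originally to solve the problem of updating models locally for end users of Android phones. Its design goal is to carry out efficient machine learning among multiple participants or computing nodes while safeguarding information security during large-scale data exchange, protecting device data and personal data privacy, and ensuring legal compliance. The machine learning algorithms that federated learning can use are not limited to neural networks and also include important algorithms such as random forests. Federated learning is expected to become the foundation of the next generation of collaborative AI algorithms and collaborative networks.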

Estimation/estimator · Sample · Statistic · Oracle · INTERACT ·

April 19, 2022

Making Progress Based on False Discoveries

Roi Livni

We consider the question of adaptive data analysis within the framework of convex optimization. We ask how many samples are needed in order to compute $\epsilon$-accurate estimates of $O(1/\epsilon^2)$ gradients queried by gradient descent, and we provide two intermediate answers to this question. First, we show that for a general analyst (not necessarily gradient descent) $\Omega(1/\epsilon^3)$ samples are required. This rules out the possibility of a foolproof mechanism. Our construction builds upon a new lower bound (that may be of interest in its own right) for an analyst that may ask several non-adaptive questions in a batch of fixed and known $T$ rounds of adaptivity and requires a fraction of true discoveries. We show that for such an analyst $\Omega (\sqrt{T}/\epsilon^2)$ samples are necessary. Second, we show that, under certain assumptions on the oracle, in an interaction with gradient descent $\tilde \Omega(1/\epsilon^{2.5})$ samples are necessary. Our assumptions are that the oracle has only \emph{first order access} and is \emph{post-hoc generalizing}. First order access means that it can only compute the gradients of the sampled function at points queried by the algorithm. Our assumption of \emph{post-hoc generalization} follows from existing lower bounds for statistical queries. More generally, we provide a generic reduction from the standard setting of statistical queries to the problem of estimating gradients queried by gradient descent. These results are in contrast with classical bounds that show that with $O(1/\epsilon^2)$ samples one can optimize the population risk to accuracy of $O(\epsilon)$ but, as it turns out, with spurious gradients.

Reducible · DNN · Performer · Normalized · Model evaluation ·

April 18, 2022

How to Attain Communication-Efficient DNN Training? Convert, Compress, Correct

Zhong-Jing Chen,Eduin E. Hernandez,Yu-Chih Huang,Stefano Rini

from arxiv, arXiv admin note: substantial text overlap with arXiv:2203.09044

In this paper, we introduce $\mathsf{CO}_3$, an algorithm for communication-efficient federated Deep Neural Network (DNN) training. $\mathsf{CO}_3$ takes its name from three processing steps applied to reduce the communication load when transmitting the local gradients from the remote users to the Parameter Server. Namely: (i) gradient quantization through floating-point conversion, (ii) lossless compression of the quantized gradient, and (iii) quantization error correction. We carefully design each of the steps above so as to minimize the loss in the distributed DNN training when the communication overhead is fixed. In particular, in the design of steps (i) and (ii), we adopt the assumption that DNN gradients are distributed according to a generalized normal distribution. This assumption is validated numerically in the paper. For step (iii), we utilize an error feedback with memory decay mechanism to correct the quantization error introduced in step (i). We argue that the memory decay coefficient, similarly to the learning rate, can be optimally tuned to improve convergence. The performance of $\mathsf{CO}_3$ is validated through numerical simulations and is shown to have better accuracy and improved stability at a reduced communication payload.
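Step (iii) above can be illustrated with a generic error-feedback loop. This is only a sketch under stated assumptions: the abstract does not give the exact update rule, so the form `corrected = grad + beta * memory` is a common textbook variant, the uniform rounding quantizer is a stand-in for the floating-point conversion of step (i), and the names `quantize`, `error_feedback_step`, and `beta` are hypothetical.

```python
def quantize(v, step=0.5):
    """Round each entry to the nearest multiple of `step` (a stand-in quantizer)."""
    return [step * round(x / step) for x in v]

def error_feedback_step(grad, memory, beta, step=0.5):
    """One round of error feedback with a memory decay coefficient `beta`.

    The decayed residual from earlier rounds is added back before quantizing,
    and the new quantization error is stored for the next round.
    """
    corrected = [g + beta * m for g, m in zip(grad, memory)]
    sent = quantize(corrected, step)
    new_memory = [c - s for c, s in zip(corrected, sent)]
    return sent, new_memory

if __name__ == "__main__":
    memory = [0.0, 0.0]
    for _ in range(3):
        sent, memory = error_feedback_step([0.2, -0.7], memory, beta=0.9)
        print(sent, memory)
```

In this sketch, `beta = 1` gives plain error feedback and `beta = 0` discards the residual entirely; intermediate values are the tunable decay the abstract refers to.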

Reducible · Learning · Cost · MoDELS · Federated learning ·

April 17, 2022

Federated Learning Cost Disparity for IoT Devices

Sheeraz A. Alvi,Yi Hong,Salman Durrani

from arxiv, arXiv admin note: substantial text overlap with arXiv:2109.05267

Federated learning (FL) promotes predictive model training on Internet of Things (IoT) devices by evading data collection cost in terms of energy, time, and privacy. We model the learning gain achieved by an IoT device against its participation cost as its utility. Due to device heterogeneity, the local model learning cost and its quality, which can be time-varying, differ from device to device. We show that this variation results in utility unfairness because the same global model is shared among the devices. By default, the master is unaware of the local model computation and transmission costs of the devices, thus it is unable to address the utility unfairness problem. Also, a device may exploit this lack of knowledge at the master to intentionally reduce its expenditure and thereby enhance its utility. We propose to control the quality of the global model shared with the devices, in each round, based on their contribution and expenditure. This is achieved by employing differential privacy to curtail global model divulgence based on the learning contribution. In addition, we devise adaptive computation and transmission policies for each device to control its expenditure in order to mitigate utility unfairness. Our results show that the proposed scheme reduces the standard deviation of the energy cost of devices by 99% in comparison to the benchmark scheme, while the standard deviation of the training loss of devices varies around 0.103.

Learning · Federated learning · Performer · Constraint · Extensibility ·

April 17, 2022

Quantized Federated Learning under Transmission Delay and Outage Constraints

Yanmeng Wang,Yanqing Xu,Qingjiang Shi,Tsung-Hui Chang

Federated learning (FL) has been recognized as a viable distributed learning paradigm which trains a machine learning model collaboratively with massive mobile devices in the wireless edge while protecting user privacy. Although various communication schemes have been proposed to expedite the FL process, most of them have assumed ideal wireless channels which provide reliable and lossless communication links between the server and mobile clients. Unfortunately, in practical systems with limited radio resources, such as constraints on training latency, transmission power, and bandwidth, transmission of a large number of model parameters inevitably suffers from quantization errors (QE) and transmission outage (TO). In this paper, we consider such non-ideal wireless channels, and carry out the first analysis showing that the FL convergence can be severely jeopardized by TO and QE, but intriguingly can be alleviated if the clients have uniform outage probabilities. These insightful results motivate us to propose a robust FL scheme, named FedTOE, which performs joint allocation of wireless resources and quantization bits across the clients to minimize the QE while making the clients have the same TO probability. Extensive experimental results are presented to show the superior performance of FedTOE for deep learning-based classification tasks with transmission latency constraints.

Lipschitz · Lipschitz constant · Threshold · Federated learning · Normalized ·

April 16, 2022

On the Convergence of Differentially Private Federated Learning on Non-Lipschitz Objectives, and with Normalized Client Updates

Rudrajit Das,Abolfazl Hashemi,Sujay Sanghavi,Inderjit S. Dhillon

There is a dearth of convergence results for differentially private federated learning (FL) with non-Lipschitz objective functions (i.e., when gradient norms are not bounded). The primary reason for this is that the clipping operation (i.e., projection onto an $\ell_2$ ball of a fixed radius called the clipping threshold) for bounding the sensitivity of the average update to each client's update introduces bias depending on the clipping threshold and the number of local steps in FL, and analyzing this is not easy. For Lipschitz functions, the Lipschitz constant serves as a trivial clipping threshold with zero bias. However, Lipschitzness does not hold in many practical settings; moreover, verifying it and computing the Lipschitz constant is hard. Thus, the choice of the clipping threshold is non-trivial and requires a lot of tuning in practice. In this paper, we provide the first convergence result for private FL on smooth \textit{convex} objectives \textit{for a general clipping threshold} -- \textit{without assuming Lipschitzness}. We also look at a simpler alternative to clipping (for bounding sensitivity) which is \textit{normalization} -- where we use only a scaled version of the unit vector along the client updates, completely discarding the magnitude information. The resulting normalization-based private FL algorithm is theoretically shown to have better convergence than its clipping-based counterpart on smooth convex functions. We corroborate our theory with synthetic experiments as well as experiments on benchmarking datasets.
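The difference between the two sensitivity-bounding operations described above can be sketched in a few lines. This is an illustration only: the function names and the `eps` guard against division by zero are assumptions, and the noise addition that makes the scheme differentially private is omitted.

```python
import math

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

def clip(update, threshold):
    """Project `update` onto the l2 ball of radius `threshold`.

    Updates already inside the ball are returned unchanged.
    """
    n = l2_norm(update)
    if n <= threshold:
        return list(update)
    return [x * threshold / n for x in update]

def normalize(update, scale, eps=1e-12):
    """Return `scale` times the unit vector along `update`, discarding magnitude."""
    n = l2_norm(update)
    return [x * scale / (n + eps) for x in update]

if __name__ == "__main__":
    small, large = [0.03, 0.04], [3.0, 4.0]
    print(clip(small, 1.0), clip(large, 1.0))            # small passes through, large is shrunk
    print(normalize(small, 1.0), normalize(large, 1.0))  # both map to (nearly) the same vector
```

Both operations bound the norm of each client's contribution, but clipping treats small and large updates differently, which is the source of the threshold-dependent bias mentioned in the abstract.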

Vectorization · Reconstruction error · Compressed sensing · MoDELS · Federated learning ·

April 16, 2022

FedVQCS: Federated Learning via Vector Quantized Compressed Sensing

Yongjeong Oh,Yo-Seb Jeon,Mingzhe Chen,Walid Saad

In this paper, a new communication-efficient federated learning (FL) framework is proposed, inspired by vector quantized compressed sensing. The basic strategy of the proposed framework is to compress the local model update at each device by applying dimensionality reduction followed by vector quantization. Subsequently, the global model update is reconstructed at a parameter server (PS) by applying a sparse signal recovery algorithm to the aggregation of the compressed local model updates. By harnessing the benefits of both dimensionality reduction and vector quantization, the proposed framework effectively reduces the communication overhead of local update transmissions. Both the design of the vector quantizer and the key parameters for the compression are optimized so as to minimize the reconstruction error of the global model update under the constraint of wireless link capacity. By considering the reconstruction error, the convergence rate of the proposed framework is also analyzed for a smooth loss function. Simulation results on the MNIST and CIFAR-10 datasets demonstrate that the proposed framework provides more than a 2.5% increase in classification accuracy compared to state-of-the-art FL frameworks when the communication overhead of the local model update transmission is less than 0.1 bit per local model entry.
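The device-side compression described above (dimensionality reduction followed by vector quantization) can be sketched as a linear projection and a nearest-codeword lookup. Everything concrete here is an assumption for illustration: the projection matrix, the codebook, and the names `project` and `vq_encode` are hypothetical, the quantizer design of FedVQCS is not reproduced, and the sparse recovery at the parameter server is omitted.

```python
def project(update, matrix):
    """Dimensionality reduction: multiply the update by an m x d matrix (m < d)."""
    return [sum(a * x for a, x in zip(row, update)) for row in matrix]

def nearest_codeword(block, codebook):
    """Index of the codeword closest to `block` in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda j: sum((b - c) ** 2 for b, c in zip(block, codebook[j])),
    )

def vq_encode(vec, codebook):
    """Split `vec` into blocks of the codeword length and encode each as an index."""
    block_len = len(codebook[0])
    if len(vec) % block_len != 0:
        raise ValueError("vector length must be a multiple of the codeword length")
    return [
        nearest_codeword(vec[i:i + block_len], codebook)
        for i in range(0, len(vec), block_len)
    ]

if __name__ == "__main__":
    matrix = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]        # toy 2 x 3 projection
    codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]   # toy 2-dimensional codebook
    reduced = project([0.9, 0.5, 0.7], matrix)
    print(reduced, vq_encode(reduced, codebook))
```

Only the codeword indices are transmitted, so the cost per block is the base-2 logarithm of the codebook size, spread over all entries of the original update.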

Reducible · Server · Edge · Continuity · Performer ·

April 15, 2022

Server Free Wireless Federated Learning: Architecture, Algorithm, and Analysis

Howard H. Yang,Zihan Chen,Tony Q. S. Quek

We demonstrate that merely analog transmissions and matched filtering can realize the function of an edge server in federated learning (FL). Therefore, a network with massively distributed user equipments (UEs) can achieve large-scale FL without an edge server. We also develop a training algorithm that allows UEs to continuously perform local computing without being interrupted by the global parameter uploading, which exploits the full potential of UEs' processing power. We derive convergence rates for the proposed schemes to quantify their training efficiency. The analyses reveal that when the interference obeys a Gaussian distribution, the proposed algorithm retrieves the convergence rate of a server-based FL. But if the interference distribution is heavy-tailed, then the heavier the tail, the slower the algorithm converges. Nonetheless, the system run time can be largely reduced by enabling computation in parallel with communication, and the gain is particularly pronounced when communication latency is high. These findings are corroborated via extensive simulations.

CC · Sphering · Approximation · Decoding · Reducible ·

April 15, 2022

Deep Learning-based List Sphere Decoding for Faster-than-Nyquist (FTN) Signaling Detection

Sina Abbasi,Ebrahim Bedeer

from arxiv, Accepted to IEEE VTC-Spring 2022

Faster-than-Nyquist (FTN) signaling is a candidate non-orthonormal transmission technique to improve the spectral efficiency (SE) of future communication systems. However, such improvements of the SE are at the cost of additional computational complexity to remove the intentionally introduced intersymbol interference. In this paper, we investigate the use of deep learning (DL) to reduce the detection complexity of FTN signaling. To eliminate the need for a noise whitening filter at the receiver, we first present an equivalent FTN signaling model based on using a set of orthonormal basis functions and identify its operation region. Second, we propose a DL-based list sphere decoding (DL-LSD) algorithm that selects and updates the initial radius of the original LSD to guarantee a pre-defined number $N_{\text{L}}$ of lattice points inside the hypersphere. This is achieved by training a neural network to output an approximate initial radius that includes $N_{\text{L}}$ lattice points. At the testing phase, if the hypersphere has more than $N_{\text{L}}$ lattice points, we keep the $N_{\text{L}}$ closest points to the point corresponding to the received FTN signal; however, if the hypersphere has fewer than $N_{\text{L}}$ points, we increase the approximate initial radius by a value that depends on the standard deviation of the distribution of the output radii from the training phase. Then, the approximate value of the log-likelihood ratio (LLR) is calculated based on the obtained $N_{\text{L}}$ points. Simulation results show that the computational complexity of the proposed DL-LSD is lower than its counterpart of the original LSD by orders of magnitude.
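The test-phase rule described above (keep the $N_{\text{L}}$ closest points, or enlarge the radius when too few points fall inside) can be sketched as follows. This is a simplified illustration, not the DL-LSD algorithm itself: it scans an explicit candidate list instead of performing a sphere-decoding tree search, the initial radius is passed in rather than predicted by a neural network, and `radius_step` is a hypothetical stand-in for the increment that the abstract ties to the standard deviation of the training-phase radii.

```python
import math

def select_list(candidates, received, radius, n_list, radius_step):
    """Return the n_list candidates closest to `received`, plus the final radius.

    The radius grows by `radius_step` until at least n_list candidates lie
    inside the hypersphere centred on `received`.
    """
    if n_list > len(candidates):
        raise ValueError("not enough candidate points for the requested list size")
    if radius_step <= 0:
        raise ValueError("radius_step must be positive")

    def dist(p):
        return math.dist(p, received)

    while True:
        inside = [p for p in candidates if dist(p) <= radius]
        if len(inside) >= n_list:
            # More points than needed: keep only the n_list closest.
            return sorted(inside, key=dist)[:n_list], radius
        radius += radius_step

if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
    print(select_list(points, (0.0, 0.0), radius=0.5, n_list=2, radius_step=0.6))
```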

Optimizer · Performer · Learning · Deep Q-learning · Reinforcement learning ·

April 15, 2022

A Reinforcement Learning Approach to Parameter Selection for Distributed Optimal Power Flow

Sihan Zeng,Alyssa Kody,Youngdae Kim,Kibaek Kim,Daniel K. Molzahn

With the increasing penetration of distributed energy resources, distributed optimization algorithms have attracted significant attention for power systems applications due to their potential for superior scalability, privacy, and robustness to a single point-of-failure. The Alternating Direction Method of Multipliers (ADMM) is a popular distributed optimization algorithm; however, its convergence performance is highly dependent on the selection of penalty parameters, which are usually chosen heuristically. In this work, we use reinforcement learning (RL) to develop an adaptive penalty parameter selection policy for the AC optimal power flow (ACOPF) problem solved via ADMM with the goal of minimizing the number of iterations until convergence. We train our RL policy using deep Q-learning, and show that this policy can result in significantly accelerated convergence (up to a 59% reduction in the number of iterations compared to existing, curvature-informed penalty parameter selection methods). Furthermore, we show that our RL policy demonstrates promise for generalizability, performing well under unseen loading schemes as well as under unseen losses of lines and generators (up to a 50% reduction in iterations). This work thus provides a proof-of-concept for using RL for parameter selection in ADMM for power systems applications.
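To see why the penalty parameter matters, the sketch below runs textbook consensus ADMM on a toy problem, minimizing $\sum_i \frac{a_i}{2}(x - b_i)^2$ across agents, and reports the iteration count for a few fixed values of the penalty `rho`. The toy objective, the stopping rule, and all names are assumptions for illustration; this is not the ACOPF formulation or the RL policy from the abstract, which adapts the penalty during the run.

```python
def consensus_admm(a, b, rho, tol=1e-8, max_iter=10000):
    """Consensus ADMM for min_x sum_i 0.5 * a[i] * (x - b[i])**2.

    Returns the consensus value and the number of iterations used.
    """
    n = len(a)
    y = [0.0] * n   # dual variables, one per agent
    z = 0.0         # shared consensus variable
    for it in range(1, max_iter + 1):
        # Local updates: each agent solves its own small quadratic problem.
        x = [(ai * bi - yi + rho * z) / (ai + rho) for ai, bi, yi in zip(a, b, y)]
        z_old = z
        # Consensus update: average of local estimates shifted by scaled duals.
        z = sum(xi + yi / rho for xi, yi in zip(x, y)) / n
        # Dual update: penalize disagreement with the consensus value.
        y = [yi + rho * (xi - z) for xi, yi in zip(x, y)]
        primal = max(abs(xi - z) for xi in x)
        dual = rho * abs(z - z_old)
        if primal < tol and dual < tol:
            return z, it
    return z, max_iter

if __name__ == "__main__":
    a, b = [1.0, 2.0, 4.0], [1.0, -1.0, 3.0]
    for rho in (0.1, 1.0, 10.0):
        z, iters = consensus_admm(a, b, rho)
        print(f"rho={rho}: z={z:.6f}, iterations={iters}")
```

The exact minimizer is $\sum_i a_i b_i / \sum_i a_i$, so every run should agree on `z`; only the iteration count changes with `rho`, which is the quantity a parameter selection policy tries to reduce.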

Vision · Model evaluation · Reducible · Computer vision · DNN ·

March 24, 2020

A Survey of Methods for Low-Power Deep Learning and Computer Vision

Abhinav Goel,Caleb Tung,Yung-Hsiang Lu,George K. Thiruvathukal

from arxiv, Accepted for publication at 2020 IEEE 6th World Forum on Internet of Things (WF-IoT), New Orleans, LA, USA 2020

Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy, computation and memory intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically with regard to inference, and discusses the methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.