
We consider the problem where $n$ clients transmit $d$-dimensional real-valued vectors using $d(1+o(1))$ bits each, in a manner that allows the receiver to approximately reconstruct their mean. Such compression problems naturally arise in distributed and federated learning. We provide novel mathematical results and derive computationally efficient algorithms that are more accurate than previous compression techniques. We evaluate our methods on a collection of distributed and federated learning tasks, using a variety of datasets, and show a consistent improvement over the state of the art.
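As a concrete illustration of this regime, here is a minimal sketch of a one-bit-per-coordinate scheme in the same family: a shared random rotation, per-coordinate sign quantization, and a single scale, so each client sends $d$ sign bits plus one float, i.e. $d(1+o(1))$ bits. This is a simplified stand-in, not the paper's exact method; the dense QR rotation is for clarity only (a randomized Hadamard transform would be used in practice), and the scale choice here is heuristic rather than optimal.

```python
import numpy as np

def encode(vec, rng):
    """Client side: rotate, then send one sign bit per coordinate plus one scale."""
    d = vec.size
    # Random rotation shared with the server via a common seed.
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    r = Q @ vec
    scale = np.linalg.norm(r) / np.sqrt(d)   # one float: o(d) extra bits
    bits = r >= 0                            # d sign bits
    return bits, scale

def decode(bits, scale, rng):
    """Server side: rebuild the rotation from the shared seed and invert it."""
    d = bits.size
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    r_hat = scale * np.where(bits, 1.0, -1.0)
    return Q.T @ r_hat

# The server averages the n decoded vectors to estimate the mean.
d, n = 32, 8
clients = [np.random.randn(d) for _ in range(n)]
decoded = []
for i, v in enumerate(clients):
    enc_rng, dec_rng = np.random.default_rng(i), np.random.default_rng(i)
    decoded.append(decode(*encode(v, enc_rng), dec_rng))
est = np.mean(decoded, axis=0)
print("mean estimation error:", np.linalg.norm(est - np.mean(clients, axis=0)))
```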

Related content

We propose a new method for multivariate response regression and covariance estimation when elements of the response vector are of mixed types, for example some continuous and some discrete. Our method is based on a model which assumes the observable mixed-type response vector is connected to a latent multivariate normal linear regression through a link function. We explore the properties of this model and show its parameters are identifiable under reasonable conditions. We impose no parametric restrictions on the covariance of the latent normal other than positive definiteness, thereby avoiding assumptions about unobservable variables that can be difficult to verify in practice. To accommodate this generality, we propose a novel algorithm for approximate maximum likelihood estimation that works "off-the-shelf" with many different combinations of response types, and which scales well in the dimension of the response vector. Our method typically gives better predictions and parameter estimates than fitting separate models for the different response types, and it allows for approximate likelihood ratio testing of relevant hypotheses such as independence of responses. The usefulness of the proposed method is illustrated in simulations and on one biomedical and one genomic data example.
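To make the modelling assumption concrete, the following hypothetical simulation generates data from a latent multivariate normal linear regression and maps it to mixed-type observations through two example links (identity for a continuous response, thresholding for a binary one). It illustrates the data-generating model only, not the estimation algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 3

# Latent multivariate normal linear regression: Z = X @ B + E, E ~ N(0, Sigma).
X = rng.standard_normal((n, p))
B = np.array([[1.0, -0.5],
              [0.0,  2.0],
              [0.5,  0.5]])
Sigma = np.array([[1.0, 0.6],
                  [0.6, 1.0]])   # only required to be positive definite
E = rng.multivariate_normal(np.zeros(2), Sigma, size=n)
Z = X @ B + E

# Link functions map the latent normal to the observed mixed-type responses.
y_cont = Z[:, 0]                        # identity link: continuous response
y_bin = (Z[:, 1] > 0).astype(int)       # threshold (probit-type) link: binary response
```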

Coded distributed computing was recently introduced to mitigate the effect of stragglers in distributed computing. This paper combines ideas from approximate computing with coded computing to further accelerate computation. We propose successive approximation coding (SAC) techniques that realize a tradeoff between accuracy and speed, allowing the distributed computing system to produce approximations that increase in accuracy over time. If a sufficient number of compute nodes finish their tasks, SAC recovers the desired computation exactly. We provide theoretical design guidelines for our SAC techniques, and show numerically that SAC achieves a better accuracy-speed tradeoff than previous methods.
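The accuracy-over-time behaviour can be illustrated with a toy decomposition: split a matrix into SVD layers of decreasing importance, assign one layer per worker, and sum contributions as workers finish. This is only a stand-in for the paper's coding scheme, but it shows how partial results refine toward the exact product.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 100))
x = rng.standard_normal(100)

# Split A into "layers" of decreasing importance via its SVD; each worker
# computes one layer's contribution to A @ x.  Partial sums give successively
# better approximations, and all layers together recover A @ x exactly.
U, s, Vt = np.linalg.svd(A)
layers = np.array_split(np.arange(100), 10)        # 10 workers, 10 layers
contrib = [U[:, idx] @ (s[idx] * (Vt[idx] @ x)) for idx in layers]

estimate = np.zeros(100)
for t, c in enumerate(contrib):                    # workers finish over time
    estimate += c
    err = np.linalg.norm(estimate - A @ x) / np.linalg.norm(A @ x)
    print(f"after {t + 1} workers: relative error {err:.3e}")
```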

The possibilities offered by quantum computing have drawn attention in the distributed computing community recently, with several breakthrough results showing quantum distributed algorithms that run faster than the fastest known classical counterparts, and even separations between the two models. A prime example is the result by Izumi, Le Gall, and Magniez [STACS 2020], who showed that triangle detection by quantum distributed algorithms is easier than triangle listing, while an analogous result is not known in the classical case. In this paper we present a framework for fast quantum distributed clique detection. This improves upon the state-of-the-art for the triangle case, and is also more general, applying to larger clique sizes. Our main technical contribution is a new approach for detecting cliques by encapsulating this as a search task for nodes that can be added to smaller cliques. To extract the best complexities out of our approach, we develop a framework for nested distributed quantum searches, which employ checking procedures that are quantum themselves. Moreover, we show a circuit-complexity barrier on proving a lower bound of the form $\Omega(n^{3/5+\epsilon})$ for $K_p$-detection for any $p \geq 4$, even in the classical (non-quantum) distributed CONGEST setting.

In this paper, we consider the distributed optimization problem in which $n$ agents, each possessing a local cost function, collaboratively minimize the average of the local cost functions over a connected network. To solve the problem, we propose a distributed random reshuffling (D-RR) algorithm that combines the classical distributed gradient descent (DGD) method with Random Reshuffling (RR). We show that D-RR inherits the superiority of RR for both smooth strongly convex and smooth nonconvex objective functions. In particular, for smooth strongly convex objective functions, D-RR achieves an $\mathcal{O}(1/T^2)$ rate of convergence (here, $T$ counts the total number of iterations) in terms of the squared distance between the iterate and the unique minimizer. When the objective function is assumed to be smooth nonconvex with Lipschitz continuous component functions, we show that D-RR drives the squared norm of the gradient to $0$ at a rate of $\mathcal{O}(1/T^{2/3})$. These convergence results match those of centralized RR (up to constant factors).
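A minimal sketch of the D-RR template, assuming quadratic component functions and a ring mixing matrix; the paper's step-size schedule and assumptions are more general. Each epoch, every agent reshuffles its local components, and each inner step interleaves a consensus (mixing) step with a local gradient step on the sampled component.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, m, d = 5, 20, 3     # agents, components per agent, dimension

# Agent i holds m quadratics f_{i,l}(x) = 0.5 * ||x - b[i, l]||^2,
# so grad f_{i,l}(x) = x - b[i, l].  A doubly stochastic ring mixing
# matrix W models the connected network.
b = rng.standard_normal((n_agents, m, d))
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] = W[i, (i + 1) % n_agents] = 0.25

x = np.zeros((n_agents, d))
for epoch in range(200):
    alpha = 1.0 / (epoch + 10)                  # diminishing step size
    perms = [rng.permutation(m) for _ in range(n_agents)]   # local reshuffling
    for l in range(m):
        x = W @ x                               # DGD-style consensus step
        for i in range(n_agents):
            grad = x[i] - b[i, perms[i][l]]     # gradient of sampled component
            x[i] = x[i] - alpha * grad

print("consensus error:", np.linalg.norm(x - x.mean(0)))
print("distance to minimizer:", np.linalg.norm(x.mean(0) - b.mean((0, 1))))
```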

The goal of regression is to recover, from noisy observations, an unknown underlying function that best links a set of predictors to an outcome. In nonparametric regression, one assumes that the regression function belongs to a pre-specified infinite-dimensional function space (the hypothesis space). In the online setting, where observations arrive in a stream, it is computationally preferable to iteratively update an estimate rather than repeatedly refit the entire model. Inspired by nonparametric sieve estimation and stochastic approximation methods, we propose a sieve stochastic gradient descent estimator (Sieve-SGD) for the case where the hypothesis space is a Sobolev ellipsoid. We show that Sieve-SGD has rate-optimal mean squared error (MSE) under a set of simple and direct conditions. The proposed estimator can be constructed with low computational (time and space) expense; we also formally show that Sieve-SGD requires almost minimal memory usage among all statistically rate-optimal estimators.
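A rough sketch of the sieve-plus-SGD idea, assuming a cosine basis for the Sobolev ellipsoid, a slowly growing number of active basis functions, and a hypothetical coordinate weighting; the paper's exact weights, step sizes, and rates differ.

```python
import numpy as np

def basis(x, J):
    """First J functions of the cosine basis on [0, 1]."""
    j = np.arange(1, J + 1)
    return np.sqrt(2.0) * np.cos(np.pi * j * x)

s = 2                                     # assumed Sobolev smoothness
J_max, theta = 200, np.zeros(200)
rng = np.random.default_rng(0)
f_true = lambda x: np.sin(2 * np.pi * x) + 0.5 * x

for t in range(1, 20001):                 # observations arrive in a stream
    x = rng.uniform()
    y = f_true(x) + 0.1 * rng.standard_normal()
    J_t = min(J_max, int(t ** (1.0 / (2 * s + 1))) + 1)   # slowly growing sieve
    phi = basis(x, J_t)
    resid = phi @ theta[:J_t] - y
    lr = 0.5 * t ** (-0.5)                # Robbins-Monro style step size
    # Shrink higher-frequency directions more, mimicking the ellipsoid
    # geometry (hypothetical weighting for illustration).
    weights = np.arange(1, J_t + 1) ** (-2 * s / (2 * s + 1))
    theta[:J_t] -= lr * resid * weights * phi

xs = np.linspace(0, 1, 200)[:, None]
mse = np.mean((basis(xs, J_max) @ theta - f_true(xs[:, 0])) ** 2)
print(f"estimated MSE on a grid: {mse:.4f}")
```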

State-of-the-art machine learning models are routinely trained on large-scale distributed clusters. Crucially, such systems can be compromised when some of the computing devices exhibit abnormal (Byzantine) behavior and return arbitrary results to the parameter server (PS). This behavior may be attributed to a plethora of reasons, including system failures and orchestrated attacks. Existing work suggests robust aggregation and/or computational redundancy to alleviate the effect of distorted gradients. However, most of these schemes are ineffective when an adversary knows the task assignment and can choose the attacked workers judiciously to induce maximal damage. Our proposed method, Aspis, assigns gradient computations to worker nodes using a subset-based assignment that allows for multiple consistency checks on the behavior of each worker node. Examination of the calculated gradients and post-processing (clique-finding in an appropriately constructed graph) by the central node allow for efficient detection and subsequent exclusion of adversaries from the training process. We prove the Byzantine resilience and detection guarantees of Aspis under weak and strong attacks and extensively evaluate the system on various large-scale training scenarios. The principal metric for our experiments is the test accuracy, for which we demonstrate a significant improvement of about 30% compared to many state-of-the-art approaches on the CIFAR-10 dataset. The corresponding reduction of the fraction of corrupted gradients ranges from 16% to 99%.
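A toy version of the subset assignment and clique-based detection, with hypothetical parameters (7 workers, replication factor 3) and brute-force max-clique search, which is only sensible at this scale; the real Aspis protocol differs in its assignment, attack model, and detection details.

```python
from itertools import combinations
import numpy as np

n_workers, r = 7, 3     # each gradient task is replicated on r workers
tasks = list(combinations(range(n_workers), r))   # subset-based assignment
byzantine = {0, 1}      # adversarial workers (unknown to the PS)

rng = np.random.default_rng(0)
true_grad = {t: rng.standard_normal(4) for t in tasks}

def returned(worker, task):
    """Honest workers return the true gradient; adversaries distort it."""
    g = true_grad[task]
    return g + rng.standard_normal(4) if worker in byzantine else g

# Agreement graph: connect two workers if they returned identical gradients
# on every task they share.
agree = np.ones((n_workers, n_workers), dtype=bool)
for task in tasks:
    outs = {w: returned(w, task) for w in task}
    for a, b in combinations(task, 2):
        if not np.allclose(outs[a], outs[b]):
            agree[a, b] = agree[b, a] = False

# Honest workers always agree with each other, so they form a clique.
best = max((set(c) for k in range(n_workers, 0, -1)
            for c in combinations(range(n_workers), k)
            if all(agree[a, b] for a, b in combinations(c, 2))),
           key=len)
print("detected honest set:", sorted(best))
```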

Heatmap-based methods dominate the field of human pose estimation by modelling the output distribution through likelihood heatmaps. In contrast, regression-based methods are more efficient but suffer from inferior performance. In this work, we explore maximum likelihood estimation (MLE) to develop an efficient and effective regression-based method. From the perspective of MLE, adopting a particular regression loss amounts to making a particular assumption about the output density function; a density function closer to the true distribution leads to better regression performance. In light of this, we propose a novel regression paradigm with Residual Log-likelihood Estimation (RLE) to capture the underlying output distribution. Concretely, RLE learns the change of the distribution instead of the unreferenced underlying distribution, which facilitates the training process. With the proposed reparameterization design, our method is compatible with off-the-shelf flow models. The proposed method is effective, efficient and flexible, and we show its potential in various human pose estimation tasks with comprehensive experiments. Compared to the conventional regression paradigm, regression with RLE brings a 12.4 mAP improvement on MSCOCO without any test-time overhead. Moreover, for the first time, our regression method is superior to heatmap-based methods, especially on multi-person pose estimation. Our code is available at //github.com/Jeff-sjtu/res-loglikelihood-regression
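The MLE view can be made concrete with a small example: predicting a per-sample mean and scale and minimizing the exact Gaussian negative log-likelihood already adapts the assumed density beyond a plain L2 loss (which corresponds to a fixed-scale Gaussian, just as L1 corresponds to a Laplace). RLE itself goes further by learning a flow-based residual correction to the assumed density, which this sketch does not show.

```python
import torch

def gaussian_nll(mu, log_sigma, target):
    """Exact negative Gaussian log-likelihood (up to an additive constant)."""
    sigma = log_sigma.exp()
    return (log_sigma + 0.5 * ((target - mu) / sigma) ** 2).mean()

# Toy regressor: predicts (mu, log_sigma) for each input.
net = torch.nn.Linear(8, 2)
opt = torch.optim.SGD(net.parameters(), lr=1e-2)
x = torch.randn(256, 8)
y = x.sum(dim=1, keepdim=True) + 0.5 * torch.randn(256, 1)

for _ in range(200):
    mu, log_sigma = net(x).chunk(2, dim=1)
    loss = gaussian_nll(mu, log_sigma, y)   # loss choice == density assumption
    opt.zero_grad()
    loss.backward()
    opt.step()
```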

Attributed graph clustering is challenging as it requires joint modelling of graph structures and node attributes. Recent progress on graph convolutional networks has proved that graph convolution is effective in combining structural and content information, and several recent methods based on it have achieved promising clustering performance on some real attributed networks. However, there is limited understanding of how graph convolution affects clustering performance and how to properly use it to optimize performance for different graphs. Existing methods essentially use graph convolution of a fixed and low order that only takes into account neighbours within a few hops of each node, which underutilizes node relations and ignores the diversity of graphs. In this paper, we propose an adaptive graph convolution method for attributed graph clustering that exploits high-order graph convolution to capture global cluster structure and adaptively selects the appropriate order for different graphs. We establish the validity of our method by theoretical analysis and extensive experiments on benchmark datasets. Empirical results show that our method compares favourably with state-of-the-art methods.
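A simplified sketch of the high-order-convolution-plus-clustering pipeline, with a stand-in order-selection criterion (within-cluster variance normalized by total variance of the filtered features); the paper's actual criterion and stopping rule differ, and the sketch assumes a connected graph with no isolated nodes.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_with_order(A, X, k, n_clusters):
    """k-order graph convolution (low-pass filtering) followed by k-means."""
    d = A.sum(1)
    S = A / np.sqrt(np.outer(d, d))          # symmetric normalization
    G = 0.5 * (np.eye(len(A)) + S)           # low-pass filter (I + S) / 2
    Xk = np.linalg.matrix_power(G, k) @ X    # aggregate k-hop neighbourhoods
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(Xk)
    # Label-free selection score: within-cluster variance / total variance.
    within = sum(((Xk[labels == c] - Xk[labels == c].mean(0)) ** 2).sum()
                 for c in range(n_clusters))
    total = ((Xk - Xk.mean(0)) ** 2).sum()
    return labels, within / total

def adaptive_order(A, X, n_clusters, k_max=12):
    """Try increasing orders and keep the one with the best score."""
    best = None
    for k in range(1, k_max + 1):
        labels, score = cluster_with_order(A, X, k, n_clusters)
        if best is None or score < best[2]:
            best = (k, labels, score)
    return best[0], best[1]
```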

In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
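The smoothing primitive behind DRS can be sketched on a single machine: Gaussian smoothing $f_\gamma(x) = \mathbb{E}_u[f(x + \gamma u)]$ admits the standard Monte-Carlo gradient estimator below. The distributed aspects (consensus averaging over the network) are omitted, and the objective here is a hypothetical non-smooth example.

```python
import numpy as np

rng = np.random.default_rng(0)
d, gamma, n_samples = 20, 0.1, 64

f = lambda x: np.abs(x).sum()     # non-smooth convex objective (L1 norm)

def smoothed_grad(x):
    """Monte-Carlo gradient of the Gaussian smoothing f_gamma(x) = E f(x + gamma*u)."""
    u = rng.standard_normal((n_samples, d))
    fx = np.apply_along_axis(f, 1, x + gamma * u)
    # Using f(x) as a baseline reduces the variance of the estimator.
    return ((fx - f(x))[:, None] * u).mean(0) / gamma

x = rng.standard_normal(d)
for t in range(1, 2001):
    x -= (1.0 / np.sqrt(t)) * smoothed_grad(x)   # O(1/sqrt(t)) step size
print("f(x) after smoothing-based descent:", f(x))
```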

In this paper, we study the optimal convergence rates for distributed convex optimization problems over networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m} f_i(\mathbf{x})$ is strongly convex and smooth, only strongly convex, only smooth, or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvement of the condition numbers.
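A small self-contained sketch of dual acceleration under a consensus constraint, assuming scalar quadratic local functions so the inner minimization is closed-form; every multiplication by the Laplacian W corresponds to one round of local communication, and the details of the paper's general algorithm differ.

```python
import numpy as np

# Ring network of m agents, each with a local quadratic f_i(x) = a_i/2 (x - b_i)^2.
m = 6
rng = np.random.default_rng(0)
a, b = rng.uniform(1, 2, m), rng.standard_normal(m)

# Graph Laplacian W encodes the affine consensus constraint W @ x = 0.
W = 2 * np.eye(m)
for i in range(m):
    W[i, (i - 1) % m] = W[i, (i + 1) % m] = -1.0

# Dual function g(lam) = min_x sum_i f_i(x_i) + lam @ (W @ x); for quadratics
# the inner minimizer is closed-form, and grad g(lam) = W @ x_star(lam).
def x_star(lam):
    return b - (W @ lam) / a

L = np.linalg.eigvalsh(W)[-1] ** 2 / a.min()    # Lipschitz constant of grad g
lam = y = np.zeros(m)
for k in range(1, 500):
    lam_next = y + (W @ x_star(y)) / L                     # dual ascent step
    y = lam_next + (k - 1) / (k + 2) * (lam_next - lam)    # Nesterov momentum
    lam = lam_next

x = x_star(lam)
print("consensus value:", x)
print("centralized optimum:", np.sum(a * b) / np.sum(a))
```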
