人妻无码精品人妻,又大又黄又粗又色在线播放,国产AV网站更新

In this article, we propose a new approach, optimize then agree for minimizing a sum $ f = \sum_{i=1}^n f_i(x)$ of convex objective functions over a directed graph. The optimize then agree approach decouples the optimization step and the consensus step in a distributed optimization framework. The key motivation for optimize then agree is to guarantee that the disagreement between the estimates of the agents during every iteration of the distributed optimization algorithm remains under any apriori specified tolerance; existing algorithms do not provide such a guarantee which is required in many practical scenarios. In this method, each agent during each iteration maintains an estimate of the optimal solution and, utilizes its locally available gradient information along with a finite-time approximate consensus protocol to move towards the optimal solution (hence the name Gradient-Consensus algorithm). We establish that the proposed algorithm has a global R-linear rate of convergence if the aggregate function $f$ is strongly convex and Lipschitz differentiable. We also show that under the relaxed assumption of $f_i$'s being convex and Lipschitz differentiable, the objective function error residual decreases at a Q-linear rate (in terms of the number of gradient computation steps) until it reaches a small value, which can be managed using the tolerance value specified on the finite-time approximate consensus protocol; no existing method in the literature has such strong convergence guarantees when $f_i$ are not necessarily strongly convex functions. The communication overhead for the improved guarantees on meeting constraints and better convergence of our algorithm is $O(k\log k)$ iterates in comparison to $O(k)$ of the traditional algorithms. Further, we numerically evaluate the performance of the proposed algorithm by solving a distributed logistic regression problem.

相關內容

優化器

關注 4

正則的 · 核化 · Extensibility · 可約的 · 估計/估計量 ·

2021 年 7 月 19 日

The Completion of Covariance Kernels

Kartik G. Waghmare,Victor M. Panaretos

from arxiv, Typos corrected

We consider the problem of positive-semidefinite continuation: extending a partially specified covariance kernel from a subdomain $\Omega$ of a domain $I\times I$ to a covariance kernel on the entire domain $I\times I$. For a broad class of domains $\Omega$ called serrated domains, we are able to present a complete theory. Namely, we demonstrate that a canonical completion always exists and can be explicitly constructed. We characterise all possible completions as suitable perturbations of the canonical completion, and determine necessary and sufficient conditions for a unique completion to exist. We interpret the canonical completion via the graphical model structure it induces on the associated Gaussian process. Furthermore, we show how the estimation of the canonical completion reduces to the solution of a system of linear statistical inverse problems in the space of Hilbert-Schmidt operators, and derive rates of convergence under standard source conditions. We conclude by providing extensions of our theory to more general forms of domains.

圖 · 估計/估計量 · 學成 · 稀疏 · Performer ·

2021 年 7 月 16 日

Online Graph Topology Learning from Matrix-valued Time Series

Yiye Jiang,Jérémie Bigot,Sofian Maabout

This paper is concerned with the statistical analysis of matrix-valued time series. These are data collected over a network of sensors (typically a set of spatial locations), recording, over time, observations of multiple measurements. From such data, we propose to learn, in an online fashion, a graph that captures two aspects of dependency: one describing the sparse spatial relationship between sensors, and the other characterizing the measurement relationship. To this purpose, we introduce a novel multivariate autoregressive model to infer the graph topology encoded in the coefficient matrix which captures the sparse Granger causality dependency structure present in such matrix-valued time series. We decompose the graph by imposing a Kronecker sum structure on the coefficient matrix. We develop two online approaches to learn the graph in a recursive way. The first one uses Wald test for the projected OLS estimation, where we derive the asymptotic distribution for the estimator. For the second one, we formalize a Lasso-type optimization problem. We rely on homotopy algorithms to derive updating rules for estimating the coefficient matrix. Furthermore, we provide an adaptive tuning procedure for the regularization parameter. Numerical experiments using both synthetic and real data, are performed to support the effectiveness of the proposed learning approaches.

聯邦學習 · Principle · 學成 · INFORMS · 全局最小值 ·

2021 年 7 月 15 日

DeFed: A Principled Decentralized and Privacy-Preserving Federated Learning Algorithm

Ye Yuan,Ruijuan Chen,Chuan Sun,Maolin Wang,Feng Hua,Xinlei Yi,Tao Yang,Jun Liu

Federated learning enables a large number of clients to participate in learning a shared model while maintaining the training data stored in each client, which protects data privacy and security. Till now, federated learning frameworks are built in a centralized way, in which a central client is needed for collecting and distributing information from every other client. This not only leads to high communication pressure at the central client, but also renders the central client highly vulnerable to failure and attack. Here we propose a principled decentralized federated learning algorithm (DeFed), which removes the central client in the classical Federated Averaging (FedAvg) setting and only relies information transmission between clients and their local neighbors. The proposed DeFed algorithm is proven to reach the global minimum with a convergence rate of $O(1/T)$ when the loss function is smooth and strongly convex, where $T$ is the number of iterations in gradient descent. Finally, the proposed algorithm has been applied to a number of toy examples to demonstrate its effectiveness.

簇 · 磁流變材料 · 優化器 · 大學 · 極小點 ·

2021 年 7 月 14 日

Universal Algorithms for Clustering Problems

Arun Ganesh,Bruce M. Maggs,Debmalya Panigrahi

from arxiv, Appeared in ICALP 2021, Track A. Fixed mismatch between paper title and arXiv title

This paper presents universal algorithms for clustering problems, including the widely studied $k$-median, $k$-means, and $k$-center objectives. The input is a metric space containing all potential client locations. The algorithm must select $k$ cluster centers such that they are a good solution for any subset of clients that actually realize. Specifically, we aim for low regret, defined as the maximum over all subsets of the difference between the cost of the algorithm's solution and that of an optimal solution. A universal algorithm's solution $SOL$ for a clustering problem is said to be an $(\alpha, \beta)$-approximation if for all subsets of clients $C'$, it satisfies $SOL(C') \leq \alpha \cdot OPT(C') + \beta \cdot MR$, where $OPT(C')$ is the cost of the optimal solution for clients $C'$ and $MR$ is the minimum regret achievable by any solution. Our main results are universal algorithms for the standard clustering objectives of $k$-median, $k$-means, and $k$-center that achieve $(O(1), O(1))$-approximations. These results are obtained via a novel framework for universal algorithms using linear programming (LP) relaxations. These results generalize to other $\ell_p$-objectives and the setting where some subset of the clients are fixed. We also give hardness results showing that $(\alpha, \beta)$-approximation is NP-hard if $\alpha$ or $\beta$ is at most a certain constant, even for the widely studied special case of Euclidean metric spaces. This shows that in some sense, $(O(1), O(1))$-approximation is the strongest type of guarantee obtainable for universal clustering.

DRM · 似然 · 駐點 · 估計/估計量 · 平穩的 ·

2021 年 7 月 14 日

Likelihood ratio-based policy gradient methods for distorted risk measures: A non-asymptotic analysis

Nithia Vijayan,Prashanth L. A

We propose policy-gradient algorithms for solving the problem of control in a risk-sensitive reinforcement learning (RL) context. The objective of our algorithm is to maximize the distorted risk measure (DRM) of the cumulative reward in an episodic Markov decision process (MDP). We derive a variant of the policy gradient theorem that caters to the DRM objective. Using this theorem in conjunction with a likelihood ratio (LR) based gradient estimation scheme, we propose policy gradient algorithms for optimizing DRM in both on-policy and off-policy RL settings. We derive non-asymptotic bounds that establish the convergence of our algorithms to an approximate stationary point of the DRM objective.

優化器 · Engineering · 設計 · INFORMS · Performer ·

2021 年 7 月 13 日

Certifiable Risk-Based Engineering Design Optimization

Anirban Chaudhuri,Boris Kramer,Matthew Norton,Johannes O. Royset,Karen Willcox

Reliable, risk-averse design of complex engineering systems with optimized performance requires dealing with uncertainties. A conventional approach is to add safety margins to a design that was obtained from deterministic optimization. Safer engineering designs require appropriate cost and constraint function definitions that capture the \textit{risk} associated with unwanted system behavior in the presence of uncertainties. The paper proposes two notions of certifiability. The first is based on accounting for the magnitude of failure to ensure data-informed conservativeness. The second is the ability to provide optimization convergence guarantees by preserving convexity. Satisfying these notions leads to \textit{certifiable} risk-based design optimization (CRiBDO). In the context of CRiBDO, risk measures based on superquantile (a.k.a.\ conditional value-at-risk) and buffered probability of failure are analyzed. CRiBDO is contrasted with reliability-based design optimization (RBDO), where uncertainties are accounted for via the probability of failure, through a structural and a thermal design problem. A reformulation of the short column structural design problem leading to a convex CRiBDO problem is presented. The CRiBDO formulations capture more information about the problem to assign the appropriate conservativeness, exhibit superior optimization convergence by preserving properties of underlying functions, and alleviate the adverse effects of choosing hard failure thresholds required in RBDO.

上下文賭博機/上下文老虎機 · 賭博機/老虎機 · 估計/估計量 · 約束 · 學成 ·

2020 年 4 月 2 日

Hierarchical Adaptive Contextual Bandits for Resource Constraint based Recommendation

Mengyue Yang,Qingyang Li,Zhiwei Qin,Jieping Ye

from arxiv, Accepted for publication at WWW (The Web Conference) 2020

Contextual multi-armed bandit (MAB) achieves cutting-edge performance on a variety of problems. When it comes to real-world scenarios such as recommendation system and online advertising, however, it is essential to consider the resource consumption of exploration. In practice, there is typically non-zero cost associated with executing a recommendation (arm) in the environment, and hence, the policy should be learned with a fixed exploration cost constraint. It is challenging to learn a global optimal policy directly, since it is a NP-hard problem and significantly complicates the exploration and exploitation trade-off of bandit algorithms. Existing approaches focus on solving the problems by adopting the greedy policy which estimates the expected rewards and costs and uses a greedy selection based on each arm's expected reward/cost ratio using historical observation until the exploration resource is exhausted. However, existing methods are hard to extend to infinite time horizon, since the learning process will be terminated when there is no more resource. In this paper, we propose a hierarchical adaptive contextual bandit method (HATCH) to conduct the policy learning of contextual bandits with a budget constraint. HATCH adopts an adaptive method to allocate the exploration resource based on the remaining resource/time and the estimation of reward distribution among different user contexts. In addition, we utilize full of contextual feature information to find the best personalized recommendation. Finally, in order to prove the theoretical guarantee, we present a regret bound analysis and prove that HATCH achieves a regret bound as low as $O(\sqrt{T})$. The experimental results demonstrate the effectiveness and efficiency of the proposed method on both synthetic data sets and the real-world applications.

優化器 · MoDELS · 分布式機器學習 · Performer · CIFAR-10 ·

2020 年 2 月 18 日

Distributed Non-Convex Optimization with Sublinear Speedup under Intermittent Client Availability

Yikai Yan,Chaoyue Niu,Yucheng Ding,Zhenzhe Zheng,Fan Wu,Guihai Chen,Shaojie Tang,Zhihua Wu

from arxiv, ICML 2020 Submission

Federated learning is a new distributed machine learning framework, where a bunch of heterogeneous clients collaboratively train a model without sharing training data. In this work, we consider a practical and ubiquitous issue in federated learning: intermittent client availability, where the set of eligible clients may change during the training process. Such an intermittent client availability model would significantly deteriorate the performance of the classical Federated Averaging algorithm (FedAvg for short). We propose a simple distributed non-convex optimization algorithm, called Federated Latest Averaging (FedLaAvg for short), which leverages the latest gradients of all clients, even when the clients are not available, to jointly update the global model in each iteration. Our theoretical analysis shows that FedLaAvg attains the convergence rate of $O(1/(N^{1/4} T^{1/2}))$, achieving a sublinear speedup with respect to the total number of clients. We implement and evaluate FedLaAvg with the CIFAR-10 dataset. The evaluation results demonstrate that FedLaAvg indeed reaches a sublinear speedup and achieves 4.23% higher test accuracy than FedAvg.

優化器 · Lipschitz連續 · 正則化項 · Continuity · Lipschitz ·

2018 年 6 月 1 日

Optimal Algorithms for Non-Smooth Distributed Optimization in Networks

Kevin Scaman,Francis Bach,Sébastien Bubeck,Yin Tat Lee,Laurent Massoulié

from arxiv, 17 pages

In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.

學成 · 強化學習 · 深度強化學習 · 近似 · INFORMS ·

2018 年 4 月 22 日

Diff-DAC: Distributed Actor-Critic for Average Multitask Deep Reinforcement Learning

Sergio Valcarcel Macua,Aleksi Tukiainen,Daniel García-Oca?a Hernández,David Baldazo,Enrique Munoz de Cote,Santiago Zazo

We propose a fully distributed actor-critic algorithm approximated by deep neural networks, named \textit{Diff-DAC}, with application to single-task and to average multitask reinforcement learning (MRL). Each agent has access to data from its local task only, but it aims to learn a policy that performs well on average for the whole set of tasks. During the learning process, agents communicate their value-policy parameters to their neighbors, diffusing the information across the network, so that they converge to a common policy, with no need for a central node. The method is scalable, since the computational and communication costs per agent grow with its number of neighbors. We derive Diff-DAC's from duality theory and provide novel insights into the standard actor-critic framework, showing that it is actually an instance of the dual ascent method that approximates the solution of a linear program. Experiments suggest that Diff-DAC can outperform the single previous distributed MRL approach (i.e., Dist-MTLPS) and even the centralized architecture.