亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

In this article, we propose a new approach, optimize then agree for minimizing a sum $ f = \sum_{i=1}^n f_i(x)$ of convex objective functions over a directed graph. The optimize then agree approach decouples the optimization step and the consensus step in a distributed optimization framework. The key motivation for optimize then agree is to guarantee that the disagreement between the estimates of the agents during every iteration of the distributed optimization algorithm remains under any apriori specified tolerance; existing algorithms do not provide such a guarantee which is required in many practical scenarios. In this method, each agent during each iteration maintains an estimate of the optimal solution and, utilizes its locally available gradient information along with a finite-time approximate consensus protocol to move towards the optimal solution (hence the name Gradient-Consensus algorithm). We establish that the proposed algorithm has a global R-linear rate of convergence if the aggregate function $f$ is strongly convex and Lipschitz differentiable. We also show that under the relaxed assumption of $f_i$'s being convex and Lipschitz differentiable, the objective function error residual decreases at a Q-linear rate (in terms of the number of gradient computation steps) until it reaches a small value, which can be managed using the tolerance value specified on the finite-time approximate consensus protocol; no existing method in the literature has such strong convergence guarantees when $f_i$ are not necessarily strongly convex functions. The communication overhead for the improved guarantees on meeting constraints and better convergence of our algorithm is $O(k\log k)$ iterates in comparison to $O(k)$ of the traditional algorithms. Further, we numerically evaluate the performance of the proposed algorithm by solving a distributed logistic regression problem.

相關內容

We consider the problem of positive-semidefinite continuation: extending a partially specified covariance kernel from a subdomain $\Omega$ of a domain $I\times I$ to a covariance kernel on the entire domain $I\times I$. For a broad class of domains $\Omega$ called serrated domains, we are able to present a complete theory. Namely, we demonstrate that a canonical completion always exists and can be explicitly constructed. We characterise all possible completions as suitable perturbations of the canonical completion, and determine necessary and sufficient conditions for a unique completion to exist. We interpret the canonical completion via the graphical model structure it induces on the associated Gaussian process. Furthermore, we show how the estimation of the canonical completion reduces to the solution of a system of linear statistical inverse problems in the space of Hilbert-Schmidt operators, and derive rates of convergence under standard source conditions. We conclude by providing extensions of our theory to more general forms of domains.

This paper is concerned with the statistical analysis of matrix-valued time series. These are data collected over a network of sensors (typically a set of spatial locations), recording, over time, observations of multiple measurements. From such data, we propose to learn, in an online fashion, a graph that captures two aspects of dependency: one describing the sparse spatial relationship between sensors, and the other characterizing the measurement relationship. To this purpose, we introduce a novel multivariate autoregressive model to infer the graph topology encoded in the coefficient matrix which captures the sparse Granger causality dependency structure present in such matrix-valued time series. We decompose the graph by imposing a Kronecker sum structure on the coefficient matrix. We develop two online approaches to learn the graph in a recursive way. The first one uses Wald test for the projected OLS estimation, where we derive the asymptotic distribution for the estimator. For the second one, we formalize a Lasso-type optimization problem. We rely on homotopy algorithms to derive updating rules for estimating the coefficient matrix. Furthermore, we provide an adaptive tuning procedure for the regularization parameter. Numerical experiments using both synthetic and real data, are performed to support the effectiveness of the proposed learning approaches.

Federated learning enables a large number of clients to participate in learning a shared model while maintaining the training data stored in each client, which protects data privacy and security. Till now, federated learning frameworks are built in a centralized way, in which a central client is needed for collecting and distributing information from every other client. This not only leads to high communication pressure at the central client, but also renders the central client highly vulnerable to failure and attack. Here we propose a principled decentralized federated learning algorithm (DeFed), which removes the central client in the classical Federated Averaging (FedAvg) setting and only relies information transmission between clients and their local neighbors. The proposed DeFed algorithm is proven to reach the global minimum with a convergence rate of $O(1/T)$ when the loss function is smooth and strongly convex, where $T$ is the number of iterations in gradient descent. Finally, the proposed algorithm has been applied to a number of toy examples to demonstrate its effectiveness.

This paper presents universal algorithms for clustering problems, including the widely studied $k$-median, $k$-means, and $k$-center objectives. The input is a metric space containing all potential client locations. The algorithm must select $k$ cluster centers such that they are a good solution for any subset of clients that actually realize. Specifically, we aim for low regret, defined as the maximum over all subsets of the difference between the cost of the algorithm's solution and that of an optimal solution. A universal algorithm's solution $SOL$ for a clustering problem is said to be an $(\alpha, \beta)$-approximation if for all subsets of clients $C'$, it satisfies $SOL(C') \leq \alpha \cdot OPT(C') + \beta \cdot MR$, where $OPT(C')$ is the cost of the optimal solution for clients $C'$ and $MR$ is the minimum regret achievable by any solution. Our main results are universal algorithms for the standard clustering objectives of $k$-median, $k$-means, and $k$-center that achieve $(O(1), O(1))$-approximations. These results are obtained via a novel framework for universal algorithms using linear programming (LP) relaxations. These results generalize to other $\ell_p$-objectives and the setting where some subset of the clients are fixed. We also give hardness results showing that $(\alpha, \beta)$-approximation is NP-hard if $\alpha$ or $\beta$ is at most a certain constant, even for the widely studied special case of Euclidean metric spaces. This shows that in some sense, $(O(1), O(1))$-approximation is the strongest type of guarantee obtainable for universal clustering.

We propose policy-gradient algorithms for solving the problem of control in a risk-sensitive reinforcement learning (RL) context. The objective of our algorithm is to maximize the distorted risk measure (DRM) of the cumulative reward in an episodic Markov decision process (MDP). We derive a variant of the policy gradient theorem that caters to the DRM objective. Using this theorem in conjunction with a likelihood ratio (LR) based gradient estimation scheme, we propose policy gradient algorithms for optimizing DRM in both on-policy and off-policy RL settings. We derive non-asymptotic bounds that establish the convergence of our algorithms to an approximate stationary point of the DRM objective.

Reliable, risk-averse design of complex engineering systems with optimized performance requires dealing with uncertainties. A conventional approach is to add safety margins to a design that was obtained from deterministic optimization. Safer engineering designs require appropriate cost and constraint function definitions that capture the \textit{risk} associated with unwanted system behavior in the presence of uncertainties. The paper proposes two notions of certifiability. The first is based on accounting for the magnitude of failure to ensure data-informed conservativeness. The second is the ability to provide optimization convergence guarantees by preserving convexity. Satisfying these notions leads to \textit{certifiable} risk-based design optimization (CRiBDO). In the context of CRiBDO, risk measures based on superquantile (a.k.a.\ conditional value-at-risk) and buffered probability of failure are analyzed. CRiBDO is contrasted with reliability-based design optimization (RBDO), where uncertainties are accounted for via the probability of failure, through a structural and a thermal design problem. A reformulation of the short column structural design problem leading to a convex CRiBDO problem is presented. The CRiBDO formulations capture more information about the problem to assign the appropriate conservativeness, exhibit superior optimization convergence by preserving properties of underlying functions, and alleviate the adverse effects of choosing hard failure thresholds required in RBDO.

Contextual multi-armed bandit (MAB) achieves cutting-edge performance on a variety of problems. When it comes to real-world scenarios such as recommendation system and online advertising, however, it is essential to consider the resource consumption of exploration. In practice, there is typically non-zero cost associated with executing a recommendation (arm) in the environment, and hence, the policy should be learned with a fixed exploration cost constraint. It is challenging to learn a global optimal policy directly, since it is a NP-hard problem and significantly complicates the exploration and exploitation trade-off of bandit algorithms. Existing approaches focus on solving the problems by adopting the greedy policy which estimates the expected rewards and costs and uses a greedy selection based on each arm's expected reward/cost ratio using historical observation until the exploration resource is exhausted. However, existing methods are hard to extend to infinite time horizon, since the learning process will be terminated when there is no more resource. In this paper, we propose a hierarchical adaptive contextual bandit method (HATCH) to conduct the policy learning of contextual bandits with a budget constraint. HATCH adopts an adaptive method to allocate the exploration resource based on the remaining resource/time and the estimation of reward distribution among different user contexts. In addition, we utilize full of contextual feature information to find the best personalized recommendation. Finally, in order to prove the theoretical guarantee, we present a regret bound analysis and prove that HATCH achieves a regret bound as low as $O(\sqrt{T})$. The experimental results demonstrate the effectiveness and efficiency of the proposed method on both synthetic data sets and the real-world applications.

Federated learning is a new distributed machine learning framework, where a bunch of heterogeneous clients collaboratively train a model without sharing training data. In this work, we consider a practical and ubiquitous issue in federated learning: intermittent client availability, where the set of eligible clients may change during the training process. Such an intermittent client availability model would significantly deteriorate the performance of the classical Federated Averaging algorithm (FedAvg for short). We propose a simple distributed non-convex optimization algorithm, called Federated Latest Averaging (FedLaAvg for short), which leverages the latest gradients of all clients, even when the clients are not available, to jointly update the global model in each iteration. Our theoretical analysis shows that FedLaAvg attains the convergence rate of $O(1/(N^{1/4} T^{1/2}))$, achieving a sublinear speedup with respect to the total number of clients. We implement and evaluate FedLaAvg with the CIFAR-10 dataset. The evaluation results demonstrate that FedLaAvg indeed reaches a sublinear speedup and achieves 4.23% higher test accuracy than FedAvg.

In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.

We propose a fully distributed actor-critic algorithm approximated by deep neural networks, named \textit{Diff-DAC}, with application to single-task and to average multitask reinforcement learning (MRL). Each agent has access to data from its local task only, but it aims to learn a policy that performs well on average for the whole set of tasks. During the learning process, agents communicate their value-policy parameters to their neighbors, diffusing the information across the network, so that they converge to a common policy, with no need for a central node. The method is scalable, since the computational and communication costs per agent grow with its number of neighbors. We derive Diff-DAC's from duality theory and provide novel insights into the standard actor-critic framework, showing that it is actually an instance of the dual ascent method that approximates the solution of a linear program. Experiments suggest that Diff-DAC can outperform the single previous distributed MRL approach (i.e., Dist-MTLPS) and even the centralized architecture.

北京阿比特科技有限公司