We present trust region bounds for optimizing decentralized policies in cooperative Multi-Agent Reinforcement Learning (MARL), which hold even when the transition dynamics are non-stationary. This new analysis provides a theoretical understanding of the strong performance of two recent actor-critic methods for MARL, which both rely on independent ratios, i.e., probability ratios computed separately for each agent's policy. We show that, despite the non-stationarity that independent ratios cause, a monotonic improvement guarantee still arises as a result of enforcing the trust region constraint over all decentralized policies. We also show that this trust region constraint can be effectively enforced in a principled way by bounding independent ratios based on the number of agents in training, providing a theoretical foundation for proximal ratio clipping. Finally, our empirical results support the hypothesis that the strong performance of IPPO and MAPPO is a direct result of enforcing such a trust region constraint via clipping in centralized training and of tuning the hyperparameters with regard to the number of agents, as predicted by our theoretical analysis.
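
The sketch below illustrates the kind of per-agent ratio clipping the abstract refers to: shrinking each agent's clip range with the number of agents so that the product of independent ratios stays inside a joint trust region. The specific formula, the shared-advantage surrogate, and all constants are our own illustration, not necessarily the exact prescription of IPPO/MAPPO or of this paper.

```python
import numpy as np

def per_agent_clip_range(eps_joint: float, n_agents: int) -> float:
    """Choose a per-agent clip range so that the product of n per-agent ratios,
    each at most 1 + eps_agent, stays below the joint bound 1 + eps_joint."""
    return (1.0 + eps_joint) ** (1.0 / n_agents) - 1.0

def clipped_surrogate(ratios: np.ndarray, advantages: np.ndarray, eps: float) -> float:
    """PPO-style clipped surrogate, applied independently to each agent's ratio.
    ratios: shape (batch, n_agents); advantages: shape (batch,), shared across agents."""
    adv = advantages[:, None]
    unclipped = ratios * adv
    clipped = np.clip(ratios, 1.0 - eps, 1.0 + eps) * adv
    return np.minimum(unclipped, clipped).mean()

# toy usage: 3 agents, joint trust region parameter 0.2
rng = np.random.default_rng(0)
n_agents, eps_joint = 3, 0.2
eps_agent = per_agent_clip_range(eps_joint, n_agents)
ratios = np.exp(rng.normal(0.0, 0.1, size=(64, n_agents)))   # pi_new / pi_old per agent
advantages = rng.normal(size=64)
print(f"per-agent clip range: {eps_agent:.4f}")
print(f"surrogate objective:  {clipped_surrogate(ratios, advantages, eps_agent):.4f}")
```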

Related Content

In this work, we give sufficient conditions for the almost global asymptotic stability of a cascade in which the inner loop and the unforced outer loop are each almost globally asymptotically stable. Our qualitative approach relies on the absence of chain recurrence for non-equilibrium points of the unforced outer loop, the hyperbolicity of equilibria, and the precompactness of forward trajectories. The result is extended inductively to upper triangular systems with an arbitrary number of subsystems. We show that the required structure of the chain recurrent set can be readily verified, and describe two important classes of systems with this property. We also show that the precompactness requirement can be verified by growth rate conditions on the interconnection term coupling the subsystems. Our results stand in contrast to prior works that require either global asymptotic stability of the subsystems (impossible for smooth systems evolving on general manifolds), time scale separation between the subsystems, or strong disturbance robustness properties of the outer loop. The approach has clear applications in stability certification of cascaded controllers for systems evolving on manifolds.
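
As a concrete illustration of the setting (our own toy system, not one from the paper), the sketch below simulates a two-subsystem upper triangular cascade whose inner loop and unforced outer loop are each asymptotically stable, and numerically checks that forward trajectories remain bounded and converge.

```python
import numpy as np

def cascade_rhs(state):
    """Illustrative upper triangular cascade:
       outer:  x' = -x + y**2   (unforced outer loop x' = -x is stable)
       inner:  y' = -y          (inner loop is stable)
    The y**2 term is the interconnection coupling the subsystems."""
    x, y = state
    return np.array([-x + y**2, -y])

def simulate(state0, dt=1e-3, steps=20_000):
    state = np.array(state0, dtype=float)
    peak = np.linalg.norm(state)
    for _ in range(steps):
        state = state + dt * cascade_rhs(state)      # forward Euler integration
        peak = max(peak, np.linalg.norm(state))
    return state, peak

# check boundedness (precompactness) of forward trajectories and convergence to the origin
for x0 in [(-3.0, 2.0), (5.0, -4.0), (0.5, 8.0)]:
    final, peak = simulate(x0)
    print(f"start {x0}: peak |state| = {peak:.2f}, final |state| = {np.linalg.norm(final):.2e}")
```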

The focus of the present work is the (theoretical) approximation of a solution of the p(x)-Poisson equation. To devise an iterative solver with guaranteed convergence, we will consider a relaxation of the original problem in terms of a truncation of the nonlinearity from below and from above by using a pair of positive cut-off parameters. We will then verify that, for any such pair, a damped Kačanov scheme generates a sequence converging to a solution of the relaxed equation. Subsequently, it will be shown that the solutions of the relaxed problems converge to the solution of the original problem in the discrete setting, and that the discrete solutions of the unrelaxed problem converge to the continuous solution. Our work is finally rounded off with some numerical experiments that underline the analytical findings.
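
A minimal one-dimensional sketch of a damped Kačanov iteration with truncated coefficients is given below; the variable exponent, cut-offs, damping parameter, and finite-difference discretization are all illustrative choices, not the paper's setting or proof.

```python
import numpy as np

# 1D model problem: -(a(x,|u'|) u')' = 1 on (0,1), u(0)=u(1)=0,
# with a(x,t) = t**(p(x)-2) truncated between cut-offs eps_minus and eps_plus.
N = 200
h = 1.0 / N
xm = (np.arange(N) + 0.5) * h                 # midpoints, where the coefficient lives
p = 1.3 + 0.5 * xm                            # illustrative variable exponent p(x)
f = np.ones(N - 1)                            # right-hand side at interior nodes
eps_minus, eps_plus = 1e-3, 1e3               # relaxation cut-off parameters
damping = 0.7                                 # damping parameter of the Kacanov scheme

def solve_linearized(u):
    """One Kacanov step: freeze the coefficient at the current iterate and solve
    the resulting linear (tridiagonal) problem."""
    grad = np.diff(np.concatenate(([0.0], u, [0.0]))) / h       # u' at midpoints
    t = np.clip(np.abs(grad), eps_minus, eps_plus)               # truncated |u'|
    a = t ** (p - 2.0)
    A = np.zeros((N - 1, N - 1))
    for i in range(N - 1):
        A[i, i] = (a[i] + a[i + 1]) / h**2
        if i > 0:
            A[i, i - 1] = -a[i] / h**2
        if i < N - 2:
            A[i, i + 1] = -a[i + 1] / h**2
    return np.linalg.solve(A, f)

u = np.zeros(N - 1)
for it in range(100):
    u_hat = solve_linearized(u)
    step = u_hat - u
    u = u + damping * step                                       # damped update
    if np.linalg.norm(step) < 1e-8:
        break
print(f"increment norm {np.linalg.norm(step):.2e} after {it + 1} iterations, max u = {u.max():.4f}")
```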

Federated optimization, wherein several agents in a network collaborate with a central server to achieve the optimal social cost over the network without exchanging information among agents, has attracted significant interest from the research community. In this context, agents demand resources based on their local computation. Because optimization parameters such as states, constraints, or objective functions are exchanged with a central server, an adversary may infer sensitive information about the agents. We develop LDP-AIMD, a local differentially-private additive-increase and multiplicative-decrease (AIMD) algorithm, to allocate multiple divisible shared resources to agents in a network. The LDP-AIMD algorithm provides a differential privacy guarantee to agents in the network. No inter-agent communication is required; however, the central server keeps track of the aggregate consumption of resources. We present experimental results to demonstrate the efficacy of the algorithm, along with an empirical analysis of the trade-off between privacy and efficiency.
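
The following toy sketch shows the general shape of a locally-private AIMD loop for one shared resource: agents perturb their reports with Laplace noise before the server aggregates them, and react to the server's capacity signal with additive increase or multiplicative decrease. The noise scale, constants, and single-resource setting are illustrative; a real LDP guarantee must calibrate the noise to the sensitivity of the reports, which this sketch does not do.

```python
import numpy as np

rng = np.random.default_rng(1)
n_agents, capacity = 10, 100.0
alpha, beta = 1.0, 0.7                 # additive-increase step, multiplicative-decrease factor
eps_ldp = 1.0                          # nominal local privacy budget per report
noise_scale = 1.0 / eps_ldp            # illustrative Laplace scale (not calibrated to true sensitivity)

demand = rng.uniform(0.0, 5.0, n_agents)
for step in range(200):
    # each agent perturbs its report before sending it (local perturbation, no inter-agent messages)
    noisy_reports = demand + rng.laplace(0.0, noise_scale, n_agents)
    aggregate = noisy_reports.sum()    # the server only tracks the (noisy) aggregate consumption
    if aggregate >= capacity:
        demand *= beta                 # capacity event broadcast by the server: multiplicative decrease
    else:
        demand += alpha                # otherwise: additive increase
print(f"final allocations: {np.round(demand, 2)}  (sum = {demand.sum():.1f})")
```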

In contact-rich manipulation, the robot dynamics are coupled with an environment that has application-specific dynamic properties (stiffness, inertia) and geometry (contact normal). Knowledge of these environmental parameters can improve control and monitoring, but they are often unobserved and may vary, either online or between task instances. Observers, such as the extended Kalman filter, can be used to estimate these parameters, but such model-based techniques can require too much engineering work to scale up to complex environments, such as multi-point contact. To accelerate environment modeling, we propose environment primitives: parameterized environment dynamics that can be connected in parallel and are expressed in an automatic differentiation framework. This simplifies offline gradient-based optimization to fit model parameters and linearization of the coupled dynamics for an observer. This method is implemented for stiffness contact models, allowing the fitting of contact geometry and stiffness offline or their online estimation by an extended Kalman filter. This method is applied to a collaborative robot, estimating external force, contact stiffness, and contact geometry from the motor position and current. The estimates of external force and stiffness are compared with a momentum observer and direct force measurements.
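
A drastically simplified scalar sketch of the "parameterized contact model plus observer" idea follows: a single linear spring contact, fit offline by least squares and tracked online by a scalar Kalman-style update. The paper's framework handles multi-point contact, estimates the contact normal as well, and uses automatic differentiation to obtain the Jacobians; here the model is linear, the geometry is assumed known in the online part, and all numbers are synthetic.

```python
import numpy as np

rng = np.random.default_rng(2)

# synthetic contact data: measured normal force f = k_true * (x - x0_true) + noise
k_true, x0_true = 800.0, 0.02                  # stiffness [N/m], contact location [m]
x = np.linspace(0.02, 0.03, 200)               # penetration depths while in contact
f = k_true * (x - x0_true) + rng.normal(0.0, 0.5, x.size)

# offline fit: the model is linear in (k, -k*x0), so ordinary least squares recovers both
A = np.column_stack([x, np.ones_like(x)])
(k_hat, b_hat), *_ = np.linalg.lstsq(A, f, rcond=None)
x0_hat = -b_hat / k_hat
print(f"offline fit: k = {k_hat:.1f} N/m, x0 = {x0_hat * 1000:.2f} mm")

# online stiffness estimate with a scalar Kalman-style observer (contact location assumed known)
k_est, P, R = 100.0, 1e6, 0.25                 # initial guess, covariance, measurement noise variance
for xi, fi in zip(x, f):
    H = xi - x0_true                           # observation Jacobian d f / d k
    S = H * P * H + R
    K = P * H / S                              # Kalman gain
    k_est += K * (fi - k_est * H)              # innovation update
    P = (1.0 - K * H) * P
print(f"online estimate: k = {k_est:.1f} N/m")
```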

Variational inequalities are a formalism that includes games, minimization, saddle point, and equilibrium problems as special cases. Methods for variational inequalities are therefore universal approaches for many applied tasks, including machine learning problems. This work concentrates on the decentralized setting, which is increasingly important but not well understood. In particular, we consider decentralized stochastic (sum-type) variational inequalities over fixed and time-varying networks. We present lower complexity bounds for both communication and local iterations and construct optimal algorithms that match these lower bounds. Our algorithms are the best among the available literature not only in the decentralized stochastic case, but also in the decentralized deterministic and non-distributed stochastic cases. Experimental results confirm the effectiveness of the presented algorithms.
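
To make the setting concrete, the sketch below runs a plain decentralized extragradient step for a sum-type monotone variational inequality over a ring gossip network, using quadratic saddle-point data. It is only a toy: a single gossip round per iteration reaches a neighborhood of the solution whose size depends on the step size and network connectivity, whereas the optimal algorithms constructed in the work do better.

```python
import numpy as np

rng = np.random.default_rng(3)
n_nodes, d, gamma = 6, 4, 0.1

# each node holds a local convex-concave saddle-point term
#   f_i(x, y) = 0.5||x||^2 + x^T B_i y - 0.5||y||^2 + a_i^T x - b_i^T y
B = rng.normal(0, 0.3, (n_nodes, d, d))
a = rng.normal(0, 1.0, (n_nodes, d))
b = rng.normal(0, 1.0, (n_nodes, d))

def F(i, x, y):
    """Monotone operator of node i: (grad_x f_i, -grad_y f_i)."""
    return np.concatenate([x + B[i] @ y + a[i], -B[i].T @ x + y + b[i]])

# symmetric doubly-stochastic gossip matrix of a ring
W = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    W[i, i] = 0.5
    W[i, (i - 1) % n_nodes] = 0.25
    W[i, (i + 1) % n_nodes] = 0.25

z = rng.normal(0, 1.0, (n_nodes, 2 * d))          # local copies (x_i, y_i)
for t in range(2000):
    mixed = W @ z                                  # one gossip (communication) round
    z_half = mixed - gamma * np.stack([F(i, z[i, :d], z[i, d:]) for i in range(n_nodes)])
    z = mixed - gamma * np.stack([F(i, z_half[i, :d], z_half[i, d:]) for i in range(n_nodes)])

z_bar = z.mean(axis=0)
consensus_err = np.linalg.norm(z - z_bar)
avg_operator = np.mean([F(i, z_bar[:d], z_bar[d:]) for i in range(n_nodes)], axis=0)
print(f"consensus error {consensus_err:.2e}, ||average operator at z_bar|| = {np.linalg.norm(avg_operator):.2e}")
```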

We consider distributed stochastic variational inequalities (VIs) on unbounded domains with heterogeneous (non-IID) problem data distributed across many devices. We make a very general assumption on the computational network that, in particular, covers the settings of fully decentralized calculations with time-varying networks and centralized topologies commonly used in Federated Learning. Moreover, multiple local updates can be made on the workers to reduce the communication frequency between them. We extend the stochastic extragradient method to this very general setting and theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone (when a Minty solution exists) settings. The provided rates explicitly exhibit the dependence on network characteristics (e.g., mixing time), iteration counter, data heterogeneity, variance, number of devices, and other standard parameters. As a special case, our method and analysis apply to distributed stochastic saddle-point problems (SPPs), e.g., to the training of Deep Generative Adversarial Networks (GANs), for which decentralized training has been reported to be extremely challenging. In experiments on the decentralized training of GANs, we demonstrate the effectiveness of our proposed approach.
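
The sketch below isolates the local-updates mechanism in the centralized (Federated Learning) topology: each worker runs a few extragradient steps on its own heterogeneous quadratic saddle data before the server averages the models. The problem data and constants are illustrative, and the printed distance to the solution of the average VI simply makes the heterogeneity-induced drift of local updates visible; it is not the paper's algorithm or analysis.

```python
import numpy as np

rng = np.random.default_rng(4)
n_workers, d, K, rounds, gamma = 8, 3, 5, 300, 0.05

# heterogeneous (non-IID) local saddle-point data:
#   f_m(x, y) = 0.5||x||^2 + x^T B_m y - 0.5||y||^2 + a_m^T x - b_m^T y
B = rng.normal(0, 0.3, (n_workers, d, d))
a = rng.normal(0, 1.0, (n_workers, d))
b = rng.normal(0, 1.0, (n_workers, d))

def F(m, z):
    """Monotone operator of worker m."""
    x, y = z[:d], z[d:]
    return np.concatenate([x + B[m] @ y + a[m], -B[m].T @ x + y + b[m]])

# reference solution of the *average* VI from its linear optimality system
J = np.block([[np.eye(d), B.mean(0)], [-B.mean(0).T, np.eye(d)]])
z_star = np.linalg.solve(J, -np.concatenate([a.mean(0), b.mean(0)]))

z_global = np.zeros(2 * d)
for r in range(rounds):
    locals_ = []
    for m in range(n_workers):
        z = z_global.copy()
        for _ in range(K):                        # K local extragradient steps, no communication
            z_half = z - gamma * F(m, z)
            z = z - gamma * F(m, z_half)
        locals_.append(z)
    z_global = np.mean(locals_, axis=0)           # one communication round: average the models
print(f"distance to the solution of the average VI: {np.linalg.norm(z_global - z_star):.3f}")
```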

Federated Learning (FL) is a machine learning framework that enables multiple distributed devices to cooperatively train a shared model coordinated by a central server while keeping private data local. However, non-independent-and-identically-distributed (Non-IID) data samples and frequent communication across participants may significantly slow down convergence and increase communication costs. To achieve fast convergence, we ameliorate the conventional local updating rule by introducing the aggregated gradients at each local update epoch, and propose an adaptive learning rate algorithm that further takes the deviation between the local and global parameters into consideration. The above adaptive learning rate design requires all clients' local information, including the local parameters and gradients, which is challenging because there is no communication during the local update epochs. To obtain a decentralized adaptive learning rate for each client, we utilize the mean field approach by introducing two mean field terms that estimate the average local parameters and gradients, respectively, without requiring the clients to exchange their local information with each other at each local epoch. Numerical results show that our proposed framework is superior to state-of-the-art FL schemes in both model accuracy and convergence rate for IID and Non-IID datasets.
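
A loose sketch of the flavor of this idea is given below: each client adapts its local learning rate using the deviation between its local model and an estimate of the average model that requires no inter-client communication. Here that estimate is simply frozen at the last broadcast global model, whereas the paper constructs proper mean field terms for both parameters and gradients; the least-squares clients, update rule, and constants are all our own illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
n_clients, d, local_epochs, rounds = 5, 10, 5, 50
base_lr, kappa = 0.1, 1.0               # base learning rate, sensitivity to the local/global deviation

# non-IID least-squares clients: client k minimizes ||A_k w - y_k||^2 / (2 m)
A = rng.normal(0, 1, (n_clients, 40, d))
w_opt = rng.normal(0, 1, (n_clients, d)) + rng.normal(0, 1, d)   # client optima spread around a common point
y = np.einsum('kmd,kd->km', A, w_opt)

def grad(k, w):
    return A[k].T @ (A[k] @ w - y[k]) / A[k].shape[0]

w_global = np.zeros(d)
for r in range(rounds):
    new_models, mean_field = [], w_global.copy()   # mean field estimate of the average local model
    for k in range(n_clients):
        w = w_global.copy()
        for _ in range(local_epochs):
            # learning rate shrinks as the local model drifts away from the mean field estimate
            lr = base_lr / (1.0 + kappa * np.linalg.norm(w - mean_field))
            w -= lr * grad(k, w)
        new_models.append(w)
    w_global = np.mean(new_models, axis=0)          # server aggregation

avg_loss = np.mean([0.5 * np.mean((A[k] @ w_global - y[k])**2) for k in range(n_clients)])
print(f"average client loss after {rounds} rounds: {avg_loss:.4f}")
```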

We introduce a decentralized mechanism for pricing and exchanging alternatives constrained by transaction costs. We characterize the time-invariant solutions of a heat equation involving a (weighted) Tarski Laplacian operator, defined for max-plus matrix-weighted graphs, as approximate equilibria of the trading system. We study algebraic properties of the solution sets as well as the convergence behavior of the dynamical system. We apply these tools to the "economic problem" of allocating scarce resources among competing uses. Our theory suggests that differences among competitive equilibrium, bargaining, and cost-benefit analysis, depending on the context, are largely due to differences in the way that transaction costs are incorporated into the decision-making process. We present numerical simulations of the synchronization algorithm (RRAggU), demonstrating our theoretical findings.
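
The Tarski Laplacian and the RRAggU algorithm are not reproduced here; the toy below only illustrates the min-plus flavor of price consistency under transaction costs. Iterating p_i ← min(p_i, min_j (p_j + c_ij)), a tropical (Bellman-Ford style) relaxation, reaches a price vector in which no alternative can be obtained more cheaply through a chain of trades.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 8
# transaction cost of obtaining alternative i via alternative j (zero on the diagonal)
cost = rng.uniform(0.5, 3.0, (n, n))
np.fill_diagonal(cost, 0.0)

prices = rng.uniform(5.0, 20.0, n)            # initial quoted prices
for sweep in range(n):                        # n synchronous sweeps suffice, as in Bellman-Ford
    updated = np.minimum(prices, (prices[None, :] + cost).min(axis=1))
    if np.allclose(updated, prices):
        break
    prices = updated

# at the fixed point, p_i <= p_j + c_ij for all pairs: no profitable chain of trades remains
assert np.all(prices[:, None] <= prices[None, :] + cost + 1e-12)
print(np.round(prices, 2))
```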

When is heterogeneity in the composition of an autonomous robotic team beneficial and when is it detrimental? We investigate and answer this question in the context of a minimally viable model that examines the role of heterogeneous speeds in perimeter defense problems, where defenders share a total allocated speed budget. We consider two distinct problem settings and develop strategies based on dynamic programming and on local interaction rules. We present a theoretical analysis of both approaches and our results are extensively validated using simulations. Interestingly, our results demonstrate that the viability of heterogeneous teams depends on the amount of information available to the defenders. Moreover, our results suggest a universality property: across a wide range of problem parameters the optimal ratio of the speeds of the defenders remains nearly constant.

Effective multi-robot teams require the ability to move to goals in complex environments in order to address real-world applications such as search and rescue. Multi-robot teams should be able to operate in a completely decentralized manner, with individual robot team members capable of acting without explicit communication with neighbors. In this paper, we propose a novel game-theoretic model that enables decentralized and communication-free navigation to a goal position. Each robot plays its own distributed game by estimating the behavior of its local teammates in order to identify actions that move it toward the goal while avoiding obstacles and maintaining team cohesion without collisions. We prove theoretically that the generated actions approach a Nash equilibrium, which also corresponds to an optimal strategy identified for each robot. We show through extensive simulations that our approach enables decentralized and communication-free navigation by a multi-robot system to a goal position, and is able to avoid obstacles and collisions, maintain connectivity, and respond robustly to sensor noise.
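
A toy best-response sketch of this kind of decentralized, communication-free navigation is given below: each robot assumes its teammates hold their current positions, scores a small discrete action set against a local cost mixing goal progress, obstacle clearance, cohesion, and collision avoidance, and plays the minimizer. The cost weights, action set, and scenario are our own illustration, not the paper's game or its Nash-convergence argument.

```python
import numpy as np

rng = np.random.default_rng(7)
n_robots, dt, steps = 4, 0.1, 400
goal = np.array([5.0, 5.0])
obstacle, obs_radius = np.array([2.5, 2.5]), 0.8
# candidate actions: unit-speed headings in 8 directions
actions = np.array([[np.cos(a), np.sin(a)] for a in np.linspace(0, 2 * np.pi, 8, endpoint=False)])

def cost(pos_i, others):
    """Local cost: progress to goal, obstacle clearance, cohesion, collision avoidance."""
    c = np.linalg.norm(pos_i - goal)
    c += 5.0 * max(0.0, obs_radius + 0.2 - np.linalg.norm(pos_i - obstacle))
    dists = np.linalg.norm(others - pos_i, axis=1)
    c += 0.1 * dists.mean()                       # stay cohesive with teammates
    c += 5.0 * np.sum(dists < 0.3)                # penalize near-collisions
    return c

pos = rng.uniform(0.0, 1.0, (n_robots, 2))
for _ in range(steps):
    new_pos = pos.copy()
    for i in range(n_robots):
        others = np.delete(pos, i, axis=0)        # teammates assumed to hold their positions
        candidates = pos[i] + dt * actions
        new_pos[i] = candidates[np.argmin([cost(c, others) for c in candidates])]
    pos = new_pos                                  # simultaneous play
print("final distances to goal:", np.round(np.linalg.norm(pos - goal, axis=1), 2))
```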
