
This paper considers the linear-quadratic dual control problem, in which the system parameters must be identified while the control objective is optimized at the same time. In contrast to existing work on data-driven linear-quadratic regulation, which typically provides error or regret bounds that hold only with a certain probability, we propose an online algorithm that guarantees the asymptotic optimality of the controller in the almost sure sense. Our dual control strategy consists of two parts: a switched controller with time-decaying exploration noise, and Markov parameter inference based on the cross-correlation between the exploration noise and the system output. Central to the almost sure performance guarantee is a safe switched control strategy that falls back to a known conservative but stabilizing controller whenever the actual state deviates significantly from the target state. We prove that this switching strategy prevents any potentially destabilizing controller from being applied, while the performance gap between our switching strategy and the optimal linear state feedback is exponentially small. Under our dual control scheme, the parameter inference error scales as $O(T^{-1/4+\epsilon})$, while the suboptimality gap of the control performance scales as $O(T^{-1/2+\epsilon})$, where $T$ is the number of time steps and $\epsilon$ is an arbitrarily small positive number. Simulation results on an industrial process example illustrate the effectiveness of the proposed strategy.
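
As a rough illustration of the two ingredients, the sketch below runs a scalar toy plant with a time-decaying exploration signal and a safe fall-back gain, then estimates the first Markov parameter from the cross-correlation between the injected noise and the state. All numerical values (the gains, the threshold, the decay exponent) are illustrative assumptions, not the paper's choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy scalar plant x_{t+1} = a*x_t + b*u_t + w_t with (a, b) unknown.
a_true, b_true = 0.9, 1.0
K_safe = -0.5   # known conservative stabilizing gain (illustrative)
K_hat = -0.8    # current certainty-equivalent gain (illustrative)
x_max = 10.0    # switching threshold on the state magnitude
T = 20000

x = 0.0
xs, es = np.zeros(T), np.zeros(T)
for t in range(T):
    e = (t + 1) ** (-0.25) * rng.standard_normal()  # time-decaying exploration
    # Safe switching: fall back to the conservative gain far from the target.
    K = K_safe if abs(x) > x_max else K_hat
    u = K * x + e
    xs[t], es[t] = x, e
    x = a_true * x + b_true * u + 0.1 * rng.standard_normal()

# Markov parameter inference via cross-correlation: x_{t+1} depends on e_t
# only through b, so sum(x_{t+1} * e_t) / sum(e_t^2) estimates b.
b_hat = np.dot(xs[1:], es[:-1]) / np.dot(es[:-1], es[:-1])
print(f"estimated b = {b_hat:.3f} (true b = {b_true:.3f})")
```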

Related Content

We consider the problem of controlling a stochastic linear system with quadratic costs when its system parameters are not known to the agent -- the adaptive LQG control problem. We re-examine an approach called "Reward-Biased Maximum Likelihood Estimate" (RBMLE) that was proposed more than forty years ago, and which predates the "Upper Confidence Bound" (UCB) method as well as the definition of "regret". It simply adds a term favoring parameters with larger rewards to the estimation criterion. We propose an augmented approach that combines the penalty of the RBMLE method with the constraint of the UCB method, uniting the two approaches to optimization in the face of uncertainty. We first establish that this method theoretically retains $\mathcal{O}(\sqrt{T})$ regret, the best bound known so far. We then show through a comprehensive simulation study that this augmented RBMLE method considerably outperforms the UCB and Thompson sampling approaches, with a regret that is typically less than 50\% of the better of their regrets. The simulation study includes all examples from earlier papers as well as a large collection of randomly generated systems.
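
On a scalar toy system, the reward-biasing idea can be rendered as follows: over a grid of candidate parameters, the least-squares criterion is penalized in favor of parameters whose optimal LQ cost is smaller (i.e., whose reward is larger), subject to a crude UCB-style confidence-set constraint. The bias weight `alpha`, the confidence radius, and the grid are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Data from a scalar plant x_{t+1} = theta*x_t + u_t + w_t, theta unknown.
theta_true, T = 0.7, 200
x, u = np.zeros(T + 1), rng.standard_normal(T)
for t in range(T):
    x[t + 1] = theta_true * x[t] + u[t] + 0.3 * rng.standard_normal()

grid = np.linspace(-1.5, 1.5, 601)
# Least-squares criterion (negative log-likelihood up to constants).
nll = np.array([np.sum((x[1:] - th * x[:-1] - u) ** 2) for th in grid])

def opt_cost(th, q=1.0, r=1.0):
    """Optimal LQ cost for x_{t+1} = th*x + u via the scalar Riccati fixpoint."""
    p = q
    for _ in range(500):
        p = q + th**2 * p - (th * p) ** 2 / (r + p)
    return p

J = np.array([opt_cost(th) for th in grid])
alpha = np.log(T)                            # bias weight, grows slowly with T
in_ucb = nll <= nll.min() + 2 * np.log(T)    # crude confidence-set constraint
# Reward bias: penalize high-cost (low-reward) parameters inside the set.
score = np.where(in_ucb, nll + alpha * J, np.inf)
print(f"RBMLE: {grid[np.argmin(score)]:.3f}, MLE: {grid[np.argmin(nll)]:.3f}")
```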

We study the problem of constructing the control driving a controlled differential equation from discrete observations of the response. By restricting the control to the space of piecewise linear paths, we identify the assumptions that ensure uniqueness. The main contribution of this paper is a novel numerical algorithm for constructing the piecewise linear control that converges uniformly in time. Uniform convergence is needed for many applications, and it is achieved by approaching the problem through the signature representation of the paths, which allows us to work with the whole path simultaneously.
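
The paper's algorithm operates through path signatures; as a far simpler warm-up showing why discrete observations of the response can pin down a piecewise linear control, consider the scalar linear equation $dY = Y\,dX$, for which the increments of $X$ are recovered exactly from the observed $Y$.

```python
import numpy as np

# For dY = Y dX with piecewise linear X, Y(t_{k+1}) = Y(t_k)*exp(dX_k), so
# the increments of the control are recoverable from discrete observations.
t = np.linspace(0.0, 1.0, 11)
dX_true = np.sin(3 * t[:-1]) * np.diff(t)                 # control increments
Y = np.concatenate([[1.0], np.exp(np.cumsum(dX_true))])   # observed response

dX_rec = np.log(Y[1:] / Y[:-1])        # reconstructed increments
print(np.allclose(dX_rec, dX_true))    # True: the control is unique here
```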

Federated learning (FL) has attracted much attention as a privacy-preserving distributed machine learning framework, where many clients collaboratively train a machine learning model by exchanging model updates with a parameter server instead of sharing their raw data. Nevertheless, FL training suffers from slow convergence and unstable performance due to stragglers caused by the heterogeneous computational resources of clients and fluctuating communication rates. This paper proposes a coded FL framework, namely *stochastic coded federated learning* (SCFL), to mitigate the straggler issue. In the proposed framework, each client generates a privacy-preserving coded dataset by adding noise to a random linear combination of its local data. The server collects the coded datasets from all the clients to construct a composite dataset, which helps to compensate for the straggling effect. In the training process, both the server and the clients perform mini-batch stochastic gradient descent (SGD), and the server adds a make-up term in model aggregation to obtain unbiased gradient estimates. We characterize the privacy guarantee via mutual information differential privacy (MI-DP) and analyze the convergence performance of the federated learning procedure. Furthermore, we demonstrate a privacy-performance tradeoff of the proposed SCFL method by analyzing the influence of the privacy constraint on the convergence rate. Finally, numerical experiments corroborate our analysis and show the benefits of SCFL in achieving fast convergence while preserving data privacy.
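
A minimal sketch of the coded-dataset generation step for a single client, assuming Gaussian mixing weights and Gaussian privacy noise (the particular distributions and the helper name `make_coded_dataset` are illustrative, not prescribed by the paper).

```python
import numpy as np

rng = np.random.default_rng(2)

def make_coded_dataset(X, y, n_coded, noise_std):
    """Coded dataset: random linear combinations of the local samples plus
    additive Gaussian noise (distributions are illustrative choices)."""
    n = X.shape[0]
    G = rng.standard_normal((n_coded, n)) / np.sqrt(n)   # random mixing matrix
    X_coded = G @ X + noise_std * rng.standard_normal((n_coded, X.shape[1]))
    y_coded = G @ y + noise_std * rng.standard_normal(n_coded)
    return X_coded, y_coded

# Each client uploads a coded version of its local data to the server once.
X_local = rng.standard_normal((100, 5))
y_local = X_local @ np.ones(5) + 0.1 * rng.standard_normal(100)
X_c, y_c = make_coded_dataset(X_local, y_local, n_coded=20, noise_std=0.5)
print(X_c.shape, y_c.shape)   # (20, 5) (20,)
```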

Owing to their superior modeling capabilities, gated Recurrent Neural Networks, such as Gated Recurrent Units (GRUs) and Long Short-Term Memory networks (LSTMs), have become popular tools for learning dynamical systems. This paper discusses how these networks can be adopted for the synthesis of Internal Model Control (IMC) architectures. To this end, a gated recurrent network is first used to learn a model of the unknown input-output stable plant. Then, a controller gated recurrent network is trained to approximate the model inverse. The stability of these networks, ensured by means of a suitable training procedure, allows us to guarantee input-output closed-loop stability. The proposed scheme is able to cope with the saturation of the control variables and can be deployed on low-power embedded controllers, as it requires limited online computation. The approach is then tested on the Quadruple Tank benchmark system and compared against alternative control laws, resulting in remarkable closed-loop performance.
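
A sketch of the IMC signal flow: only the plant-model mismatch is fed back, and the control is saturated. Simple first-order linear filters stand in for the trained GRU/LSTM model and controller so the loop runs as-is; all coefficients are illustrative.

```python
import numpy as np

def make_filter(a, b):
    """First-order filter x <- a*x + b*u, returned as a stateful step function."""
    state = {"x": 0.0}
    def step(u):
        state["x"] = a * state["x"] + b * u
        return state["x"]
    return step

plant = make_filter(0.9, 0.1)          # unknown stable plant (DC gain 1)
model = make_filter(0.85, 0.12)        # learned, imperfect model (DC gain 0.8)
controller = make_filter(0.5, 0.625)   # learned approximate model inverse
u_sat = 2.0                            # control saturation limit

r, y, y_model = 1.0, 0.0, 0.0
for _ in range(200):
    e = r - (y - y_model)              # IMC: feed back only the model mismatch
    u = np.clip(controller(e), -u_sat, u_sat)
    y, y_model = plant(u), model(u)
print(f"output after 200 steps: {y:.3f} (setpoint {r})")
```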

We demonstrate the effectiveness of an adaptive explicit Euler method for the approximate solution of the Cox-Ingersoll-Ross model. This relies on a class of path-bounded timestepping strategies which work by reducing the stepsize as solutions approach a neighbourhood of zero. The method is hybrid in the sense that a convergent backstop method is invoked if the timestep becomes too small, or to prevent solutions from overshooting zero and becoming negative. Under parameter constraints that imply Feller's condition, we prove that such a scheme is strongly convergent, of order at least 1/2. Control of the strong error is important for multi-level Monte Carlo techniques. Under Feller's condition we also prove that the probability of ever needing the backstop method to prevent a negative value can be made arbitrarily small. Numerically, we compare this adaptive method to fixed step implicit and explicit schemes, and a novel semi-implicit adaptive variant. We observe that the adaptive approach leads to methods that are competitive in a domain that extends beyond Feller's condition, indicating suitability for the modelling of stochastic volatility in Heston-type asset models.
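
A sketch of the hybrid scheme for the CIR model $dX_t=\kappa(\theta-X_t)\,dt+\sigma\sqrt{X_t}\,dW_t$: the timestep shrinks as the solution approaches zero, and a positivity-preserving backstop step is invoked when the step floor is hit or the explicit step would overshoot zero. The drift-implicit backstop and all constants here are illustrative choices, not the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(3)

# CIR: dX = kappa*(theta - X) dt + sigma*sqrt(X) dW; Feller: 2*kappa*theta >= sigma^2.
kappa, theta, sigma, X0, T = 1.5, 0.06, 0.3, 0.04, 1.0
h_max, h_min, rho = 1e-2, 1e-6, 4.0   # rho sets how fast steps shrink near zero

t, X = 0.0, X0
while t < T:
    # Path-bounded strategy: reduce the stepsize as X approaches zero.
    h = min(h_max, max(h_min, (X / rho) ** 2), T - t)
    dW = np.sqrt(h) * rng.standard_normal()
    X_new = X + kappa * (theta - X) * h + sigma * np.sqrt(max(X, 0.0)) * dW
    if h <= h_min or X_new <= 0.0:
        # Backstop: a positivity-preserving drift-implicit step (illustrative).
        X_new = (X + kappa * theta * h) / (1.0 + kappa * h)
    t, X = t + h, X_new
print(f"X(T) = {X:.5f}")
```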

Humanoid robots often suffer significant impact forces when walking or running in non-predefined environments, which can easily damage high-stiffness actuators. In recent years, the use of passive, compliant series elastic actuators (SEAs) for driving a humanoid's joints has proved capable in many respects. However, despite being widely applied in biped robot research, stable control of a humanoid powered by SEAs, especially during walking, remains a challenge. This paper proposes a model reference adaptive control (MRAC) scheme combined with a backstepping algorithm to deal with the parameter uncertainties in a humanoid's lower limb driven by an SEA system; this extends our previous research (Lanh et al., 2021). First, a dynamic model of the SEA is obtained. Second, since there are unknown and uncertain parameters in the SEA model, an MRAC controller is employed to guarantee the robust performance of the humanoid's lower limb. Finally, an experiment is carried out to evaluate the effectiveness of the proposed controller and the SEA mechanism.
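
A scalar MRAC sketch with the Lyapunov adaptation rule, standing in for the full MRAC-plus-backstepping design on the SEA dynamics; the plant, reference model, and gains are illustrative assumptions.

```python
import numpy as np

# Plant dx/dt = a*x + b*u with (a, b) unknown, b > 0;
# reference model dxm/dt = -am*xm + bm*r.
a, b = 1.0, 2.0        # unknown to the controller
am, bm = 4.0, 4.0      # reference-model parameters
gamma = 10.0           # adaptation gain
dt, T = 1e-3, 8.0

x, xm, th1, th2 = 0.0, 0.0, 0.0, 0.0
for k in range(int(T / dt)):
    r = 1.0 if (k * dt) % 4.0 < 2.0 else -1.0   # square-wave reference
    u = th1 * r + th2 * x
    e = x - xm
    # Lyapunov-rule updates (sign of b assumed known and positive).
    th1 -= gamma * e * r * dt
    th2 -= gamma * e * x * dt
    x += (a * x + b * u) * dt
    xm += (-am * xm + bm * r) * dt
print(f"tracking error at T: {x - xm:.4f}")
```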

Since the 1960s, Mastermind has been studied for the combinatorial and information-theoretic interest the game has to offer. Many results have been discovered, starting with Erd\H{o}s and R\'enyi determining the optimal number of queries needed for two colors. For $k$ colors and $n$ positions, Chv\'atal found asymptotically optimal bounds when $k \le n^{1-\epsilon}$. Following a sequence of gradual improvements for $k \geq n$ colors, the central open question was to resolve the gap between $\Omega(n)$ and $\mathcal{O}(n\log \log n)$ for $k=n$. In this paper, we resolve this gap by presenting the first algorithm for solving $k=n$ Mastermind with a linear number of queries. As a consequence, we are able to determine the query complexity of Mastermind for any parameters $k$ and $n$.
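
For concreteness, the feedback oracle that every query consults: black pegs count positions with the right color in the right place, white pegs count right colors in wrong places (with multiplicity).

```python
from collections import Counter

def feedback(secret, guess):
    """Mastermind feedback: black = right color in the right position;
    white = right color in the wrong position, counted with multiplicity."""
    black = sum(s == g for s, g in zip(secret, guess))
    common = sum((Counter(secret) & Counter(guess)).values())
    return black, common - black

print(feedback((1, 2, 3, 4), (2, 1, 3, 3)))   # (1, 2)
```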

In the standard Gaussian linear measurement model $Y=X\mu_0+\xi \in \mathbb{R}^m$ with a fixed noise level $\sigma>0$, we consider the problem of estimating the unknown signal $\mu_0$ under a convex constraint $\mu_0 \in K$, where $K$ is a closed convex set in $\mathbb{R}^n$. We show that the risk of the natural convex constrained least squares estimator (LSE) $\hat{\mu}(\sigma)$ can be characterized exactly in high dimensional limits, by that of the convex constrained LSE $\hat{\mu}_K^{\mathsf{seq}}$ in the corresponding Gaussian sequence model at a different noise level. The characterization holds (uniformly) for risks in the maximal regime that ranges from constant order all the way down to essentially the parametric rate, as long as a certain necessary non-degeneracy condition is satisfied for $\hat{\mu}(\sigma)$. The precise risk characterization reveals a fundamental difference between noiseless (or low noise limit) and noisy linear inverse problems in terms of the sample complexity for signal recovery. A concrete example is given by the isotonic regression problem: while exact recovery of a general monotone signal requires $m\gg n^{1/3}$ samples in the noiseless setting, consistent signal recovery in the noisy setting requires as few as $m\gg \log n$ samples. Such a discrepancy occurs when the low and high noise risk behavior of $\hat{\mu}_K^{\mathsf{seq}}$ differ significantly. In statistical language, this occurs when $\hat{\mu}_K^{\mathsf{seq}}$ estimates $0$ at a faster `adaptation rate' than the slower `worst-case rate' for general signals. Several other examples, including non-negative least squares and the generalized Lasso (in constrained form), are also worked out to demonstrate the concrete applicability of the theory to problems of different types.
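
For the isotonic example, the sequence-model benchmark $\hat{\mu}_K^{\mathsf{seq}}$ is plain isotonic regression, computable exactly by the pool-adjacent-violators algorithm, e.g. via scikit-learn.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(4)

n = 500
mu0 = np.linspace(0.0, 1.0, n)            # a general monotone signal
y = mu0 + 0.5 * rng.standard_normal(n)    # Gaussian sequence model observation
# Constrained LSE over the monotone cone = isotonic regression (PAVA).
mu_hat = IsotonicRegression().fit_transform(np.arange(n), y)
print(f"empirical risk: {np.mean((mu_hat - mu0) ** 2):.4f}")
```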

We study the problem of learning in the stochastic shortest path (SSP) setting, where an agent seeks to minimize the expected cost accumulated before reaching a goal state. We design a novel model-based algorithm EB-SSP that carefully skews the empirical transitions and perturbs the empirical costs with an exploration bonus to guarantee both optimism and convergence of the associated value iteration scheme. We prove that EB-SSP achieves the minimax regret rate $\widetilde{O}(B_{\star} \sqrt{S A K})$, where $K$ is the number of episodes, $S$ is the number of states, $A$ is the number of actions and $B_{\star}$ bounds the expected cumulative cost of the optimal policy from any state, thus closing the gap with the lower bound. Interestingly, EB-SSP obtains this result while being parameter-free, i.e., it does not require any prior knowledge of $B_{\star}$, nor of $T_{\star}$ which bounds the expected time-to-goal of the optimal policy from any state. Furthermore, we illustrate various cases (e.g., positive costs, or general costs when an order-accurate estimate of $T_{\star}$ is available) where the regret only contains a logarithmic dependence on $T_{\star}$, thus yielding the first horizon-free regret bound beyond the finite-horizon MDP setting.
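
A simplified sketch of the core computation: value iteration on an empirical SSP model whose costs are perturbed downward by an exploration bonus to induce optimism. EB-SSP's actual transition skewing and bonus are more refined; the bonus form, `beta`, and the tiny chain MDP below are illustrative.

```python
import numpy as np

def optimistic_vi(P_hat, c_hat, n_visits, goal, beta=0.5, iters=200):
    """Value iteration on an empirical SSP model, with costs perturbed
    downward by an exploration bonus to induce optimism (a simplified
    stand-in for EB-SSP's skewed transitions and perturbed costs)."""
    V = np.zeros(c_hat.shape[0])
    bonus = beta / np.sqrt(np.maximum(1, n_visits))
    for _ in range(iters):
        Q = np.maximum(c_hat - bonus, 0.0) + P_hat @ V  # optimistic backup
        Q[goal] = 0.0                                   # goal state is cost-free
        V = Q.min(axis=1)
    return V, Q.argmin(axis=1)

# Tiny 3-state chain with goal state 2; uniform fake visit counts.
S, A, goal = 3, 2, 2
P_hat = np.zeros((S, A, S))
P_hat[0, 0, 1] = P_hat[0, 1, 0] = 1.0   # action 0 advances, action 1 stays
P_hat[1, 0, 2] = P_hat[1, 1, 0] = 1.0   # action 0 reaches the goal
P_hat[2, :, 2] = 1.0                    # goal is absorbing
c_hat = np.ones((S, A))
n_visits = 10 * np.ones((S, A))
V, pi = optimistic_vi(P_hat, c_hat, n_visits, goal)
print(V, pi)   # V[goal] = 0; pi advances toward the goal
```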

We propose accelerated randomized coordinate descent algorithms for stochastic optimization and online learning. Our algorithms have significantly lower per-iteration complexity than the known accelerated gradient algorithms. The proposed algorithms for online learning achieve better regret performance than the known randomized online coordinate descent algorithms, while the proposed algorithms for stochastic optimization exhibit convergence rates as good as those of the best known randomized coordinate descent algorithms. We also present simulation results to demonstrate the performance of the proposed algorithms.
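
As a sketch of the flavor of such methods, below is one standard accelerated randomized coordinate descent scheme (in the style of APPROX/APCG, not necessarily this paper's exact algorithm) applied to a quadratic; each iteration touches a single randomly chosen coordinate.

```python
import numpy as np

rng = np.random.default_rng(5)

# Quadratic objective f(x) = 0.5*x'Ax - b'x with coordinate Lipschitz L_i = A_ii.
n = 50
M = rng.standard_normal((n, n))
A = M.T @ M / n + 0.01 * np.eye(n)
b = rng.standard_normal(n)
L = np.diag(A).copy()

x = np.zeros(n)
z = x.copy()
theta = 1.0 / n
for _ in range(20000):
    y = (1 - theta) * x + theta * z
    i = rng.integers(n)
    g_i = A[i] @ y - b[i]                 # i-th partial derivative, O(n) work
    dz = -g_i / (n * theta * L[i])
    z[i] += dz                            # coordinate step on the z-sequence
    x = y.copy()
    x[i] += n * theta * dz                # momentum-coupled x-update
    theta = (np.sqrt(theta**4 + 4 * theta**2) - theta**2) / 2

f = lambda v: 0.5 * v @ A @ v - b @ v
print(f"suboptimality: {f(x) - f(np.linalg.solve(A, b)):.2e}")
```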
