This paper presents an approach for trajectory-centric learning control based on contraction metrics and disturbance estimation for nonlinear systems subject to matched uncertainties. The approach allows for the use of a broad class of model learning tools, including deep neural networks, to learn uncertain dynamics while still providing guarantees of transient tracking performance throughout the learning phase, including the special case of no learning. Within the proposed approach, a disturbance estimation law is introduced to estimate the pointwise value of the uncertainty, with pre-computable estimation error bounds (EEBs). The learned dynamics, the estimated disturbances, and the EEBs are then incorporated in a robust Riemannian energy condition to compute the control law, which guarantees exponential convergence of actual trajectories to desired ones throughout the learning phase, even when the learned model is poor. On the other hand, as its accuracy improves, the learned model can be incorporated in a high-level planner to plan better trajectories with improved performance, e.g., lower energy consumption and shorter travel time. The proposed framework is validated on a planar quadrotor navigation example.
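To make the estimation step concrete, below is a minimal sketch of a pointwise disturbance estimator for a control-affine system $\dot{x} = f(x) + B(x)(u + d)$ with matched uncertainty $d$. The predictor-based structure and the names `x_pred`, `B`, `T`, `a` are illustrative assumptions, not the paper's exact estimation law or its EEBs.

```python
import numpy as np

# Hedged sketch of a pointwise disturbance estimator for a control-affine
# system x_dot = f(x) + B(x) @ (u + d) with matched uncertainty d. The
# predictor-based structure and the names B, T, a are illustrative
# assumptions, not the paper's exact estimation law.

def estimate_disturbance(x, x_pred, B, T, a=1.0):
    """One estimation step over a sampling interval T.

    x      : measured state at the current sample
    x_pred : state predicted by the nominal (disturbance-free) model
    B      : input matrix evaluated along the trajectory
    a      : estimator gain (larger -> faster but noisier estimates)
    """
    # Prediction error attributed to the matched disturbance channel.
    e = x - x_pred
    # Least-squares projection of the average error rate onto the input channel.
    return a * (np.linalg.pinv(B) @ e) / T
```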
The scope of this paper is the analysis and approximation of an optimal control problem related to the Allen-Cahn equation. A tracking functional is minimized subject to the Allen-Cahn equation using distributed controls that satisfy pointwise control constraints. First- and second-order necessary and sufficient conditions are proved. The lowest-order discontinuous Galerkin (in time) scheme is considered for the approximation of the control-to-state and adjoint-state mappings. Under a suitable restriction on the maximum size of the temporal and spatial discretization parameters $k$ and $h$, respectively, in terms of the parameter $\epsilon$ that describes the thickness of the interface layer, a priori estimates are proved with constants depending polynomially upon $1/\epsilon$. Unlike previous works for the uncontrolled Allen-Cahn problem, our approach does not rely on constructing an approximation of the spectral estimate, and as a consequence our estimates are valid under the low regularity assumptions imposed by the optimal control setting. These estimates are also valid in cases where the solution and its discrete approximation do not satisfy uniform space-time bounds independent of $\epsilon$. These estimates and a suitable localization technique, via the second-order condition (see \cite{Arada-Casas-Troltzsch_2002,Casas-Mateos-Troltzsch_2005,Casas-Raymond_2006,Casas-Mateos-Raymond_2007}), allow us to prove error estimates for the difference between local optimal controls and their discrete approximations, as well as between the associated state and adjoint-state variables and their discrete approximations.
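A representative formulation of such a problem reads as follows; the specific tracking functional, the Tikhonov term, and the constraint bounds are standard assumptions for this setting rather than the paper's exact choices:
\[
\min_{u_a \le u \le u_b \ \text{a.e.}} \; J(y,u) = \frac{1}{2}\int_0^T\!\!\int_\Omega |y - y_d|^2 \, dx \, dt + \frac{\alpha}{2}\int_0^T\!\!\int_\Omega |u|^2 \, dx \, dt,
\]
subject to the Allen-Cahn equation
\[
\partial_t y - \Delta y + \frac{1}{\epsilon^2}\,\bigl(y^3 - y\bigr) = u \quad \text{in } (0,T)\times\Omega,
\]
together with suitable boundary and initial conditions, where $y_d$ is the tracking target and $\epsilon$ is the interface-thickness parameter.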
We consider a hierarchical edge-cloud architecture in which services are provided to mobile users as chains of virtual network functions. Each service has specific computation requirements and target delay performance, which require placing the corresponding chain properly and allocating a suitable amount of computing resources. Furthermore, chain migration may be necessary to meet the services' target delay, or convenient to keep the service provisioning cost low. We tackle such issues by formalizing the problem of optimal chain placement and resource allocation in the edge-cloud continuum, taking into account migration, bandwidth, and computation costs. Specifically, we first design an algorithm that, leveraging resource augmentation, addresses the above problem and provides an upper bound on the amount of resources required to find a feasible solution. We use this algorithm as a building block to devise an efficient approach targeting the minimum-cost solution, while minimizing the required resource augmentation. Our results, obtained through trace-driven, large-scale simulations, show that our approach can find a feasible solution using half the amount of resources required by state-of-the-art alternatives.
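As a toy illustration of the resource-augmentation idea (node capacities are scaled by a factor while searching for a feasible placement), consider the greedy sketch below; the data structures, the greedy order, and the `augment` parameter are illustrative assumptions, not the paper's algorithm.

```python
# Hedged sketch of resource augmentation: capacities are multiplied by
# an `augment` factor while greedily searching for a feasible placement.
# All data structures and the greedy rule are illustrative assumptions.

def place_with_augmentation(chains, nodes, capacity, augment=1.0):
    """chains: list of chains, each a list of (vnf, cpu_demand) pairs;
    capacity: dict node -> CPU capacity. Returns a placement dict or None."""
    used = {n: 0.0 for n in nodes}
    placement = {}
    for chain in chains:
        for vnf, demand in chain:
            # Only nodes whose *augmented* capacity can still host the VNF.
            feasible = [n for n in nodes
                        if used[n] + demand <= augment * capacity[n]]
            if not feasible:
                return None  # infeasible even at this augmentation level
            node = min(feasible, key=lambda n: used[n])  # least-loaded first
            used[node] += demand
            placement[vnf] = node
    return placement
```

Running this with increasing `augment` values yields the kind of upper bound on the resources needed for feasibility that the augmented building block provides.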
We tackle the problem of flying time-optimal trajectories through multiple waypoints with quadrotors. State-of-the-art solutions split the problem into a planning task, where a global, time-optimal trajectory is generated, and a control task, where this trajectory is accurately tracked. However, generating a time-optimal trajectory that considers the full quadrotor model currently requires solving a difficult time-allocation problem via optimization, which is computationally demanding (on the order of minutes or even hours). This is detrimental for replanning in the presence of disturbances. We overcome this issue by solving the time-allocation problem and the control problem concurrently via Model Predictive Contouring Control (MPCC). Our MPCC optimally selects the future states of the platform at runtime, while maximizing the progress along the reference path and minimizing the distance to it. We show that, even when tracking simplified trajectories, the proposed MPCC results in a path that approaches the true time-optimal one and that can be generated in real time. We validate our approach in the real world, where we show that our method outperforms both the current state of the art and a world-class human pilot in terms of lap time, achieving speeds of up to 60 km/h.
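For intuition, here is a minimal sketch of an MPCC stage cost: the tracking error is split into a lag component (along the path) and a contour component (orthogonal to it), and progress along the path is rewarded. The weights `q_c`, `q_l`, `rho` and the interfaces `path_pos` / `path_tangent` are illustrative assumptions, not the paper's exact cost.

```python
import numpy as np

# Hedged sketch of an MPCC stage cost. Penalizing contour/lag errors while
# rewarding the progress rate v_theta is the standard MPCC trade-off; the
# weights and the path interfaces below are illustrative assumptions.

def mpcc_stage_cost(p, theta, v_theta, path_pos, path_tangent,
                    q_c=1.0, q_l=10.0, rho=0.1):
    """p: position; theta: progress variable along the reference path;
    v_theta: progress rate; path_pos(theta) / path_tangent(theta):
    reference point and unit tangent at arc length theta."""
    e = p - path_pos(theta)
    t = path_tangent(theta)
    e_lag = np.dot(e, t)       # error component along the path
    e_con = e - e_lag * t      # error component orthogonal to the path
    # Penalize contour and lag errors, reward progress along the path.
    return q_c * np.dot(e_con, e_con) + q_l * e_lag**2 - rho * v_theta
```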
Understanding the global dynamics of a robot controller, such as identifying attractors and their regions of attraction (RoA), is important for safe deployment and for synthesizing more effective hybrid controllers. This paper proposes a topological framework to analyze the global dynamics of robot controllers, even data-driven ones, in an effective and explainable way. It builds a combinatorial representation of the underlying system's state space and nonlinear dynamics, which is summarized in a directed acyclic graph, the Morse graph. The approach probes the dynamics, which must be a Lipschitz-continuous function, only locally by forward-propagating short trajectories over a state-space discretization. The framework is evaluated on both numerical and data-driven controllers for classical robotic benchmarks. It is compared against established analytical and recent machine learning alternatives for estimating the RoAs of such controllers, and is shown to outperform them in accuracy and efficiency. It also provides deeper insights, as it describes the global dynamics up to the discretization's resolution. This makes it possible to use the Morse graph to identify how to synthesize controllers that form improved hybrid solutions, or to identify the physical limitations of a robotic system.
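One way to realize such a combinatorial summary is sketched below: cover the state space with boxes, forward-propagate short trajectories from random samples in each box, record box-to-box transitions, and condense the strongly connected components into a DAG. The grid resolution, sample count, and the `step` map (a short rollout of the dynamics) are assumptions for illustration.

```python
import numpy as np
import networkx as nx

# Hedged sketch of a Morse-graph-like construction via the condensation
# of a box-to-box transition digraph. Resolution and sampling are
# illustrative choices, not the paper's exact discretization.

def morse_graph(step, lo, hi, n=20, samples=5, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    width = (hi - lo) / n

    def box(x):  # index of the grid cell containing x
        return tuple(np.clip(((x - lo) / width).astype(int), 0, n - 1))

    G = nx.DiGraph()
    for idx in np.ndindex(*([n] * len(lo))):
        for _ in range(samples):
            x = lo + (np.array(idx) + rng.random(len(lo))) * width
            G.add_edge(idx, box(step(x)))
    # Condensation: each SCC becomes one node of a directed acyclic graph.
    return nx.condensation(G)
```

In this summary, SCC nodes with no outgoing edges play the role of candidate attractors, and the cells that reach only one such node approximate its region of attraction.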
In experiments to estimate parameters of a parametric model, Bayesian experiment design allows measurement settings to be chosen based on utility, which is the predicted improvement of parameter distributions due to modeled measurement results. In this paper we compare information-theory-based utility with three alternative utility algorithms. Tests of these utility alternatives in simulated adaptive measurements demonstrate large improvements in computational speed with slight impacts on measurement efficiency.
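As a concrete reference point for the information-theory-based utility, the sketch below computes the expected information gain of a candidate setting, i.e., the mutual information between the modeled measurement outcome and the parameters, under a particle approximation of the parameter distribution. The `likelihood` interface and the discrete outcome set are assumptions for illustration.

```python
import numpy as np

# Hedged sketch of an information-gain utility under a particle
# approximation of the parameter posterior. Interfaces are assumptions.

def info_gain_utility(setting, particles, weights, likelihood, outcomes):
    """likelihood(y, theta, setting) -> probability of outcome y."""
    w = np.asarray(weights, float)
    w = w / w.sum()
    # p[i, j] = P(outcome j | particle i, setting)
    p = np.array([[likelihood(y, th, setting) for y in outcomes]
                  for th in particles])
    p_y = w @ p  # marginal (predictive) outcome probabilities

    def entropy(q):
        return -np.sum(q * np.log(np.clip(q, 1e-300, None)), axis=-1)

    # I(Y; Theta) = H(Y) - E_theta[H(Y | theta)]
    return entropy(p_y) - w @ entropy(p)
```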
Safe deployment of autonomous robots in diverse scenarios requires agents that are capable of efficiently adapting to new environments while satisfying constraints. In this work, we propose a practical and theoretically justified approach to maintaining safety in the presence of dynamics uncertainty. Our approach leverages Bayesian meta-learning with last-layer adaptation. The expressiveness of neural-network features trained offline, paired with efficient last-layer online adaptation, enables the derivation of tight confidence sets which contract around the true dynamics as the model adapts online. We exploit these confidence sets to plan trajectories that guarantee the safety of the system. Our approach handles problems with high dynamics uncertainty, where reaching the goal safely is potentially infeasible at first, by first \textit{exploring} to gather data and reduce uncertainty, before autonomously \textit{exploiting} the acquired information to safely perform the task. Under reasonable assumptions, we prove that our framework guarantees the high-probability satisfaction of all constraints at all times jointly, i.e., over the total task duration. This theoretical analysis also motivates two regularizers of last-layer meta-learning models that improve online adaptation capabilities as well as performance by reducing the size of the confidence sets. We extensively demonstrate our approach in simulation and on hardware.
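The sketch below illustrates the last-layer adaptation mechanism: with features $\phi(x)$ from a frozen, meta-learned network, the last layer is a Bayesian linear model whose posterior (and hence the confidence set) contracts as online data arrives. The prior scale, noise level, and confidence radius `beta` are illustrative assumptions, not the paper's calibrated quantities.

```python
import numpy as np

# Hedged sketch of Bayesian last-layer adaptation: recursive updates of a
# linear-Gaussian posterior over the last-layer weights. Hyperparameters
# are illustrative assumptions.

class LastLayerPosterior:
    def __init__(self, dim, prior_var=1.0, noise_var=0.01):
        self.P = np.eye(dim) / prior_var  # posterior precision matrix
        self.b = np.zeros(dim)            # precision-weighted mean accumulator
        self.noise_var = noise_var

    def update(self, phi, y):
        """Rank-one update from one observation y = w^T phi + noise."""
        self.P += np.outer(phi, phi) / self.noise_var
        self.b += phi * y / self.noise_var

    def confidence_set(self, beta=2.0):
        """Posterior mean and per-coordinate radius of an ellipsoidal
        high-probability set; the set shrinks as more data is absorbed."""
        cov = np.linalg.inv(self.P)
        return cov @ self.b, beta * np.sqrt(np.diag(cov))
```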
Active inference is a unifying theory of perception and action resting upon the idea that the brain maintains an internal model of the world by minimizing free energy. From a behavioral perspective, active inference agents can be seen as self-evidencing beings that act to fulfill their optimistic predictions, namely preferred outcomes or goals. In contrast, reinforcement learning requires human-designed rewards to accomplish any desired outcome. Although active inference could provide a more natural self-supervised objective for control, its applicability has been limited by the difficulty of scaling the approach to complex environments. In this work, we propose a contrastive objective for active inference that strongly reduces the computational burden of learning the agent's generative model and planning future actions. Our method performs notably better than likelihood-based active inference in image-based tasks, while also being computationally cheaper and easier to train. We compare against reinforcement learning agents that have access to human-designed reward functions, showing that our approach closely matches their performance. Finally, we also show that contrastive methods perform significantly better in the presence of distractors in the environment and that our method is able to generalize goals to variations in the background.
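For reference, here is a minimal InfoNCE-style contrastive objective of the kind that can stand in for a generative likelihood: matching state/observation embeddings are scored against in-batch negatives. The encoders, normalization, and temperature are assumptions, not the paper's exact architecture.

```python
import numpy as np

# Hedged sketch of an InfoNCE-style contrastive loss over matching
# state/observation embedding pairs; negatives come from the batch.

def contrastive_loss(z_state, z_obs, temperature=0.1):
    """z_state, z_obs: (batch, dim) embeddings of matching pairs."""
    z_s = z_state / np.linalg.norm(z_state, axis=1, keepdims=True)
    z_o = z_obs / np.linalg.norm(z_obs, axis=1, keepdims=True)
    logits = z_s @ z_o.T / temperature  # (batch, batch) similarity scores
    # Diagonal entries are positives; the rest of each row acts as negatives.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```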
We study constrained reinforcement learning (CRL) from a novel perspective by setting constraints directly on state density functions, rather than on the value functions considered by previous works. State density has a clear physical and mathematical interpretation and is able to express a wide variety of constraints, such as resource limits and safety requirements. Density constraints can also avoid the time-consuming process of designing and tuning the cost functions required by value-function-based constraints to encode system specifications. We leverage the duality between density functions and Q functions to develop an effective algorithm that solves the density-constrained RL problem optimally while guaranteeing that the constraints are satisfied. We prove that the proposed algorithm converges to a near-optimal solution with a bounded error even when the policy update is imperfect. We use a set of comprehensive experiments to demonstrate the advantages of our approach over state-of-the-art CRL methods, on a wide range of density-constrained tasks as well as standard CRL benchmarks such as Safety-Gym.
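The sketch below shows the dual-ascent loop that density/Q duality suggests: a state-wise Lagrange multiplier acts as a reward penalty and is raised wherever the stationary state density of the current policy exceeds its cap. The `solve_mdp` and `estimate_density` subroutines are assumptions, not the paper's algorithm.

```python
import numpy as np

# Hedged sketch of primal-dual iteration for density-constrained RL.
# Subroutine interfaces are illustrative assumptions.

def density_constrained_rl(solve_mdp, estimate_density, rho_max, n_states,
                           iters=50, step=0.1):
    lam = np.zeros(n_states)  # one multiplier per state-density constraint
    policy = None
    for _ in range(iters):
        # Primal step: best policy for the multiplier-shaped reward.
        policy = solve_mdp(reward_shift=-lam)
        # Dual step: raise the price where the density cap is violated.
        rho = estimate_density(policy)
        lam = np.maximum(0.0, lam + step * (rho - rho_max))
    return policy, lam
```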
How to obtain good value estimates is one of the key problems in Reinforcement Learning (RL). Current value estimation methods, such as DDPG and TD3, suffer from unnecessary over- or underestimation bias. In this paper, we explore the potential of double actors, which have long been neglected, for better value function estimation in the continuous-control setting. First, we uncover and demonstrate the bias-alleviation property of double actors by building double actors upon a single critic and upon double critics, to handle the overestimation bias in DDPG and the underestimation bias in TD3, respectively. Next, we find, interestingly, that double actors help improve the exploration ability of the agent. Finally, to mitigate the uncertainty of the value estimates from double critics, we further propose to regularize the critic networks under the double-actors architecture, which gives rise to the Double Actors Regularized Critics (DARC) algorithm. Extensive experimental results on challenging continuous control tasks show that DARC significantly outperforms state-of-the-art methods with higher sample efficiency.
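For intuition, the sketch below builds a value target with double actors and double critics, plus a regularizer that penalizes disagreement between the critics. The combination rule (mean over critics, max over actors) and the weight `nu` are illustrative assumptions about this family of methods, not DARC's exact formulation.

```python
import numpy as np

# Hedged sketch of a double-actor / double-critic value target with a
# critic-agreement regularizer. Combination rule and weights are
# assumptions, not the exact DARC update.

def darc_style_target(critics, actors, s_next, r, gamma=0.99, nu=0.1):
    """critics: two callables Q(s, a); actors: two callables pi(s)."""
    # q[i, j] = value of actor i's proposed action under critic j.
    q = np.array([[Q(s_next, pi(s_next)) for Q in critics] for pi in actors])
    # Per-actor value: average the critics; keep the larger actor value
    # to counteract the underestimation that double critics can induce.
    v = q.mean(axis=1).max()
    target = r + gamma * v
    # Regularizer: pull the two critics toward each other to reduce the
    # uncertainty of the value estimate.
    critic_reg = nu * np.mean((q[:, 0] - q[:, 1]) ** 2)
    return target, critic_reg
```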
Machine learning methods are powerful in distinguishing different phases of matter in an automated way and provide a new perspective on the study of physical phenomena. We train a Restricted Boltzmann Machine (RBM) on data constructed from spin configurations sampled from the Ising Hamiltonian at different values of temperature and external magnetic field using Monte Carlo methods. From the trained machine we obtain the flow of iteratively reconstructed spin configurations, which faithfully reproduces the observables of the physical system. We find that the flow of the trained RBM approaches spin configurations with the maximal possible specific heat, which resemble the near-criticality region of the Ising model. In the special case of a vanishing magnetic field, the trained RBM converges to the critical point of the Renormalization Group (RG) flow of the lattice model. Our results suggest an alternative explanation of how the machine identifies the physical phase transitions: by recognizing certain properties of the configuration, such as the maximization of the specific heat, instead of directly associating the recognition procedure with the RG flow and its fixed points. From the reconstructed data we then deduce the critical exponent associated with the magnetization, finding satisfactory agreement with the actual physical value. We assume no prior knowledge about the criticality of the system or its Hamiltonian.
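The reconstruction "flow" can be illustrated as repeated Gibbs passes of a configuration through a trained RBM (visible to hidden to visible). Binary {0, 1} units, the trained weights `W` and biases `b`, `c`, and the number of flow steps are illustrative choices; Ising spins in {-1, +1} would be mapped to {0, 1} beforehand.

```python
import numpy as np

# Hedged sketch of the RBM reconstruction flow: each step resamples the
# hidden layer given the visible spins, then the visible layer given the
# hidden units. Encoding and step count are illustrative assumptions.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rbm_flow(v, W, b, c, steps=10, seed=0):
    """One reconstruction per step: v -> h -> v'. Returns the flowed state."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        p_h = sigmoid(v @ W + c)  # hidden activation probabilities
        h = (rng.random(p_h.shape) < p_h).astype(float)
        p_v = sigmoid(h @ W.T + b)  # visible reconstruction probabilities
        v = (rng.random(p_v.shape) < p_v).astype(float)
    return v
```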