Bio-inspired sensorimotor control systems may appeal to roboticists tackling the problems of multi-DOF humanoids and human-robot interaction. This paper presents a simple posture control concept from neuroscience called disturbance estimation and compensation, the DEC concept [1]. It provides human-like mechanical compliance due to low loop gain, tolerance of time delays, and automatic adjustment to changes in external disturbance scenarios. Its outstanding feature is that it uses feedback of multisensory disturbance estimates rather than 'raw' sensory signals for disturbance compensation. After proof-of-principle tests in 1- and 2-DOF posture control robots, we present here a generalized DEC control module for multi-DOF robots. In the control layout, one DEC module controls one DOF (modular control architecture). Modules of neighboring joints are synergistically interconnected using vestibular information in combination with joint angle and torque signals. These sensory interconnections allow each module to control the kinematics of the more distal links as if they were a single link. This modular design makes the complexity of the robot control scale linearly with the number of DOFs and gives high error robustness compared to monolithic control architectures. The presented concept uses Matlab/Simulink (The MathWorks, Natick, USA) for both model simulation and robot control, and will be made available as an open library.
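The core DEC idea, feeding back a fused disturbance estimate rather than raw sensor signals, can be illustrated on a single-DOF inverted pendulum. The following is a minimal sketch under assumed dynamics and gains (all names and values are illustrative, not the authors' Simulink implementation; the true angle stands in for the fused vestibular/joint estimate):

import numpy as np

# Single-DOF inverted-pendulum sketch of DEC-style control (illustrative only).
# A low-gain PD servo is combined with feedback of an estimated gravity
# disturbance torque, reconstructed from a vestibular-like tilt estimate.
J, m, g, h = 1.0, 1.0, 9.81, 1.0     # assumed inertia, mass, gravity, COM height
Kp, Kd = 2.0, 1.0                    # deliberately low servo gains
dt, a, a_dot = 0.001, 0.1, 0.0       # time step, initial tilt (rad), velocity

for _ in range(10000):               # simulate 10 s
    tilt_est = a                     # stands in for the multisensory estimate
    grav_est = m * g * h * np.sin(tilt_est)    # disturbance estimate
    torque = -Kp * a - Kd * a_dot - grav_est   # low-gain servo + compensation
    a_ddot = (m * g * h * np.sin(a) + torque) / J
    a_dot += a_ddot * dt
    a += a_dot * dt

print(f"final tilt: {a:.4f} rad")    # decays toward upright despite low gains

Because the estimated gravity torque is compensated directly, the servo gains can stay low, which is what yields the human-like compliance the abstract describes.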
The control of pneumatically driven soft robots typically requires electronics: microcontrollers are connected to power electronics that switch valves and pumps on and off. As a recent alternative, fluidic control methods have been introduced, in which soft digital logic gates permit multiple actuation states to be achieved in soft systems. Such systems have demonstrated autonomous behaviors without the use of electronics. However, fluidic controllers have required complex fabrication processes. To democratize the exploration of fluidic controllers, we developed tube-balloon logic circuitry, which consists of logic gates made from straws and balloons. Each tube-balloon logic device takes a novice five minutes to fabricate and costs $0.45. Tube-balloon logic devices can operate at pressures of up to 200 kPa and oscillate at frequencies of up to 15 Hz. We configure the tube-balloon logic device as NOT-, NAND-, and NOR-gates and assemble them into a three-ring oscillator to demonstrate a vibrating sieve that separates sugar from rice. Because tube-balloon logic devices are low-cost, easy to fabricate, and simple in their operating principle, they are well suited for exploring fundamental concepts of fluidic control schemes while encouraging design inquiry for pneumatically driven soft robots.
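Although the devices themselves are pneumatic, their behavior is Boolean, so the gate configurations and the three-ring oscillator can be sanity-checked in a few lines. A sketch of the idealized logic (pressure dynamics and switching delays are abstracted into one synchronous update step):

# Idealized Boolean model of the tube-balloon gates (illustrative only).
NOT  = lambda a: not a
NAND = lambda a, b: not (a and b)
NOR  = lambda a, b: not (a or b)

# A ring of three inverters has no stable state, so it oscillates;
# this is the principle behind the vibrating-sieve demonstration.
state = [False, True, True]
for step in range(9):
    # gate 0 reads gate 2, gate 1 reads gate 0, gate 2 reads gate 1
    state = [NOT(state[2]), NOT(state[0]), NOT(state[1])]
    print(step, [int(s) for s in state])   # cycles with period 6

In the physical circuit, the oscillation frequency is set by how fast the balloons inflate and deflate, which is why the devices top out around 15 Hz.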
Multi-agent path finding (MAPF) has been widely used to solve large-scale real-world problems, e.g., warehouse automation. Learning-based fully decentralized frameworks have been introduced to alleviate real-time constraints while pursuing an optimal planning policy. However, existing methods may generate significantly more vertex conflicts (called collisions), which lead to a low success rate or longer makespan. In this paper, we propose a PrIoritized COmmunication learning method (PICO), which incorporates implicit planning priorities into the communication topology within a decentralized multi-agent reinforcement learning framework. Combined with classic coupled planners, the implicit priority learning module can be used to form a dynamic communication topology, which also builds an effective collision-avoidance mechanism. PICO performs significantly better than state-of-the-art learning-based planners in large-scale multi-agent path finding tasks, in terms of both success rate and collision rate.
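A minimal sketch of the priority-to-topology step (the rule and thresholds are illustrative, not PICO's exact mechanism): given learned priority scores, each agent attends only to nearby agents with higher priority, yielding a dynamic directed communication graph.

import numpy as np

def communication_topology(positions, priorities, comm_range=3.0):
    """Directed adjacency: agent i receives from agent j iff j is within
    range and has a higher learned priority (illustrative rule only)."""
    n = len(priorities)
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            close = np.linalg.norm(positions[i] - positions[j]) <= comm_range
            adj[i, j] = close and priorities[j] > priorities[i]
    return adj  # adj[i, j]: i attends to j's message

positions = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
priorities = np.array([0.2, 0.9, 0.5])
print(communication_topology(positions, priorities))

Since priorities change as the agents move and replan, the topology is rebuilt each step, which is what makes it dynamic.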
Given a Markov decision process (MDP) and a linear-time ($\omega$-regular or LTL) specification, the controller synthesis problem aims to compute the optimal policy that satisfies the specification. More recently, problems that reason over the asymptotic behavior of systems have been proposed through the lens of steady-state planning. This entails finding a control policy for an MDP such that the Markov chain induced by the solution policy satisfies a given set of constraints on its steady-state distribution. This paper studies a generalization of the controller synthesis problem for a linear-time specification under steady-state constraints on the asymptotic behavior. We present an algorithm to find a deterministic policy satisfying $\omega$-regular and steady-state constraints by characterizing the solutions as an integer linear program, and experimentally evaluate our approach.
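For intuition, one standard way to encode such problems (sketched here in a simplified unichain form; the notation is ours, and the paper's full encoding over the product with the specification automaton is more involved) uses long-run state-action frequencies $x(s,a)$ as decision variables:
\begin{align*}
& x(s,a) \ge 0, \qquad \sum_{s,a} x(s,a) = 1, \\
& \sum_{a} x(s,a) = \sum_{s',a'} P(s \mid s',a')\, x(s',a') \quad \text{(stationarity / flow balance)}, \\
& \ell_s \le \sum_{a} x(s,a) \le u_s \quad \text{(steady-state constraints per state)}, \\
& x(s,a) \le \delta(s,a), \qquad \sum_{a} \delta(s,a) = 1, \qquad \delta(s,a) \in \{0,1\},
\end{align*}
where the binary indicators $\delta(s,a)$ force determinism, turning the linear program into an integer linear program; the synthesized policy plays the unique $a$ with $\delta(s,a) = 1$ in each state visited in the long run.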
Consider the problem of covertly controlling a linear system. In this problem, Alice desires to control (stabilize or change the parameters of) a linear system, while keeping an observer, Willie, unable to decide whether the system is being controlled or not. We formally define the problem under two different models: (i) Willie can only observe the system's output; (ii) Willie can directly observe the control signal. Focusing on AR(1) systems, we show that when Willie observes the system's output through a clean channel, an inherently unstable linear system cannot be covertly stabilized. However, an inherently stable linear system can be covertly controlled, in the sense of covertly changing its parameter. Moreover, we give direct and converse results for two important controllers: a minimal-information controller, where Alice is allowed to use only $1$ bit per sample, and a maximal-information controller, where Alice is allowed to view the real-valued output. Unlike covert communication, where the trade-off is between rate and covertness, the results reveal an interesting \emph{three-fold} trade-off in covert control: the amount of information used by the controller, control performance, and covertness. To the best of our knowledge, this is the first study to formally define covert control.
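To fix ideas, a minimal formalization consistent with the abstract (the notation is ours): the plant is the scalar autoregressive process
\[ x_{t+1} = a\,x_t + u_t + w_t, \qquad w_t \sim \mathcal{N}(0, \sigma^2), \]
where Alice picks the control $u_t$ and Willie runs a hypothesis test between $H_0$: $u_t \equiv 0$ and $H_1$: the system is controlled, based on either the output $x_1,\dots,x_n$ (model (i)) or the control $u_1,\dots,u_n$ (model (ii)). Covertness requires Willie's test to remain unreliable, e.g. by keeping the divergence between the observation distributions under $H_0$ and $H_1$ small as $n$ grows. Under model (i) with a clean channel, stabilizing an unstable system ($|a| > 1$) changes the output's growth behavior so markedly that the two hypotheses become distinguishable, which is the intuition behind the impossibility result above.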
While many works exploiting an existing Lie group structure have been proposed for state estimation, in particular the Invariant Extended Kalman Filter (IEKF), few papers address the construction of a group structure that allows casting a given system into the framework of invariant filtering. In this paper we introduce a large class of systems encompassing most problems involving a navigating vehicle encountered in practice. For those systems we introduce a novel methodology that systematically provides a group structure for the state space, including vectors of the body frame such as biases. We use it to derive observers having properties akin to those of linear observers or filters. The proposed unifying and versatile framework encompasses all systems where the IEKF has proved successful, improves upon the state-of-the-art "imperfect" IEKF for inertial navigation with sensor biases, and allows addressing novel examples, such as GNSS antenna lever arm estimation.
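The structural condition that makes this work in the invariant-filtering literature is group affinity of the dynamics: on a matrix Lie group $G$, dynamics $\frac{d}{dt}\chi = f_u(\chi)$ are group-affine if for all $\chi_1, \chi_2 \in G$
\[ f_u(\chi_1\chi_2) = f_u(\chi_1)\,\chi_2 + \chi_1\,f_u(\chi_2) - \chi_1\,f_u(\mathrm{Id})\,\chi_2 . \]
In that case the invariant error $\eta = \chi_1^{-1}\chi_2$ evolves autonomously, independently of the estimated trajectory, which is what gives invariant observers their linear-observer-like convergence properties. The methodology described in the abstract can thus be read as constructing, for each system in the class, a group structure under which the dynamics, now including body-frame vectors such as biases in the state, satisfy this property.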
We show that Gottesman's semantics (GROUP22, 1998) for Clifford circuits based on the Heisenberg representation can be treated as a type system that can efficiently characterize a common subset of quantum programs. Our applications include (i) certifying whether auxiliary qubits can be safely disposed of, (ii) determining if a system is separable across a given bi-partition, (iii) checking the transversality of a gate with respect to a given stabilizer code, and (iv) typing post-measurement states for computational basis measurements. Further, this type system is extended to accommodate universal quantum computing by deriving types for the $T$-gate, multiply-controlled unitaries such as the Toffoli gate, and some gate injection circuits that use associated magic states. These types allow us to prove a lower bound on the number of $T$ gates necessary to perform a multiply-controlled $Z$ gate.
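The Heisenberg-style propagation underlying such a type system can be sketched in a few lines: Clifford gates act on Pauli "types" by conjugation, $P \mapsto U P U^\dagger$. A toy single-qubit/CNOT fragment (illustrative, not the paper's type system; only signs $\pm 1$ arise as phases here):

# Conjugation action of a few Clifford gates on signed Pauli types,
# represented as (sign, pauli) pairs implementing P -> U P U^dagger.
H_RULES = {'X': (1, 'Z'), 'Y': (-1, 'Y'), 'Z': (1, 'X'), 'I': (1, 'I')}
S_RULES = {'X': (1, 'Y'), 'Y': (-1, 'X'), 'Z': (1, 'Z'), 'I': (1, 'I')}

def apply_1q(rules, sign, p):
    s, q = rules[p]
    return sign * s, q

# CNOT acts on two-qubit Pauli types; action on a few generators/products:
CNOT_RULES = {'XI': 'XX', 'IX': 'IX', 'ZI': 'ZI', 'IZ': 'ZZ',
              'II': 'II', 'XX': 'XI', 'ZZ': 'IZ'}

# Example: a qubit of type Z (a |0> stabilizer state) pushed through H
print(apply_1q(H_RULES, +1, 'Z'))   # -> (1, 'X'), the |+> stabilizer type

Because Clifford conjugation maps Paulis to Paulis, such a "type check" runs efficiently; the $T$-gate breaks this closure, which is why the extension to universal circuits requires new types.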
Promoting behavioural diversity is critical for solving games with non-transitive dynamics, where strategic cycles exist and there is no consistent winner (e.g., Rock-Paper-Scissors). Yet, there is a lack of rigorous treatment for defining diversity and constructing diversity-aware learning dynamics. In this work, we offer a geometric interpretation of behavioural diversity in games and introduce a novel diversity metric based on \emph{determinantal point processes} (DPP). By incorporating the diversity metric into best-response dynamics, we develop \emph{diverse fictitious play} and \emph{diverse policy-space response oracle} for solving normal-form games and open-ended games. We prove the uniqueness of the diverse best response and the convergence of our algorithms on two-player games. Importantly, we show that maximising the DPP-based diversity metric is guaranteed to enlarge the \emph{gamescape} -- the convex polytopes spanned by agents' mixtures of strategies. To validate our diversity-aware solvers, we test them on tens of games that show strong non-transitivity. Results suggest that our methods achieve much lower exploitability than state-of-the-art solvers by finding effective and diverse strategies.
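The DPP-based metric can be made concrete: represent each strategy by its payoff vector against a fixed population, form a similarity kernel, and score a strategy set by the determinant of its kernel submatrix, which grows with the volume spanned by the payoff vectors. A sketch with made-up payoffs (illustrative, not the paper's experiments):

import numpy as np

# Rows: each strategy's payoff vector against a fixed set of opponents.
M = np.array([[1.0, 0.0, 0.5],
              [0.9, 0.1, 0.5],    # near-duplicate of strategy 0
              [0.0, 1.0, 0.5]])   # genuinely different strategy

L = M @ M.T  # Gram (similarity) kernel over strategies

def dpp_diversity(idx):
    """det of the kernel submatrix: squared volume spanned by the payoffs."""
    sub = L[np.ix_(idx, idx)]
    return np.linalg.det(sub)

print(dpp_diversity([0, 1]))  # ~0.015: redundant pair scores low
print(dpp_diversity([0, 2]))  # ~1.5: diverse pair scores high

A redundant strategy adds a nearly linearly dependent row, so the determinant barely grows; this is exactly the geometric sense in which maximising the metric enlarges the gamescape.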
Autonomous urban driving navigation with complex multi-agent dynamics is under-explored due to the difficulty of learning an optimal driving policy. Traditional modular pipelines rely heavily on hand-designed rules and a pre-processing perception system, while supervised learning-based models are limited by the accessibility of extensive human experience. We present a general and principled Controllable Imitative Reinforcement Learning (CIRL) approach that enables the driving agent to achieve higher success rates based on only vision inputs in a high-fidelity car simulator. To alleviate the low exploration efficiency of a large continuous action space, which often prohibits the use of classical RL on challenging real tasks, CIRL explores over a reasonably constrained action space guided by encoded experiences that imitate human demonstrations, building upon the Deep Deterministic Policy Gradient (DDPG). Moreover, we propose specialized adaptive policies and steering-angle reward designs for different control signals (i.e. follow, straight, turn right, turn left) based on shared representations, improving the model's capability in tackling diverse cases. Extensive experiments on the CARLA driving benchmark demonstrate that CIRL substantially outperforms all previous methods in terms of the percentage of successfully completed episodes on a variety of goal-directed driving tasks. We also show its superior generalization capability in unseen environments. To our knowledge, this is the first learned driving policy trained via reinforcement learning in a high-fidelity simulator that performs better than supervised imitation learning.
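The command-conditioned specialization can be sketched as a shared encoder with one policy head per high-level command, the command acting as a switch rather than an extra input feature (an illustrative PyTorch sketch; layer sizes and names are assumptions, not the authors' code):

import torch
import torch.nn as nn

class BranchedPolicy(nn.Module):
    """Shared perception trunk + one actor head per command
    (0: follow, 1: straight, 2: turn right, 3: turn left)."""
    def __init__(self, feat_dim=256, act_dim=2, n_commands=4):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(512, feat_dim), nn.ReLU())
        self.heads = nn.ModuleList(
            [nn.Linear(feat_dim, act_dim) for _ in range(n_commands)])

    def forward(self, vision_feat, command):
        h = self.trunk(vision_feat)
        # evaluate all heads, then pick each sample's head by its command id
        out = torch.stack([head(h) for head in self.heads], dim=1)
        return out[torch.arange(h.size(0)), command]

policy = BranchedPolicy()
actions = policy(torch.randn(8, 512), torch.randint(0, 4, (8,)))
print(actions.shape)  # torch.Size([8, 2]), e.g. steering and throttle

Branching avoids forcing one head to average over contradictory behaviors (e.g. turning left vs. right at the same intersection), which is the failure mode of feeding the command in as a plain feature.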
Deep hierarchical reinforcement learning has gained a lot of attention in recent years due to its ability to produce state-of-the-art results in challenging environments where non-hierarchical frameworks fail to learn useful policies. However, as problem domains become more complex, deep hierarchical reinforcement learning can become inefficient, leading to longer convergence times and poor performance. We introduce the Deep Nested Agent framework, a variant of deep hierarchical reinforcement learning in which information from the main agent is propagated to the low-level $nested$ agent by incorporating this information into the nested agent's state. We demonstrate the effectiveness and performance of the Deep Nested Agent framework by applying it to three scenarios in Minecraft, comparing against a deep non-hierarchical single-agent framework as well as a deep hierarchical framework.
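The propagation mechanism is simple to state: the nested agent's observation is the environment observation concatenated with a signal from the main agent, such as its chosen sub-task. A minimal sketch with hypothetical names and encoding (the paper's exact encoding may differ):

import numpy as np

def nested_observation(env_obs, main_agent_signal, n_subtasks=4):
    """Augment the nested agent's state with the main agent's choice,
    here one-hot encoded (illustrative, not the paper's exact scheme)."""
    one_hot = np.zeros(n_subtasks)
    one_hot[main_agent_signal] = 1.0
    return np.concatenate([env_obs, one_hot])

print(nested_observation(np.array([0.3, -1.2, 0.7]), main_agent_signal=2))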
This paper introduces a novel neural network-based reinforcement learning approach for robot gaze control. Our approach enables a robot to learn and adapt its gaze control strategy for human-robot interaction without the use of external sensors or human supervision. The robot learns to focus its attention on groups of people from its own audio-visual experiences, independently of the number of people, their positions, and their physical appearances. In particular, we use a recurrent neural network architecture in combination with Q-learning to find an optimal action-selection policy; we pre-train the network using a simulated environment that mimics realistic scenarios involving speaking and silent participants, thus avoiding the need for tedious sessions of a robot interacting with people. Our experimental evaluation suggests that the proposed method is robust to parameter estimation, i.e., the parameter values yielded by the method do not have a decisive impact on performance. The best results are obtained when audio and visual information are used jointly. Experiments with the Nao robot indicate that our framework is a step towards the autonomous learning of socially acceptable gaze behavior.
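The described architecture, a recurrent network over audio-visual features trained with Q-learning on discrete gaze actions, can be sketched as a small deep recurrent Q-network (an illustrative PyTorch sketch; feature sizes and the action set are assumptions, not the authors' configuration):

import torch
import torch.nn as nn

class GazeDRQN(nn.Module):
    """LSTM over fused audio-visual features -> Q-values for discrete
    gaze actions (e.g. pan left/right, tilt up/down, stay)."""
    def __init__(self, av_feat_dim=64, hidden=128, n_actions=5):
        super().__init__()
        self.lstm = nn.LSTM(av_feat_dim, hidden, batch_first=True)
        self.q_head = nn.Linear(hidden, n_actions)

    def forward(self, av_seq, state=None):
        out, state = self.lstm(av_seq, state)
        return self.q_head(out[:, -1]), state  # Q-values at the last step

net = GazeDRQN()
q, _ = net(torch.randn(1, 10, 64))   # one 10-step audio-visual sequence
action = q.argmax(dim=-1)            # greedy gaze action
print(q.shape, action.item())

The recurrence lets the policy integrate evidence over time (who spoke recently, where people were last seen), which a feed-forward Q-network cannot do from a single frame.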