This paper aims to improve the path quality and computational efficiency of sampling-based kinodynamic planners for vehicular navigation. It proposes a learning framework for identifying promising controls during the expansion process of sampling-based planners. Given a dynamics model, a reinforcement learning process is trained offline to return a low-cost control that reaches a local goal state (i.e., a waypoint) in the absence of obstacles. Because this process focuses on the system's dynamics and is agnostic to the environment, it is data-efficient and needs to take place only once per robotic system, so it can be reused across environments. Online, the planner generates local goal states for the learned controller both in an informed manner, to bias expansion toward the global goal, and in an exploratory, random manner. For the informed expansion, local goal states are generated either via (a) medial-axis information in environments with obstacles, or (b) wavefront information in setups with traversability costs. The learning process and the resulting planning framework are evaluated on first- and second-order differential-drive systems, as well as a physically simulated Segway robot. The results show that the proposed integration of learning and planning produces higher-quality paths than sampling-based kinodynamic planning with random controls, in fewer iterations and less computation time.
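As a rough illustration of the expansion scheme described above, the following sketch (not the authors' code) queries a learned goal-reaching controller instead of sampling random controls; `policy` and `propagate` are hypothetical stand-ins for the offline-trained controller and the given dynamics model.

```python
import numpy as np

def sample_local_goal(goal, rng, p_informed=0.5):
    """Alternate between informed (goal-biased) and exploratory local goals."""
    if rng.random() < p_informed:
        return goal + rng.normal(scale=0.5, size=goal.shape)   # informed
    return rng.uniform(-10.0, 10.0, size=goal.shape)           # exploratory

def expand(tree, goal, policy, propagate, rng, steps=20, dt=0.05):
    """One expansion: select a node, query the learned goal-reaching
    controller for a control toward a waypoint, and propagate forward."""
    node = min(tree, key=lambda s: float(np.linalg.norm(s - goal)))
    waypoint = sample_local_goal(goal, rng)
    state = node.copy()
    for _ in range(steps):
        u = policy(state, waypoint)       # offline-trained, obstacle-agnostic
        state = propagate(state, u, dt)   # given dynamics model
    tree.append(state)
    return state
```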
In this paper, we present a complete autonomous navigation pipeline for unstructured outdoor environments. The main contribution of this work is the path planning module, which we divide into two layers: Global Path Planning (GPP) and Local Path Planning (LPP). For environment representation, instead of complex and heavy grid maps, the GPP layer uses road-network information obtained directly from OpenStreetMap (OSM). In the LPP layer, we use a novel Naive-Valley-Path (NVP) method to generate, in real time, a local path that avoids obstacles on the road. This approach builds a naive representation of the local environment from a LiDAR sensor and applies a naive optimization that exploits the concept of "valley" areas in the cost map. We demonstrate the system's robustness experimentally on our research platform BLUE, driving autonomously across the University of Alicante Scientific Park for more than 20 km in a 12.33 ha area.
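The "valley" idea can be illustrated with a minimal sketch (our construction, not the NVP implementation): for each lookahead row of a local cost map, steer toward the lowest-cost cell near the previous choice.

```python
import numpy as np

def valley_path(cost_map, start_col, window=5):
    """cost_map: rows = increasing distance ahead, cols = lateral cells."""
    rows, cols = cost_map.shape
    path, col = [], start_col
    for r in range(rows):
        lo, hi = max(0, col - window), min(cols, col + window + 1)
        col = lo + int(np.argmin(cost_map[r, lo:hi]))  # stay in the valley
        path.append((r, col))
    return path

# toy cost map: high cost near the edges (obstacles), low-cost central valley
cm = np.abs(np.arange(11) - 5)[None, :] * np.ones((20, 1))
print(valley_path(cm, start_col=2)[:5])
```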
In this work, we consider the problem of mobile robots that need to manipulate or transport an object via cables or robotic arms. We consider the scenario where the number of manipulating robots is redundant, i.e., a desired object configuration can be obtained by different configurations of the robots. The objective of this work is to show that communication can be used to implement cooperative local feedback controllers in the robots that improve disturbance rejection and reduce structural stress in the object. In particular, we consider the realistic scenario where measurements are sampled and transmitted over wireless links, and the sampling period is comparable with the time constants of the system dynamics. We first propose a kinematic model that is consistent with the overall system dynamics under high-gain control, and we then provide sufficient conditions for the exponential stability and monotonic decrease of the configuration error under different norms. Finally, we test the proposed controllers on the full dynamical system, showing the benefits of local communication.
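A toy sampled-data illustration of the role of communication (our construction, not the paper's model): redundant robots whose average position sets the object configuration, with a communicated consensus term that reduces internal stress while a shared term drives the object to its goal.

```python
import numpy as np

def step(x, x_des, k_obj=1.0, k_comm=0.5, T=0.1):
    """One sampling period T; x is (n_robots, 2), x_des the object goal."""
    obj = x.mean(axis=0)                   # object configuration (kinematic)
    u_obj = k_obj * (x_des - obj)          # shared term: drive object to goal
    u_comm = k_comm * (obj - x)            # communicated term: reduce stress
    return x + T * (u_obj + u_comm)

x = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 2.0]])
for _ in range(100):
    x = step(x, x_des=np.array([3.0, 3.0]))
print(x.mean(axis=0))                      # converges to (3, 3) for small T
```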
Completely positive and trace-preserving maps characterize physically implementable quantum operations. On the other hand, general linear maps, such as positive but not completely positive maps, which cannot be physically implemented, are fundamental ingredients in quantum information from both theoretical and practical perspectives. This raises the question of how well one can simulate or approximate the action of a general linear map by physically implementable operations. In this work, we introduce a systematic framework to resolve this task using the quasiprobability decomposition technique. We decompose a target linear map into a linear combination of physically implementable operations and introduce the physical implementability measure as the least amount of negativity the quasiprobability decomposition must contain; this directly quantifies the cost of simulating a given map using physically implementable quantum operations. We show that this measure is efficiently computable by semidefinite programs and prove several of its properties, such as faithfulness, additivity, and unitary invariance. We derive lower and upper bounds in terms of the trace norm of the Choi operator and obtain analytic expressions for several linear maps of practical interest. Furthermore, we endow this measure with an operational meaning in the quantum error mitigation scenario: it gives a lower bound on the sampling cost achievable via the quasiprobability decomposition technique. In particular, for parallel quantum noises, we show that global error mitigation has no advantage over local error mitigation.
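A minimal sketch of the quasiprobability decomposition in a restricted setting (our illustration: we optimize over a fixed basis of four Pauli channels rather than over all CPTP maps, as the paper's SDP does), with the transpose map, which is positive but not completely positive, as the target.

```python
import numpy as np
import cvxpy as cp

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
omega = np.eye(2, dtype=complex).reshape(4)        # unnormalized |Omega>

def choi_unitary(U):
    """Choi matrix of the channel rho -> U rho U^dagger."""
    v = np.kron(I2, U) @ omega                     # (id x U)|Omega>
    return np.outer(v, v.conj())

basis = [choi_unitary(P) for P in (I2, X, Y, Z)]   # implementable operations
J_target = np.array([[1, 0, 0, 0], [0, 0, 1, 0],   # Choi of the transpose
                     [0, 1, 0, 0], [0, 0, 0, 1]], dtype=complex)  # = SWAP

q = cp.Variable(4)                                 # quasiprobability weights
constraint = [sum(q[i] * basis[i] for i in range(4)) == J_target]
prob = cp.Problem(cp.Minimize(cp.norm1(q)), constraint)
prob.solve()
# log ||q||_1 is the negativity (sampling-cost exponent) for this basis
print(q.value, np.log(prob.value))
```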
With the advancement of affordable self-driving vehicles that rely on complicated nonlinear optimization but have limited computation resources, computation time becomes a matter of concern. Other factors, such as actuator dynamics and the cost of processing actuator commands, also unavoidably cause delays. In high-speed scenarios, these delays are critical to the safety of a vehicle. Recent works consider these delays individually, but none unifies them all in the context of autonomous driving. Moreover, recent works inappropriately treat computation time as a constant or a large upper bound, which makes the control either less responsive or over-conservative. To deal with all these delays, we present a unified framework that 1) models actuation dynamics, 2) uses robust tube model predictive control, and 3) uses a novel adaptive Kalman filter that assumes neither a known process model nor known noise covariance, making the controller safe while minimizing conservativeness. On one hand, our approach can serve as a standalone controller; on the other hand, it provides a safety guard for a high-level controller that assumes no delay. This can be used to compensate for the sim-to-real gap when deploying, on practical vehicle systems, a black-box learning-enabled controller trained in a simplistic, delay-free environment.
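The adaptive-filtering ingredient can be illustrated with a minimal innovation-based sketch (our construction; the paper's filter and its integration with tube MPC are not shown): the noise covariances are refined online from innovation statistics instead of being assumed known.

```python
import numpy as np

def adaptive_kf(zs, A, H, x0, P0, alpha=0.98):
    """Kalman filter that adapts Q and R online from the innovations."""
    n = x0.size
    x, P = x0.copy(), P0.copy()
    Q = np.eye(n) * 1e-3                 # initial guesses, refined online
    R = np.eye(H.shape[0]) * 1e-1
    estimates = []
    for z in zs:
        x, P = A @ x, A @ P @ A.T + Q                    # predict
        y = z - H @ x                                    # innovation
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x, P = x + K @ y, (np.eye(n) - K @ H) @ P        # update
        # innovation-based adaptation (a practical implementation would
        # additionally project Q and R back onto positive-definite matrices)
        R = alpha * R + (1 - alpha) * (np.outer(y, y) - H @ P @ H.T)
        Q = alpha * Q + (1 - alpha) * (K @ np.outer(y, y) @ K.T)
        estimates.append(x.copy())
    return estimates
```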
Drift control is important for the safety of autonomous vehicles when there is a sudden loss of traction due to external conditions such as rain or snow. It is a challenging control problem because of the significant sideslip and nearly full saturation of the tires. In this paper, we focus on the control of drift maneuvers following circular paths with either fixed or moving centers, subject to changes in the tire-ground interaction; these are common training tasks for drift enthusiasts and can therefore serve as benchmarks for drift-control performance. To achieve these tasks, we propose a novel hierarchical control architecture that decouples the curvature and center control of the trajectory. In particular, an outer loop stabilizes the center by tuning the target curvature, and an inner loop tracks the curvature using a feedforward/feedback controller enhanced by an $\mathcal{L}_1$ adaptive component. The hierarchical architecture is flexible because the inner loop is task-agnostic and adaptive to changes in the tire-road interaction, which allows the outer loop to be designed independently of the low-level dynamics and opens up the possibility of incorporating sophisticated planning algorithms. We implement our control strategy on a simulation platform as well as on a 1/10-scale radio-controlled (RC) car, and both simulation and experimental results illustrate the effectiveness of our strategy in achieving the drift maneuvering tasks described above.
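A minimal sketch of the hierarchical structure (the gains, the feedforward map, and the vehicle model are illustrative placeholders, and the $\mathcal{L}_1$ adaptive augmentation is omitted):

```python
def outer_loop(center_err, kappa_nom, k_c=0.3):
    """Stabilize the path center by tuning the target curvature."""
    return kappa_nom + k_c * center_err

def inner_loop(kappa_tgt, kappa_meas, ff_map, k_p=2.0):
    """Track curvature with feedforward from an equilibrium map plus
    feedback (the paper adds an L1 adaptive component on top of this)."""
    return ff_map(kappa_tgt) + k_p * (kappa_tgt - kappa_meas)

# usage with a toy linear feedforward map
steer = inner_loop(outer_loop(0.05, 0.5), 0.45, ff_map=lambda k: 0.8 * k)
```

Because the inner loop only sees a target curvature, the outer loop (or a planner replacing it) never needs to know the low-level dynamics.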
We develop an autonomous navigation algorithm for a robot operating in two-dimensional environments cluttered with obstacles of arbitrary convex shape. The proposed navigation approach relies on hybrid feedback to guarantee global asymptotic stabilization of the robot toward a predefined target location while ensuring the forward invariance of the obstacle-free workspace. The main idea consists of designing an appropriate switching strategy between a move-to-target mode and an obstacle-avoidance mode based on the robot's proximity to the nearest obstacle. The proposed hybrid controller generates continuous velocity input trajectories when the robot is initialized away from the boundaries of the unsafe regions. Finally, we provide an algorithmic procedure for the sensor-based implementation of the proposed hybrid controller and validate its effectiveness through simulation results.
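A minimal sketch of the switching logic (the hysteresis thresholds and the avoidance vector field below are illustrative, not the paper's construction):

```python
import numpy as np

MOVE, AVOID = 0, 1

def hybrid_control(x, target, nearest_obs, mode, d_in=0.5, d_out=0.8, k=1.0):
    """Hysteresis switching between the two modes avoids chattering."""
    d = float(np.linalg.norm(x - nearest_obs))
    if mode == MOVE and d < d_in:
        mode = AVOID
    elif mode == AVOID and d > d_out:
        mode = MOVE
    if mode == MOVE:
        u = -k * (x - target)                   # move-to-target mode
    else:
        n = (x - nearest_obs) / d               # unit vector away from obstacle
        t = np.array([-n[1], n[0]])             # tangent: slide around it
        u = k * t + 0.2 * k * n                 # obstacle-avoidance mode
    return u, mode
```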
Imitation learning aims to extract knowledge from the demonstrations of human experts or artificially created agents in order to replicate their behaviors. Its success has been demonstrated in areas such as video games, autonomous driving, robotic simulations, and object manipulation. However, the replication process can be problematic: performance is highly dependent on demonstration quality, and most trained agents perform well only in task-specific environments. In this survey, we provide a systematic review of imitation learning. We first introduce the background, covering the field's development history and preliminaries, and then present the different taxonomies within imitation learning and the key milestones of the field. We then detail the challenges in learning strategies and present research opportunities, including learning policies from suboptimal demonstrations and from voice instructions, along with other associated optimization schemes.
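The basic replication idea can be illustrated by behavioral cloning, the simplest form of imitation learning covered by such surveys (the data below are toy stand-ins, not from the survey):

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.normal(size=(1000, 4))                  # expert-visited states
A = S @ np.array([0.5, -1.0, 0.2, 0.8])         # expert actions (unknown map)
W = np.linalg.lstsq(S, A, rcond=None)[0]        # fit a policy by regression
print(np.abs(S @ W - A).max())                  # near zero: behavior replicated
```

This also hints at the quality-dependence issue: the fitted policy can only be as good as the demonstrations it regresses on.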
Autonomous urban driving navigation with complex multi-agent dynamics is under-explored due to the difficulty of learning an optimal driving policy. The traditional modular pipeline relies heavily on hand-designed rules and a pre-processing perception system, while supervised learning-based models are limited by the availability of extensive human driving experience. We present a general and principled Controllable Imitative Reinforcement Learning (CIRL) approach that enables the driving agent to achieve higher success rates based on vision inputs alone in a high-fidelity car simulator. To alleviate the low exploration efficiency in large continuous action spaces, which often prohibits the use of classical RL on challenging real tasks, CIRL explores over a reasonably constrained action space guided by encoded experiences that imitate human demonstrations, building upon Deep Deterministic Policy Gradient (DDPG). Moreover, we propose specialized adaptive policies and steering-angle reward designs for the different control signals (i.e., follow, straight, turn right, turn left), based on shared representations, to improve the model's ability to tackle diverse cases. Extensive experiments on the CARLA driving benchmark demonstrate that CIRL substantially outperforms all previous methods in terms of the percentage of successfully completed episodes on a variety of goal-directed driving tasks. We also show its superior generalization capability in unseen environments. To our knowledge, this is the first case of a driving policy learned through reinforcement learning in a high-fidelity simulator that performs better than supervised imitation learning.
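A minimal sketch of the constrained-exploration idea (our illustration; the actual CIRL actor-critic, DDPG updates, and CARLA interface are omitted): explore only in a band around the action of an actor warm-started on demonstrations, rather than over the full continuous action space.

```python
import numpy as np

def cirl_style_action(actor, state, rng, eps=0.1, bound=0.3):
    """Perturb the imitation-guided action within a bounded band."""
    a_imit = actor(state)                  # actor warm-started on demonstrations
    noise = rng.normal(scale=eps, size=a_imit.shape)
    return np.clip(a_imit + noise, a_imit - bound, a_imit + bound)
```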
This paper proposes a Reinforcement Learning (RL) algorithm to synthesize policies for a Markov Decision Process (MDP) such that a linear-time property is satisfied. We convert the property into a Limit-Deterministic Büchi Automaton (LDBA) and then construct a product MDP between the automaton and the original MDP. A reward function is then assigned to the states of the product MDP according to the accepting conditions of the LDBA. With this reward function, our algorithm synthesizes a policy that satisfies the linear-time property: as such, the policy synthesis procedure is "constrained" by the given specification. Additionally, we show that the RL procedure sets up an online value iteration method to calculate the maximum probability of satisfying the given property at any given state of the MDP; a convergence proof for the procedure is provided. Finally, the performance of the algorithm is evaluated via a set of numerical examples. We observe an improvement of one order of magnitude in the number of iterations required for synthesis compared to existing approaches.
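A minimal sketch of one step in the product MDP with the induced reward (the transition function `T`, automaton transition `delta`, labeling function `label`, and accepting set are hypothetical stand-ins):

```python
def product_step(s, q, a, T, delta, label, accepting):
    """One step in the product MDP: move the MDP, then the automaton."""
    s_next = T(s, a)                         # MDP transition
    q_next = delta(q, label(s_next))         # LDBA reads the new state's label
    r = 1.0 if q_next in accepting else 0.0  # reward from accepting conditions
    return (s_next, q_next), r
```

Any standard RL algorithm run over the pairs `(s, q)` with this reward then maximizes the probability of satisfying the specification.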
Although reinforcement learning methods can achieve impressive results in simulation, the real world presents two major challenges: generating samples is exceedingly expensive, and unexpected perturbations can cause proficient but narrowly learned policies to fail at test time. In this work, we propose to learn how to quickly and effectively adapt online to new situations as well as to perturbations. To enable sample-efficient meta-learning, we consider learning online adaptation in the context of model-based reinforcement learning. Our approach trains a global model such that, when combined with recent data, the model can be rapidly adapted to the local context. Our experiments demonstrate that our approach enables simulated agents to adapt their behavior online to novel terrains, to a crippled leg, and to highly dynamic environments.
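A minimal sketch of the online-adaptation step (our illustration; the meta-training that makes such updates effective is not shown): take a few gradient steps on the most recent transitions before planning with the adapted model, where `grad` is a hypothetical stand-in for the gradient of the model's prediction loss.

```python
import numpy as np

def adapt(theta, recent, grad, lr=0.01, steps=5):
    """theta: global model parameters; recent: list of (s, a, s_next)."""
    th = theta.copy()
    for _ in range(steps):
        g = sum(grad(th, s, a, s_next) for s, a, s_next in recent)
        th = th - lr * g / len(recent)     # adapt to the local context
    return th                              # plan with the adapted model
```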