Jointly achieving safety and efficiency in human-robot interaction (HRI) settings is a challenging problem, as the robot's planning objectives may be at odds with the human's own intent and expectations. Recent approaches ensure safe robot operation in uncertain environments through a supervisory control scheme, sometimes called "shielding", which overrides the robot's nominal plan with a safety fallback strategy when a safety-critical event is imminent. These reactive "last-resort" strategies (typically in the form of aggressive emergency maneuvers) focus on preserving safety without efficiency considerations; when the nominal planner is unaware of possible safety overrides, shielding can be activated more frequently than necessary, leading to degraded performance. In this work, we propose a new shielding-based planning approach that allows the robot to plan efficiently by explicitly accounting for possible future shielding events. Leveraging recent work on Bayesian human motion prediction, the resulting robot policy proactively balances nominal performance with the risk of high-cost emergency maneuvers triggered by low-probability human behaviors. We formalize Shielding-Aware Robust Planning (SHARP) as a stochastic optimal control problem and propose a computationally efficient framework for finding tractable approximate solutions at runtime. Our method outperforms the shielding-agnostic motion planning baseline (equipped with the same human intent inference scheme) on simulated driving examples with human trajectories taken from the recently released Waymo Open Motion Dataset.
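A minimal sketch of the shielding-aware selection idea: each candidate nominal plan is scored by its nominal cost plus the probability-weighted cost of a shielding override, where the probability would come from the Bayesian human-behavior predictor. The candidate plans, costs, and probabilities below are illustrative assumptions, not the paper's formulation.

SHIELD_COST = 50.0  # assumed penalty of executing the emergency maneuver

candidates = [
    # (plan name, nominal cost, probability that the shield fires along the plan)
    ("assertive merge", 2.0, 0.15),
    ("gentle yield",    4.0, 0.01),
    ("hard stop",       9.0, 0.00),
]

def shielding_aware_cost(nominal_cost, p_shield):
    # Expected cost = nominal performance + risk of a high-cost safety override.
    return nominal_cost + p_shield * SHIELD_COST

best = min(candidates, key=lambda c: shielding_aware_cost(c[1], c[2]))
print("selected plan:", best[0])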
We study planning problems for dynamical systems with uncertainty caused by measurement and process noise. Measurement noise causes limited observability of system states, and process noise causes uncertainty in the outcome of a given control. The problem is to find a controller that guarantees that the system reaches a desired goal state in finite time while avoiding obstacles, with at least some required probability. Due to the noise, this problem does not admit exact algorithmic or closed-form solutions in general. Our key contribution is a novel planning scheme that employs Kalman filtering as a state estimator to obtain a finite-state abstraction of the dynamical system, which we formalize as a Markov decision process (MDP). By extending this MDP with intervals of probabilities, we enhance the robustness of the model against numerical imprecision in approximating the transition probabilities. For this so-called interval MDP (iMDP), we employ state-of-the-art verification techniques to efficiently compute plans that maximize the probability of reaching goal states. We show the correctness of the abstraction and provide several optimizations that aim to balance the quality of the plan and the scalability of the approach. We demonstrate that our method is able to handle systems with a 6-dimensional state that result in iMDPs with tens of thousands of states and millions of transitions.
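A sketch of one way interval-valued transition probabilities can be obtained from samples of the noisy successor state, using a simple Hoeffding-style slack term; the paper's exact interval construction, abstraction regions, and dynamics are not reproduced here.

import numpy as np

def interval_transition_probs(successor_samples, regions, delta=0.05):
    # Estimate an interval [p_low, p_high] per abstract region from Monte Carlo
    # samples of the successor state (sketch only; Hoeffding-style slack).
    n = len(successor_samples)
    eps = np.sqrt(np.log(2.0 / delta) / (2.0 * n))
    intervals = {}
    for name, (lo, hi) in regions.items():
        p_hat = np.mean([(lo <= s) and (s < hi) for s in successor_samples])
        intervals[name] = (max(0.0, p_hat - eps), min(1.0, p_hat + eps))
    return intervals

# Toy 1-D dynamics: x' = x + u + w with process noise w ~ N(0, 0.1^2).
rng = np.random.default_rng(0)
samples = 1.0 + 0.5 + rng.normal(0.0, 0.1, size=1000)
regions = {"goal": (1.4, 1.6), "left": (-np.inf, 1.4), "right": (1.6, np.inf)}
print(interval_transition_probs(samples, regions))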
Task and motion planning problems in robotics typically combine symbolic planning over discrete task variables with motion optimization over continuous state and action variables, resulting in trajectories that satisfy the logical constraints imposed on the task variables. Symbolic planning can scale exponentially with the number of task variables, so recent works such as PDDLStream have focused on optimistic planning with an incrementally growing set of objects and facts until a feasible trajectory is found. However, this set is expanded exhaustively and uniformly in a breadth-first manner, regardless of the geometric structure of the problem at hand, which makes long-horizon reasoning with large numbers of objects prohibitively time-consuming. To address this issue, we propose a geometrically informed symbolic planner that expands the set of objects and facts in a best-first manner, prioritized by a Graph Neural Network-based score learned from prior search computations. We evaluate our approach on a diverse set of problems and demonstrate an improved ability to plan in large or difficult scenarios. We also apply our algorithm to a 7-DOF robotic arm in several block-stacking manipulation tasks.
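A sketch of the best-first expansion strategy: candidate objects and facts sit in a priority queue and the one with the highest learned relevance score is added next, instead of growing the set breadth-first. The hand-written score_fn below stands in for the Graph Neural Network, and the toy facts are assumptions.

import heapq

def best_first_expansion(initial_facts, candidate_facts, score_fn, is_feasible):
    # Grow the optimistic fact set best-first: pop the highest-scored candidate
    # until the downstream planner reports a feasible trajectory.
    facts = set(initial_facts)
    # heapq is a min-heap, so negate the score to pop the best candidate first.
    frontier = [(-score_fn(f, facts), i, f) for i, f in enumerate(candidate_facts)]
    heapq.heapify(frontier)
    while frontier and not is_feasible(facts):
        _, _, fact = heapq.heappop(frontier)
        facts.add(fact)
    return facts

# Toy usage with a hand-written score standing in for the learned GNN score.
facts = best_first_expansion(
    initial_facts=["on(a, table)"],
    candidate_facts=["reachable(a)", "reachable(b)", "clear(a)"],
    score_fn=lambda f, _: 1.0 if "a" in f else 0.1,
    is_feasible=lambda fs: {"on(a, table)", "reachable(a)", "clear(a)"} <= fs,
)
print(facts)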
We present Neural A*, a novel data-driven search method for path planning problems. Despite the recent increasing attention to data-driven path planning, a machine learning approach to search-based planning remains challenging due to the discrete nature of search algorithms. In this work, we reformulate the canonical A* search algorithm to be differentiable and couple it with a convolutional encoder to form an end-to-end trainable neural network planner. Neural A* solves a path planning problem by encoding a problem instance into a guidance map and then performing differentiable A* search with that guidance map. By learning to match the search results with ground-truth paths provided by experts, Neural A* can produce paths consistent with the ground truth accurately and efficiently. Our extensive experiments confirmed that Neural A* outperformed state-of-the-art data-driven planners in terms of the trade-off between search optimality and efficiency and, furthermore, successfully predicted realistic human trajectories by directly performing search-based planning on natural image inputs.
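A sketch of the search half of the idea: plain A* over a grid whose per-cell traversal cost is read from a guidance map. The convolutional encoder that predicts the map and the differentiable relaxation of the search used for training are omitted; the grid and costs below are made up.

import heapq

def guided_a_star(grid_cost, start, goal):
    # A* where expanding into a cell costs the guidance-map value at that cell.
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    rows, cols = len(grid_cost), len(grid_cost[0])
    open_set = [(h(start), 0.0, start, [start])]
    seen = set()
    while open_set:
        _, g, node, path = heapq.heappop(open_set)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in seen:
                ng = g + grid_cost[nr][nc]
                heapq.heappush(open_set, (ng + h((nr, nc)), ng, (nr, nc), path + [(nr, nc)]))
    return None

guidance = [[1, 1, 9],
            [9, 1, 9],
            [9, 1, 1]]
print(guided_a_star(guidance, (0, 0), (2, 2)))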
Retrosynthetic planning is a critical task in organic chemistry which identifies a series of reactions that can lead to the synthesis of a target product. The vast number of possible chemical transformations makes the search space enormous, and retrosynthetic planning is challenging even for experienced chemists. However, existing methods either require expensive return estimation by rollouts with high variance, or optimize for search speed rather than solution quality. In this paper, we propose Retro*, a neural-based A*-like algorithm that finds high-quality synthetic routes efficiently. It maintains the search as an AND-OR tree and learns a neural search bias from off-policy data. Guided by this neural network, it then performs an efficient best-first search during new planning episodes. Experiments on benchmark USPTO datasets show that our proposed method outperforms the existing state of the art in both success rate and solution quality, while also being more efficient.
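A sketch of the best-first selection rule: a partial route is prioritized by the reaction cost paid so far plus learned value estimates of the molecules still left to synthesize. The toy reaction templates, purchasable set, and value function below are assumptions, and Retro*'s AND-OR tree bookkeeping and value updates are omitted.

import heapq

PURCHASABLE = {"A", "B"}                     # toy building blocks
TEMPLATES = {                                # toy one-step retro reactions: product -> (cost, precursors)
    "T": [(1.0, ("C", "B")), (3.0, ("A", "A"))],
    "C": [(1.0, ("A", "B"))],
}

def value_fn(mol):
    # Stand-in for the learned estimate of the cost to synthesize a molecule.
    return 0.0 if mol in PURCHASABLE else 1.0

def retro_star_like(target, max_iters=100):
    # Frontier entries: (estimated total cost, cost so far, open molecules, route).
    frontier = [(value_fn(target), 0.0, (target,), [])]
    for _ in range(max_iters):
        if not frontier:
            break
        _, cost, open_mols, route = heapq.heappop(frontier)
        if not open_mols:
            return route                     # all open molecules reduced to purchasable ones
        mol, rest = open_mols[0], open_mols[1:]
        for rxn_cost, precursors in TEMPLATES.get(mol, []):
            new_open = rest + tuple(p for p in precursors if p not in PURCHASABLE)
            new_cost = cost + rxn_cost
            estimate = new_cost + sum(value_fn(m) for m in new_open)
            heapq.heappush(frontier, (estimate, new_cost, new_open, route + [(mol, precursors)]))
    return None

print(retro_star_like("T"))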
We present R-LINS, a lightweight robocentric lidar-inertial state estimator that estimates robot ego-motion using a 6-axis IMU and a 3D lidar in a tightly coupled scheme. To achieve robustness and computational efficiency even in challenging environments, we design an iterated error-state Kalman filter (ESKF) that recursively corrects the state by repeatedly generating new feature correspondences. Moreover, we adopt a novel robocentric formulation in which the state estimator is expressed with respect to a moving local frame, rather than a fixed global frame as in standard world-centric lidar-inertial odometry (LIO), in order to prevent filter divergence and lower the computational cost. To validate generalizability and long-term practicality, we perform extensive experiments in indoor and outdoor scenarios. The results indicate that R-LINS outperforms lidar-only and loosely coupled algorithms, and achieves performance competitive with state-of-the-art LIO while being close to an order of magnitude faster.
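A generic iterated Kalman measurement update, sketched with a dense-state filter for brevity: the estimate is relinearized at every iteration, which is the role that regenerating feature correspondences plays in R-LINS. The error-state and robocentric details, and the lidar measurement model, are abstracted into h and its Jacobian H_fn.

import numpy as np

def iterated_kalman_update(x0, P, h, H_fn, R, z, iters=5):
    # Gauss-Newton style iterated update: relinearize around the refreshed
    # estimate, apply the correction, and stop once the update is tiny.
    x = x0.copy()
    for _ in range(iters):
        H = H_fn(x)
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        dx = K @ (z - h(x)) + (np.eye(len(x0)) - K @ H) @ (x0 - x)
        x = x + dx
        if np.linalg.norm(dx) < 1e-6:
            break
    P = (np.eye(len(x0)) - K @ H) @ P        # covariance update after convergence
    return x, P

# Toy example: a single range measurement of a 2-D position state.
h = lambda x: np.array([np.linalg.norm(x)])
H_fn = lambda x: (x / np.linalg.norm(x)).reshape(1, -1)
x, P = iterated_kalman_update(np.array([1.0, 1.0]), np.eye(2) * 0.5,
                              h, H_fn, np.array([[0.01]]), np.array([1.8]))
print(x)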
Autonomous urban driving with complex multi-agent dynamics is under-explored due to the difficulty of learning an optimal driving policy. Traditional modular pipelines rely heavily on hand-designed rules and a pre-processing perception system, while supervised learning-based models are limited by the availability of extensive human driving experience. We present a general and principled Controllable Imitative Reinforcement Learning (CIRL) approach that enables the driving agent to achieve higher success rates from vision inputs alone in a high-fidelity car simulator. To alleviate the low exploration efficiency in large continuous action spaces, which often prohibits the use of classical RL on challenging real tasks, CIRL explores over a reasonably constrained action space guided by encoded experiences that imitate human demonstrations, building upon the Deep Deterministic Policy Gradient (DDPG). Moreover, we specialize adaptive policies and steering-angle reward designs for different control signals (i.e., follow, straight, turn right, turn left) on top of shared representations to improve the model's ability to handle diverse cases. Extensive experiments on the CARLA driving benchmark demonstrate that CIRL substantially outperforms all previous methods in terms of the percentage of successfully completed episodes on a variety of goal-directed driving tasks. We also show its superior generalization capability in unseen environments. To our knowledge, this is the first case of a driving policy learned through reinforcement learning in a high-fidelity simulator that performs better than supervised imitation learning.
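A sketch of the constrained-exploration idea: the actor head matching the high-level command proposes an action, exploration noise is added, and the result is clipped to a band around an imitation (demonstration) prior. The band width, toy actor heads, and prior below are illustrative assumptions rather than CIRL's actual networks.

import numpy as np

rng = np.random.default_rng(0)
COMMANDS = ("follow", "straight", "turn_left", "turn_right")

def select_action(actor_heads, imitation_prior, obs, command, noise_std=0.1, band=0.2):
    # Command-conditioned actor output + exploration noise, kept close to the
    # action suggested by the imitation prior for that command.
    a = actor_heads[command](obs) + rng.normal(0.0, noise_std, size=2)
    prior = imitation_prior(obs, command)
    return np.clip(a, prior - band, prior + band)

# Toy actors and prior; an action is (steering, throttle).
actor_heads = {c: (lambda obs, i=i: np.array([0.1 * i, 0.5])) for i, c in enumerate(COMMANDS)}
imitation_prior = lambda obs, c: np.array([0.0, 0.4])
print(select_action(actor_heads, imitation_prior, obs=np.zeros(3), command="turn_left"))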
This paper introduces a novel neural network-based reinforcement learning approach for robot gaze control. Our approach enables a robot to learn and adapt its gaze control strategy for human-robot interaction without the use of external sensors or human supervision. The robot learns to focus its attention on groups of people from its own audio-visual experiences, independently of the number of people, their positions, and their physical appearances. In particular, we use a recurrent neural network architecture in combination with Q-learning to find an optimal action-selection policy; we pre-train the network using a simulated environment that mimics realistic scenarios involving speaking and silent participants, thus avoiding the need for tedious sessions of a robot interacting with people. Our experimental evaluation suggests that the proposed method is robust with respect to parameter estimation, i.e., the parameter values yielded by the method do not have a decisive impact on the performance. The best results are obtained when audio and visual information are used jointly. Experiments with the Nao robot indicate that our framework is a step forward towards the autonomous learning of socially acceptable gaze behavior.
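A sketch of a recurrent Q-network for gaze-action selection, written in PyTorch; the feature dimensions, GRU size, number of gaze actions, and epsilon value are illustrative assumptions, not the paper's configuration.

import torch
import torch.nn as nn

class GazeQNetwork(nn.Module):
    # A GRU summarizes the history of audio-visual observations and a linear
    # head outputs one Q-value per discrete gaze action.
    def __init__(self, obs_dim=32, hidden_dim=64, n_actions=4):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.q_head = nn.Linear(hidden_dim, n_actions)

    def forward(self, obs_seq, hidden=None):
        out, hidden = self.gru(obs_seq, hidden)
        return self.q_head(out[:, -1]), hidden

net = GazeQNetwork()
obs_seq = torch.randn(1, 10, 32)                  # ten frames of audio-visual features
q_values, _ = net(obs_seq)
epsilon = 0.1
if torch.rand(1).item() < epsilon:
    action = torch.randint(0, 4, (1,)).item()     # explore
else:
    action = q_values.argmax(dim=-1).item()       # exploit
print("gaze action:", action)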
In this work, we present a hybrid learning method for training task-oriented dialogue systems through online user interactions. Popular methods for learning task-oriented dialogues apply reinforcement learning with user feedback on top of supervised pre-trained models. The efficiency of such methods may suffer from the mismatch in dialogue state distribution between the offline training and online interactive learning stages. To address this challenge, we propose a hybrid imitation and reinforcement learning method with which a dialogue agent can learn effectively from its interactions with users through human teaching and feedback. We design a neural network-based task-oriented dialogue agent that can be optimized end-to-end with the proposed learning method. Experimental results show that our end-to-end dialogue agent can learn effectively from the mistakes it makes, via imitation learning from user teaching. Applying reinforcement learning with user feedback after the imitation learning stage further improves the agent's ability to successfully complete a task.
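A toy sketch of how the two kinds of updates can be interleaved on a small softmax policy: a supervised (imitation) step when the user teaches the correct action, and a policy-gradient step weighted by the user's feedback reward. The state encoding, action set, and learning rates are assumptions; the actual agent is a neural network trained end-to-end.

import numpy as np

rng = np.random.default_rng(0)
N_STATE, N_ACTIONS = 8, 4
W = np.zeros((N_STATE, N_ACTIONS))                # toy linear softmax dialogue policy

def policy(state_vec):
    logits = state_vec @ W
    p = np.exp(logits - logits.max())
    return p / p.sum()

def imitation_update(state_vec, taught_action, lr=0.1):
    # Supervised (cross-entropy) step toward the action the user taught.
    grad = -policy(state_vec)
    grad[taught_action] += 1.0
    W[:] += lr * np.outer(state_vec, grad)

def reinforce_update(state_vec, action, reward, lr=0.05):
    # Policy-gradient step weighted by the user's end-of-dialogue feedback.
    grad = -policy(state_vec)
    grad[action] += 1.0
    W[:] += lr * reward * np.outer(state_vec, grad)

state = rng.normal(size=N_STATE)
imitation_update(state, taught_action=2)                   # learn from user teaching
action = rng.choice(N_ACTIONS, p=policy(state))
reinforce_update(state, action, reward=+1.0)               # learn from user feedback
print(policy(state))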
Many recommendation algorithms rely on user data to generate recommendations. However, these recommendations also affect the data obtained from future users. This work aims to understand the effects of this dynamic interaction. We propose a simple model where users with heterogeneous preferences arrive over time. Based on this model, we prove that naive estimators, i.e. those which ignore this feedback loop, are not consistent. We show that consistent estimators are efficient in the presence of myopic agents. Our results are validated using extensive simulations.
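A toy simulation (not the paper's model) of the mechanism behind such inconsistency: when users only accept recommendations they expect to like, each item's observed ratings over-represent the users who already prefer it, so naive per-item averages drift away from the population means.

import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([[0.8, 0.2],    # type-0 users' mean ratings of items 0 and 1
                       [0.3, 0.9]])   # type-1 users
sums, counts = np.array([0.5, 0.5]), np.ones(2)
for _ in range(5000):
    t = rng.integers(2)                              # a user with a hidden type arrives
    rec = int(np.argmax(sums / counts))              # recommend the current "best" item
    # The user follows the recommendation only if they expect to like it.
    item = rec if true_means[t, rec] >= 0.5 else int(np.argmax(true_means[t]))
    sums[item] += rng.normal(true_means[t, item], 0.1)
    counts[item] += 1
print("naive estimates :", np.round(sums / counts, 2))   # biased toward each item's fans
print("population means:", true_means.mean(axis=0))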
Discrete correlation filter (DCF) based trackers have shown considerable success in visual object tracking. These trackers often rely on low- to mid-level features such as histograms of oriented gradients (HoG) and mid-layer activations from convolutional neural networks (CNNs). We argue that incorporating semantically higher-level information into the tracked features can provide additional robustness in challenging cases such as viewpoint changes. Deep salient object detection is one example of such a high-level feature, as it makes use of semantic information to highlight the important regions of a given scene. In this work, we propose an improvement over DCF-based trackers that combines saliency-based filter responses with those of other features. The combination uses an adaptive weight on the saliency-based filter responses, selected automatically according to the temporal consistency of the visual saliency. We show that our method consistently improves a baseline DCF-based tracker, especially in challenging cases, and outperforms the state of the art. Our improved tracker operates at 9.3 fps, introducing only a small computational overhead over the baseline, which operates at 11 fps.
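A sketch of the adaptive fusion step: the saliency-based filter response is blended with the baseline response using a weight derived from the temporal consistency of consecutive saliency maps. The specific consistency measure (normalized correlation) and weighting rule here are assumptions, not necessarily the ones used in the paper.

import numpy as np

def combined_response(base_resp, sal_resp, prev_saliency, saliency):
    # Weight the saliency-based response by how consistent the saliency map is
    # with the previous frame's map; fall back to the baseline when it is not.
    a = prev_saliency.ravel() - prev_saliency.mean()
    b = saliency.ravel() - saliency.mean()
    consistency = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
    w = np.clip(consistency, 0.0, 1.0)
    return (1.0 - w) * base_resp + w * sal_resp

rng = np.random.default_rng(0)
base, sal = rng.random((50, 50)), rng.random((50, 50))
prev_map, cur_map = rng.random((50, 50)), rng.random((50, 50))
fused = combined_response(base, sal, prev_map, cur_map)
print("peak location:", np.unravel_index(fused.argmax(), fused.shape))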