We propose a hybrid combination of active inference and behavior trees (BTs) for reactive action planning and execution in dynamic environments, showing how robotic tasks can be formulated as a free-energy minimization problem. The proposed approach handles partially observable initial states and improves the robustness of classical BTs against unexpected contingencies while also reducing the number of nodes in a tree. In this work, we specify the nominal behavior offline, through BTs. However, in contrast to previous approaches, we introduce a new type of leaf node that specifies the desired state to be achieved rather than an action to execute. The decision of which action to execute to reach the desired state is made online through active inference. This results in continual online planning and hierarchical deliberation. By doing so, an agent can follow a predefined offline plan while retaining the ability to adapt locally and make autonomous decisions at runtime, respecting safety constraints. We provide a proof of convergence and a robustness analysis, and we validate our method on two different mobile manipulators performing similar tasks, in both a simulated and a real retail environment. The results show improved runtime adaptability with a fraction of the hand-coded nodes compared to classical BTs.
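
The idea of a leaf node that stores a desired state, with the action chosen at tick time, can be illustrated with a minimal sketch. This is not the authors' implementation: a simple backward-chaining lookup over action preconditions stands in for active inference (no free energy is computed), effects are assumed deterministic, and every name here (`Action`, `DesiredStateLeaf`, `select_action`, the retail-style actions) is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

SUCCESS, RUNNING, FAILURE = "SUCCESS", "RUNNING", "FAILURE"

@dataclass
class Action:
    name: str
    preconditions: Dict[str, bool]
    effects: Dict[str, bool]

def select_action(goal, world, actions, depth=3) -> Optional[Action]:
    """Return an applicable action that makes progress toward `goal`, or None.

    Stand-in for online action selection: if the action that achieves the
    goal is not applicable, its unmet preconditions become a sub-goal.
    """
    if depth == 0:
        return None
    unmet = {k: v for k, v in goal.items() if world.get(k) != v}
    for act in actions:
        if not any(act.effects.get(k) == v for k, v in unmet.items()):
            continue
        missing = {k: v for k, v in act.preconditions.items() if world.get(k) != v}
        if not missing:
            return act
        sub = select_action(missing, world, actions, depth - 1)
        if sub is not None:
            return sub
    return None

@dataclass
class DesiredStateLeaf:
    """Leaf that names a state to reach instead of an action to run."""
    goal: Dict[str, bool]
    actions: List[Action]

    def tick(self, world):
        if all(world.get(k) == v for k, v in self.goal.items()):
            return SUCCESS
        act = select_action(self.goal, world, self.actions)
        if act is None:
            return FAILURE
        world.update(act.effects)  # assumption: effects apply deterministically
        return RUNNING

@dataclass
class Sequence:
    children: list

    def tick(self, world):
        for child in self.children:
            status = child.tick(world)
            if status != SUCCESS:
                return status
        return SUCCESS

# Hypothetical action set for a pick-and-place task.
ACTIONS = [
    Action("move_to_shelf", {}, {"at_shelf": True}),
    Action("pick", {"at_shelf": True}, {"holding": True}),
    Action("place", {"holding": True}, {"placed": True, "holding": False}),
    Action("return_to_base", {}, {"at_base": True, "at_shelf": False}),
]

def make_tree():
    return Sequence([
        DesiredStateLeaf({"placed": True}, ACTIONS),
        DesiredStateLeaf({"at_base": True}, ACTIONS),
    ])

def run(tree, world, max_ticks=10):
    """Tick the tree until it stops returning RUNNING; return the status trace."""
    trace = []
    for _ in range(max_ticks):
        status = tree.tick(world)
        trace.append(status)
        if status != RUNNING:
            break
    return trace
```

The tree has only two leaves, yet the same tree finishes in fewer ticks when the world already satisfies part of the plan (for example, when the object is already held), because the action is chosen from the current state at each tick rather than being fixed in the tree.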

Related content

Episode · Agent · INTERACT · Learning · MoDELS

January 27, 2023

Tree-structured Policy Planning with Learned Behavior Models

Yuxiao Chen,Peter Karkus,Boris Ivanovic,Xinshuo Weng,Marco Pavone

Autonomous vehicles (AVs) need to reason about the multimodal behavior of neighboring agents while planning their own motion. Many existing trajectory planners seek a single trajectory that performs well under all plausible futures simultaneously, ignoring bi-directional interactions and thus leading to overly conservative plans. Policy planning, whereby the ego agent plans a policy that reacts to the environment's multimodal behavior, is a promising direction as it can account for the action-reaction interactions between the AV and the environment. However, most existing policy planners do not scale to the complexity of real autonomous vehicle applications: they are either not compatible with modern deep learning prediction models, not interpretable, or not able to generate high-quality trajectories. To fill this gap, we propose Tree Policy Planning (TPP), a policy planner that is compatible with state-of-the-art deep learning prediction models, generates multistage motion plans, and accounts for the influence of the ego agent on the environment's behavior. The key idea of TPP is to reduce the continuous optimization problem to a tractable discrete MDP through the construction of two tree structures: an ego trajectory tree for ego trajectory options, and a scenario tree for multi-modal ego-conditioned environment predictions. We demonstrate the efficacy of TPP in closed-loop simulations based on the real-world nuScenes dataset, and the results show that TPP scales to realistic AV scenarios and significantly outperforms non-policy baselines.
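
The contrast between a reactive policy and a single trajectory committed to under all futures can be shown on a toy two-stage tree. This is only a sketch under strong assumptions, not the TPP algorithm: there is one branching stage, the costs and mode probabilities are made up, and `EgoOption`, `policy_value`, and `open_loop_value` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class EgoOption:
    """A first-stage ego trajectory with ego-conditioned scenario branches."""
    name: str
    cost: float
    mode_probs: Dict[str, float]            # P(environment mode | this ego option)
    followup_cost: Dict[str, List[float]]   # mode -> cost of each follow-up option

def policy_value(opt: EgoOption) -> float:
    """Reactive plan: pick the best follow-up separately for each mode."""
    return opt.cost + sum(
        p * min(opt.followup_cost[mode]) for mode, p in opt.mode_probs.items()
    )

def open_loop_value(opt: EgoOption) -> float:
    """Non-policy plan: commit to one follow-up that must serve every mode."""
    n = len(next(iter(opt.followup_cost.values())))
    return opt.cost + min(
        sum(p * opt.followup_cost[mode][j] for mode, p in opt.mode_probs.items())
        for j in range(n)
    )

# Made-up numbers: follow-up 0 = "merge", follow-up 1 = "wait".
OPTIONS = [
    EgoOption("nudge", 1.0, {"yield": 0.7, "assert": 0.3},
              {"yield": [0.0, 2.0], "assert": [10.0, 3.0]}),
    EgoOption("keep_lane", 2.0, {"yield": 0.2, "assert": 0.8},
              {"yield": [0.5, 0.5], "assert": [0.5, 0.5]}),
]

best_policy = min(OPTIONS, key=policy_value)
best_open_loop = min(OPTIONS, key=open_loop_value)
```

With these numbers the reactive planner prefers "nudge" because it can merge if the other agent yields and wait otherwise, while the open-loop planner falls back to the more conservative "keep_lane". The reactive value can never exceed the open-loop value for the same option, since an expectation of minima is at most the minimum of expectations.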

Learning · Operation · Episode · Continuity · MoDELS

January 26, 2023

Overcoming the Pitfalls of Prediction Error in Operator Learning for Bilevel Planning

Nishanth Kumar,Willie McClinton,Rohan Chitnis,Tom Silver,Tomás Lozano-Pérez,Leslie Pack Kaelbling

Bilevel planning, in which a high-level search over an abstraction of an environment is used to guide low-level decision-making, is an effective approach to solving long-horizon tasks in continuous state and action spaces. Recent work has shown how to enable such bilevel planning by learning action and transition model abstractions in the form of symbolic operators and neural samplers. In this work, we show that existing symbolic operator learning approaches fall short in many natural environments where agent actions tend to cause a large number of irrelevant propositions to change. This is primarily because they attempt to learn operators that optimize the prediction error with respect to observed changes in the propositions. To overcome this issue, we propose to learn operators that only model changes necessary for abstract planning to achieve the specified goal. Experimentally, we show that our approach learns operators that lead to efficient planning across 10 different hybrid robotics domains, including 4 from the challenging BEHAVIOR-100 benchmark, with generalization to novel initial states, goals, and objects.

Networking · Functional · Knowledge · Robot · Representation

January 26, 2023

Long-Horizon Task Planning and Execution with Functional Object-Oriented Networks

David Paulius,Alejandro Agostini,Dongheui Lee

from arxiv, Preliminary Draft, 8 pages, IEEE Conference Format

Following work on joint object-action representation, functional object-oriented networks (FOON) were introduced as a knowledge representation for robots. A FOON contains symbolic (high-level) concepts useful to a robot's understanding of tasks and its environment for object-level planning. Prior to this work, little has been done to show how plans acquired from FOON can be executed by a robot, as the concepts in a FOON are too abstract for immediate execution. We propose a hierarchical task planning approach that translates a FOON graph into a PDDL-based representation of domain knowledge for task planning and execution. As a result of this process, a task plan can be acquired, which can be executed by a robot from start to end, leveraging the use of action contexts and skills as dynamic movement primitives (DMPs). We demonstrate the entire pipeline from planning to execution using CoppeliaSim and show how learned action contexts can be extended to never-before-seen scenarios.

Rank · Learning · Linear · Online · Bandits

January 25, 2023

Overcoming Prior Misspecification in Online Learning to Rank

Javad Azizi,Ofer Meshi,Masrour Zoghi,Maryam Karimzadehgan

The recent literature on online learning to rank (LTR) has established the utility of prior knowledge to Bayesian ranking bandit algorithms. However, a major limitation of existing work is the requirement for the prior used by the algorithm to match the true prior. In this paper, we propose and analyze adaptive algorithms that address this issue and additionally extend these results to the linear and generalized linear models. We also consider scalar relevance feedback on top of click feedback. Moreover, we demonstrate the efficacy of our algorithms using both synthetic and real-world experiments.

Performer · Learning · INFORMS · Edge · Maximum

January 25, 2023

Two-Stage Coded Federated Edge Learning: A Dynamic Partial Gradient Coding Perspective

Xinghan Wang,Xiaoxiong Zhong,Jiahong Ning,Hangfan Li,Tingting Yang,Yuanyuan Yang

from arxiv, improve some simulations

Federated edge learning (FEL) can train a global model from the local datasets of terminal nodes, making full use of their computing resources and enabling more extensive and efficient machine learning on terminal nodes while meeting requirements for protecting user information. The performance of FEL suffers from long delays or faulty decisions when the master collects partial gradients from stragglers that cannot return correct results within a deadline. Motivated by this, we propose a novel coded FEL scheme that mitigates stragglers in synchronous gradient computation using a two-stage dynamic scheme: we start with a subset of the workers for a given duration before starting the second stage, and once that duration has elapsed in the first stage, we start the remaining workers in the second stage. In particular, the computation latency and transmission latency are essential and should be quantitatively analyzed. A dynamically coded coefficient scheme is then proposed, based on historical information including worker completion times. For performance optimization of FEL, a Lyapunov function is designed to maximize admission data balancing fairness, and a two-stage dynamic coding scheme is designed to maximize arrival data among workers. Experimental evidence verifies the derived properties and demonstrates that the proposed solution achieves better accuracy and resource utilization in the FEL system for practical network parameters and benchmark datasets.
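
For readers unfamiliar with gradient coding, the following sketch shows the basic static idea that coded schemes build on: three workers each send one coded combination of two partial gradients, and the master recovers the full gradient sum from any two of them, so one straggler can be ignored. It does not implement the two-stage dynamic scheme described above; the encoding matrix `B`, the decoding table `DECODE`, and the function names are illustrative assumptions.

```python
from itertools import combinations

# Row i holds the coefficients worker i applies to partial gradients (g1, g2, g3).
# Each worker touches only two of the three data partitions.
B = [
    [0.5, 1.0, 0.0],   # worker 0 sends g1/2 + g2
    [0.0, 1.0, -1.0],  # worker 1 sends g2 - g3
    [0.5, 0.0, 1.0],   # worker 2 sends g1/2 + g3
]

# Decoding weights for each pair of workers that finished in time.
DECODE = {(0, 1): (2.0, -1.0), (0, 2): (1.0, 1.0), (1, 2): (1.0, 2.0)}

def encode(worker, partials):
    """Coded vector sent by `worker`, given the three partial gradients."""
    dim = len(partials[0])
    return [sum(B[worker][k] * partials[k][d] for k in range(3)) for d in range(dim)]

def decode(received):
    """Recover g1 + g2 + g3 from the coded vectors of exactly two workers."""
    i, j = sorted(received)
    a, b = DECODE[(i, j)]
    return [a * x + b * y for x, y in zip(received[i], received[j])]

def full_gradient(partials):
    """Reference result: the plain sum of all partial gradients."""
    return [sum(col) for col in zip(*partials)]

# Toy partial gradients (made-up values).
partials = [[2.0, 4.0], [1.0, -3.0], [0.5, 6.0]]
recovered = {
    pair: decode({w: encode(w, partials) for w in pair})
    for pair in combinations(range(3), 2)
}
```

Every entry of `recovered` equals `full_gradient(partials)`, whichever single worker is dropped.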

Inverse reinforcement learning · Learning · Functional · Robot · Integration

January 25, 2023

An Incremental Inverse Reinforcement Learning Approach for Motion Planning with Human Preferences

Armin Avaei,Linda van der Spaa,Luka Peternel,Jens Kober

from arxiv, 13 pages, 12 figures, 2 tables, associated video: //youtu.be/hhL5-Lpzj4M

Humans often demonstrate diverse behaviors due to their personal preferences, for instance related to their individual execution style or personal margin for safety. In this paper, we consider the problem of integrating such preferences into trajectory planning for robotic manipulators. We first learn reward functions that represent the user's path and motion preferences from kinesthetic demonstrations. We then use a discrete-time trajectory optimization scheme to produce trajectories that adhere to both task requirements and user preferences. We go beyond the state of the art by designing a feature set that captures the fundamental preferences in a manipulation task, such as the timing of the motion. We further demonstrate that our method is capable of generalizing such preferences to new scenarios. We implement our algorithm on a Franka Emika 7-DoF robot arm, and validate the functionality and flexibility of our approach in a user study. The results show that non-expert users are able to teach the robot their preferences with just a few iterations of feedback.

January 25, 2023

Robot Subset Selection for Swarm Lifetime Maximization in Computation Offloading with Correlated Data Sources

Siqi Zhang,Na Yi,Yi Ma

from arxiv, 7 pages, 3 figures, ICC 2023

Consider robot swarm wireless networks where mobile robots offload their computing tasks to a computing server located at the mobile edge. Our aim is to maximize the swarm lifetime through efficient exploitation of the correlation between distributed data sources. The optimization problem is handled by selecting appropriate robot subsets to send their sensed data to the server. In this work, the data correlation between distributed robot subsets is modelled as an undirected graph. A least-degree iterative partitioning (LDIP) algorithm is proposed to partition the graph into a set of subgraphs. Each subgraph has at least one vertex (i.e., subset), termed representative vertex (R-Vertex), which shares edges with and only with all other vertices within the subgraph; only R-Vertices are selected for data transmissions. When the number of subgraphs is maximized, the proposed subset selection approach is shown to be optimum in the AWGN channel. For independent fading channels, the max-min principle can be incorporated into the proposed approach to achieve the best performance.

Continuity · INTERACT · MoDELS · AVS · Automator ·

January 25, 2023

Deconstructing Pedestrian Crossing Decision-making in Interactions with Continuous Traffic: an Anthropomorphic Model

Kai Tian,Gustav Markkula,Chongfeng Wei,Yee Mun Lee,Ruth Madigan,Toshiya Hirose,Natasha Merat,Richard Romano

As safe and comfortable interactions with pedestrians could contribute to automated vehicles' (AVs) social acceptance and scale, increasing attention has been drawn to computational pedestrian behavior models. However, very few studies characterize pedestrian crossing behavior based on specific behavioral mechanisms, as the mechanisms underpinning pedestrian road behavior are not yet clear. Here, we reinterpret pedestrian crossing behavior based on a deconstructed crossing decision process at uncontrolled intersections with continuous traffic. Notably, we explain and model pedestrians' crossing behavior as they wait for crossing opportunities, optimizing crossing decisions by comparing the visual collision risk of approaching vehicles around them. A collision risk-based crossing initiation model is proposed to characterize the time-dynamic nature of pedestrian crossing decisions. A simulation tool is established to reproduce pedestrian behavior by employing the proposed model and a social force model. Two datasets collected in a CAVE-based immersive pedestrian simulator are applied to calibrate and validate the model. The model predicts pedestrian crossing decisions well across all traffic scenarios. In particular, by considering the decision strategy in which pedestrians compare the collision risk of surrounding traffic gaps, model performance is significantly improved. Moreover, the collision risk-based crossing initiation model accurately captures the timing of pedestrian crossing initiations within each gap. This work concisely demonstrates how pedestrians dynamically adapt their crossings in continuous traffic based on perceived collision risk, potentially providing insights into modeling coupled human-AV interactions or serving as a tool to realize human-like pedestrian road behavior in virtual AV test platforms.

Episode · Optimizer · Peak · Path · INTERACT

January 25, 2023

Planning-assisted autonomous swarm shepherding with collision avoidance

Jing Liu,Hemant Singh,Saber Elsayed,Robert Hunjet,Hussein Abbass

from arxiv, 12 pages, 3 figures

Robotic shepherding is a bio-inspired approach to autonomously guiding a swarm of agents towards a desired location and has attracted increasing research interest. However, shepherding a highly dispersed swarm in an obstructive environment remains challenging for existing methods. To improve shepherding efficacy in complex environments with obstacles and dispersed sheep, this paper proposes a planning-assisted autonomous shepherding framework with collision avoidance. The proposed approach transforms the swarm shepherding problem into a single Travelling Salesman Problem (TSP), with the sheepdog's moving mode classified into non-interaction and interaction modes. Additionally, an adaptive switching approach is integrated into the framework to guide real-time path planning for avoiding collisions with obstacles and sometimes with the sheep swarm. Then the overarching hierarchical mission planning system is presented, which consists of a grouping approach to obtain sheep sub-swarms, a general TSP solver for determining the optimal push sequence of sub-swarms, and an online path planner for calculating optimal paths for both sheepdogs and sheep. Experiments on a range of environments, both with and without obstacles, quantitatively demonstrate the effectiveness of the proposed shepherding framework and planning approaches.

Learning · Surrogate loss · Online · Bandits · Multi-armed bandits

December 31, 2019

A Modern Introduction to Online Learning

Francesco Orabona

In this monograph, I introduce the basic concepts of Online Learning through a modern view of Online Convex Optimization. Here, online learning refers to the framework of regret minimization under worst-case assumptions. I present first-order and second-order algorithms for online learning with convex losses, in Euclidean and non-Euclidean settings. All the algorithms are clearly presented as instantiations of Online Mirror Descent or Follow-The-Regularized-Leader and their variants. Particular attention is given to the issue of tuning the parameters of the algorithms and learning in unbounded domains, through adaptive and parameter-free online learning algorithms. Non-convex losses are dealt with through convex surrogate losses and through randomization. The bandit setting is also briefly discussed, touching on the problem of adversarial and stochastic multi-armed bandits. These notes do not require prior knowledge of convex analysis, and all the required mathematical tools are rigorously explained. Moreover, all the proofs have been carefully chosen to be as simple and as short as possible.
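
As a concrete instance of regret minimization, the sketch below runs projected online gradient descent, the Euclidean special case of Online Mirror Descent, on squared losses over an interval. It is not taken from the monograph: the loss family, the interval [-1, 1], the step size D / (G sqrt(t)), and the function names are assumptions chosen for illustration.

```python
import math

def project_interval(x, lo, hi):
    """Euclidean projection of a scalar onto [lo, hi]."""
    return max(lo, min(hi, x))

def online_gradient_descent(targets, lo=-1.0, hi=1.0):
    """Play x_t, then suffer f_t(x_t) = (x_t - y_t)^2 and update.

    Assumes every target y_t lies in [lo, hi]. Returns the list of plays
    and the cumulative loss.
    """
    diameter = hi - lo
    grad_bound = 2.0 * diameter  # |f_t'(x)| = 2|x - y_t| <= 2 * diameter
    x = 0.0
    plays, total_loss = [], 0.0
    for t, y in enumerate(targets, start=1):
        plays.append(x)
        total_loss += (x - y) ** 2
        grad = 2.0 * (x - y)
        eta = diameter / (grad_bound * math.sqrt(t))  # decreasing step size
        x = project_interval(x - eta * grad, lo, hi)
    return plays, total_loss

def best_fixed_loss(targets):
    """Loss of the best single point in hindsight (the mean of the targets,
    which lies inside the interval whenever the targets do)."""
    mean = sum(targets) / len(targets)
    return sum((mean - y) ** 2 for y in targets)

# Alternating targets as a simple adversarial-looking sequence.
targets = [0.5, -0.5] * 50
plays, total_loss = online_gradient_descent(targets)
regret = total_loss - best_fixed_loss(targets)
```

With diameter D = 2 and gradient bound G = 4, the standard analysis of this step size bounds the regret after T rounds by (3/2) G D sqrt(T), which is 120 for T = 100, so the average regret per round shrinks as T grows.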