黄片一级在线视频播放,国产成人无码午夜视频在线观看,久久亚洲国产精品日日AV夜夜,九九RE久久这里有精品,自拍偷区亚洲综合照片

This study proposes a hierarchically integrated framework for safe task and motion planning (TAMP) of bipedal locomotion in a partially observable environment with dynamic obstacles and uneven terrain. The high-level task planner employs linear temporal logic (LTL) for a reactive game synthesis between the robot and its environment and provides a formal guarantee on navigation safety and task completion. To address environmental partial observability, a belief abstraction is employed at the high-level navigation planner to estimate the dynamic obstacles' location. Accordingly, a synthesized action planner sends a set of locomotion actions to the middle-level motion planner, while incorporating safe locomotion specifications extracted from safety theorems based on a reduced-order model (ROM) of the locomotion process. The motion planner employs the ROM to design safety criteria and a sampling algorithm to generate non-periodic motion plans that accurately track high-level actions. At the low level, a foot placement controller based on an angular-momentum linear inverted pendulum model is implemented and integrated with an ankle-actuated passivity-based controller for full-body trajectory tracking. To address external perturbations, this study also investigates safe sequential composition of the keyframe locomotion state and achieves robust transitions against external perturbations through reachability analysis. The overall TAMP framework is validated with extensive simulations and hardware experiments on bipedal walking robots Cassie and Digit designed by Agility Robotics.

相關內容

回合

關注 3

Learning · MoDELS · 聯邦學習 · Performer · 情景 ·

2023 年 4 月 28 日

Client Recruitment for Federated Learning in ICU Length of Stay Prediction

Vincent Scheltjens,Lyse Naomi Wamba Momo,Wouter Verbeke,Bart De Moor

Machine and deep learning methods for medical and healthcare applications have shown significant progress and performance improvement in recent years. These methods require vast amounts of training data which are available in the medical sector, albeit decentralized. Medical institutions generate vast amounts of data for which sharing and centralizing remains a challenge as the result of data and privacy regulations. The federated learning technique is well-suited to tackle these challenges. However, federated learning comes with a new set of open problems related to communication overhead, efficient parameter aggregation, client selection strategies and more. In this work, we address the step prior to the initiation of a federated network for model training, client recruitment. By intelligently recruiting clients, communication overhead and overall cost of training can be reduced without sacrificing predictive performance. Client recruitment aims at pre-excluding potential clients from partaking in the federation based on a set of criteria indicative of their eventual contributions to the federation. In this work, we propose a client recruitment approach using only the output distribution and sample size at the client site. We show how a subset of clients can be recruited without sacrificing model performance whilst, at the same time, significantly improving computation time. By applying the recruitment approach to the training of federated models for accurate patient Length of Stay prediction using data from 189 Intensive Care Units, we show how the models trained in federations made up from recruited clients significantly outperform federated models trained with the standard procedure in terms of predictive power and training time.

語言模型化 · MoDELS · Extensibility · INFORMS · AIM ·

2023 年 4 月 27 日

Pretrained Language Models as Visual Planners for Human Assistance

Dhruvesh Patel,Hamid Eghbalzadeh,Nitin Kamra,Michael Louis Iuzzolino,Unnat Jain,Ruta Desai

from arxiv, No change in the main text; added acknowledgements

To make progress towards multi-modal AI assistants which can guide users to achieve complex multi-step goals, we propose the task of Visual Planning for Assistance (VPA). Given a goal briefly described in natural language, e.g., "make a shelf", and a video of the user's progress so far, the aim of VPA is to obtain a plan, i.e., a sequence of actions such as "sand shelf", "paint shelf", etc., to achieve the goal. This requires assessing the user's progress from the untrimmed video, and relating it to the requirements of underlying goal, i.e., relevance of actions and ordering dependencies amongst them. Consequently, this requires handling long video history, and arbitrarily complex action dependencies. To address these challenges, we decompose VPA into video action segmentation and forecasting. We formulate the forecasting step as a multi-modal sequence modeling problem and present Visual Language Model based Planner (VLaMP), which leverages pre-trained LMs as the sequence model. We demonstrate that VLaMP performs significantly better than baselines w.r.t all metrics that evaluate the generated plan. Moreover, through extensive ablations, we also isolate the value of language pre-training, visual observations, and goal information on the performance. We will release our data, model, and code to enable future research on visual planning for assistance.

Learning · 強化學習 · 可辨認的 · Integration · Performance ·

2023 年 4 月 27 日

DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems

Pierre Schumacher,Daniel H?ufle,Dieter Büchler,Syn Schmitt,Georg Martius

Muscle-actuated organisms are capable of learning an unparalleled diversity of dexterous movements despite their vast amount of muscles. Reinforcement learning (RL) on large musculoskeletal models, however, has not been able to show similar performance. We conjecture that ineffective exploration in large overactuated action spaces is a key problem. This is supported by the finding that common exploration noise strategies are inadequate in synthetic examples of overactuated systems. We identify differential extrinsic plasticity (DEP), a method from the domain of self-organization, as being able to induce state-space covering exploration within seconds of interaction. By integrating DEP into RL, we achieve fast learning of reaching and locomotion in musculoskeletal systems, outperforming current approaches in all considered tasks in sample efficiency and robustness.

控制器 · INTERACT · 穩健性 · PID · Performer ·

2023 年 4 月 27 日

Physical Human-Robot Interaction Control of an Upper Limb Exoskeleton with a Decentralized Neuro-Adaptive Control Scheme

Mahdi Hejrati,Jouni Mattila

Within the concept of physical human-robot interaction (pHRI), the most important criterion is the safety of the human operator interacting with a high degree of freedom (DoF) robot. Therefore, a robust control scheme is in high demand to establish safe pHRI and stabilize nonlinear, high DoF systems. In this paper, an adaptive decentralized control strategy is designed to accomplish the abovementioned objectives. To do so, a human upper limb model and an exoskeleton model are decentralized and augmented at the subsystem level to enable a decentralized control action design. Moreover, human exogenous force (HEF) that can resist exoskeleton motion is estimated using radial basis function neural networks (RBFNNs). Estimating both human upper limb and robot rigid body parameters, along with HEF estimation, makes the controller adaptable to different operators, ensuring their physical safety. The barrier Lyapunov function (BLF) is employed to guarantee that the robot can operate in a safe workspace while ensuring stability by adjusting the control law. Unknown actuator uncertainty and constraints are also considered in this study to ensure a smooth and safe pHRI. Then, the asymptotic stability of the whole system is established by means of the virtual stability concept and virtual power flows (VPFs) under the proposed robust controller. The experimental results are presented and compared to proportional-derivative (PD) and proportional-integral-derivative (PID) controllers. To show the robustness of the designed controller and its good performance, experiments are performed at different velocities, with different human users, and in the presence of unknown disturbances. The proposed controller showed perfect performance in controlling the robot, whereas PD and PID controllers could not even ensure stable motion in the wrist joints of the robot.

TEAM · 優化器 · 機器人 · 在線 · AIM ·

2023 年 4 月 27 日

A Distributed Online Optimization Strategy for Cooperative Robotic Surveillance

Lorenzo Pichierri,Guido Carnevale,Lorenzo Sforni,Andrea Testa,Giuseppe Notarstefano

In this paper, we propose a distributed algorithm to control a team of cooperating robots aiming to protect a target from a set of intruders. Specifically, we model the strategy of the defending team by means of an online optimization problem inspired by the emerging distributed aggregative framework. In particular, each defending robot determines its own position depending on (i) the relative position between an associated intruder and the target, (ii) its contribution to the barycenter of the team, and (iii) collisions to avoid with its teammates. We highlight that each agent is only aware of local, noisy measurements about the location of the associated intruder and the target. Thus, in each robot, our algorithm needs to (i) locally reconstruct global unavailable quantities and (ii) predict its current objective functions starting from the local measurements. The effectiveness of the proposed methodology is corroborated by simulations and experiments on a team of cooperating quadrotors.

正交 · 離散化 · 近似 · Performer · 論文 ·

2023 年 4 月 26 日

A two level approach for simulating Bose-Einstein condensates by localized orthogonal decomposition

Christian D?ding,Patrick Henning,Johan W?rneg?rd

In this work, we consider the numerical computation of ground states and dynamics of single-component Bose-Einstein condensates (BECs). The corresponding models are spatially discretized with a multiscale finite element approach known as Localized Orthogonal Decomposition (LOD). Despite the outstanding approximation properties of such a discretization in the context of BECs, taking full advantage of it without creating severe computational bottlenecks can be tricky. In this paper, we therefore present two fully-discrete numerical approaches that are formulated in such a way that they take special account of the structure of the LOD spaces. One approach is devoted to the computation of ground states and another one for the computation of dynamics. A central focus of this paper is also the discussion of implementation aspects that are very important for the practical realization of the methods. In particular, we discuss the use of suitable data structures that keep the memory costs economical. The paper concludes with various numerical experiments in 1d, 2d and 3d that investigate convergence rates and approximation properties of the methods and which demonstrate their performance and computational efficiency, also in comparison to spectral and standard finite element approaches.

Learning · MoDELS · Agent · Performer · 真實值 ·

2023 年 4 月 26 日

Predictive World Models from Real-World Partial Observations

Robin Karlsson,Alexander Carballo,Keisuke Fujii,Kento Ohtani,Kazuya Takeda

from arxiv, Accepted for IEEE MOST 2023

Cognitive scientists believe adaptable intelligent agents like humans perform reasoning through learned causal mental simulations of agents and environments. The problem of learning such simulations is called predictive world modeling. Recently, reinforcement learning (RL) agents leveraging world models have achieved SOTA performance in game environments. However, understanding how to apply the world modeling approach in complex real-world environments relevant to mobile robots remains an open question. In this paper, we present a framework for learning a probabilistic predictive world model for real-world road environments. We implement the model using a hierarchical VAE (HVAE) capable of predicting a diverse set of fully observed plausible worlds from accumulated sensor observations. While prior HVAE methods require complete states as ground truth for learning, we present a novel sequential training method to allow HVAEs to learn to predict complete states from partially observed states only. We experimentally demonstrate accurate spatial structure prediction of deterministic regions achieving 96.21 IoU, and close the gap to perfect prediction by 62% for stochastic regions using the best prediction. By extending HVAEs to cases where complete ground truth states do not exist, we facilitate continual learning of spatial prediction as a step towards realizing explainable and comprehensive predictive world models for real-world mobile robotics applications. Code is available at //github.com/robin-karlsson0/predictive-world-models.

Learning · 不完美信息 · Agent · 強化學習 · Self-Play ·

2022 年 6 月 30 日

Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning

Julien Perolat,Bart de Vylder,Daniel Hennes,Eugene Tarassov,Florian Strub,Vincent de Boer,Paul Muller,Jerome T. Connor,Neil Burch,Thomas Anthony,Stephen McAleer,Romuald Elie,Sarah H. Cen,Zhe Wang,Audrunas Gruslys,Aleksandra Malysheva,Mina Khan,Sherjil Ozair,Finbarr Timbers,Toby Pohlen,Tom Eccles,Mark Rowland,Marc Lanctot,Jean-Baptiste Lespiau,Bilal Piot,Shayegan Omidshafiei,Edward Lockhart,Laurent Sifre,Nathalie Beauguerlange,Remi Munos,David Silver,Satinder Singh,Demis Hassabis,Karl Tuyls

We introduce DeepNash, an autonomous agent capable of learning to play the imperfect information game Stratego from scratch, up to a human expert level. Stratego is one of the few iconic board games that Artificial Intelligence (AI) has not yet mastered. This popular game has an enormous game tree on the order of $10^{535}$ nodes, i.e., $10^{175}$ times larger than that of Go. It has the additional complexity of requiring decision-making under imperfect information, similar to Texas hold'em poker, which has a significantly smaller game tree (on the order of $10^{164}$ nodes). Decisions in Stratego are made over a large number of discrete actions with no obvious link between action and outcome. Episodes are long, with often hundreds of moves before a player wins, and situations in Stratego can not easily be broken down into manageably-sized sub-problems as in poker. For these reasons, Stratego has been a grand challenge for the field of AI for decades, and existing AI methods barely reach an amateur level of play. DeepNash uses a game-theoretic, model-free deep reinforcement learning method, without search, that learns to master Stratego via self-play. The Regularised Nash Dynamics (R-NaD) algorithm, a key component of DeepNash, converges to an approximate Nash equilibrium, instead of 'cycling' around it, by directly modifying the underlying multi-agent learning dynamics. DeepNash beats existing state-of-the-art AI methods in Stratego and achieved a yearly (2022) and all-time top-3 rank on the Gravon games platform, competing with human expert players.

優化器 · INTERACT · Networking · 知識 (knowledge) · Performer ·

2022 年 5 月 11 日

Dynamic neighbourhood optimisation for task allocation using multi-agent

Niall Creech,Natalia Criado Pacheco,Simon Miles

from arxiv, 28 pages

In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation. The number of interactions is limited by resource constraints such as on computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communications and synchronisation, and is difficult to scale. In this paper we present four algorithms to solve these problems. The combination of these algorithms enable each agent to improve their task allocation strategy through reinforcement learning, while changing how much they explore the system in response to how optimal they believe their current strategy is, given their past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource usage limits, limiting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, to then carry out those tasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task allocation problem to 6.7% of the theoretical optimal within the system configurations considered. It provides 5x better performance recovery over no-knowledge retention approaches when system connectivity is impacted, and is tested against systems up to 100 agents with less than a 9% impact on the algorithms' performance.

可辨認的 · Extensibility · TEAM · 估計/估計量 · 納什均衡 ·

2021 年 9 月 15 日

Decentralized and Communication-Free Multi-Robot Navigation through Distributed Games

Brian Reily,Terran Mott,Hao Zhang

Effective multi-robot teams require the ability to move to goals in complex environments in order to address real-world applications such as search and rescue. Multi-robot teams should be able to operate in a completely decentralized manner, with individual robot team members being capable of acting without explicit communication between neighbors. In this paper, we propose a novel game theoretic model that enables decentralized and communication-free navigation to a goal position. Robots each play their own distributed game by estimating the behavior of their local teammates in order to identify behaviors that move them in the direction of the goal, while also avoiding obstacles and maintaining team cohesion without collisions. We prove theoretically that generated actions approach a Nash equilibrium, which also corresponds to an optimal strategy identified for each robot. We show through extensive simulations that our approach enables decentralized and communication-free navigation by a multi-robot system to a goal position, and is able to avoid obstacles and collisions, maintain connectivity, and respond robustly to sensor noise.