
In general, optimal motion planning can be performed both locally and globally. The choice between a local and a global planning technique mainly depends on whether the environmental conditions are dynamic or static. Hence, the most adequate choice is to use local planning, either alone or alongside global planning. When designing optimal motion planning, whether local or global, the key metrics to bear in mind are execution time, asymptotic optimality, and quick reaction to dynamic obstacles. Such planning approaches can address these metrics more efficiently than other approaches such as path planning followed by smoothing. Thus, the foremost objective of this study is to analyse the related literature in order to understand how the formulation of the motion planning problem, especially the trajectory planning problem, affects the listed metrics when it is applied to generating optimal trajectories in real time for Multirotor Aerial Vehicles (MAVs). As a result of the research, the trajectory planning problem was broken down into a set of subproblems, and the methods for addressing each subproblem were identified and described in detail. Subsequently, the most prominent results from 2010 to 2022 were summarized and presented in the form of a timeline.

Related content


Networking · Controller · Q-network · Performer · Partitioning

March 22, 2023

Multi-agent Reinforcement Learning for Regional Signal control in Large-scale Grid Traffic network

Hankang Gu,Shangbo Wang

Adaptive traffic signal control with Multi-agent Reinforcement Learning (MARL) is a very popular topic nowadays. In most existing novel methods, each agent controls a single intersection, and these methods focus on the cooperation between intersections. However, the non-stationary property of MARL still limits the performance of these methods as the size of traffic networks grows. One compromise strategy is to assign one agent to a region of intersections to reduce the number of agents. There are two challenges in this strategy: one is how to partition a traffic network into small regions, and the other is how to search for the optimal joint actions for a region of intersections. In this paper, we propose a novel training framework, RegionLight, in which our region partition rule is based on the adjacency between intersections, and we extend the Branching Dueling Q-Network (BDQ) to a Dynamic Branching Dueling Q-Network (DBDQ) to bound the growth of the size of the joint action space and alleviate the bias introduced by imaginary intersections outside the boundary of the traffic network. Our experiments on both real and synthetic datasets demonstrate that our framework performs best among other novel frameworks and that our region partition rule is robust.
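
The growth of the joint action space mentioned in this abstract can be illustrated with simple arithmetic. The sketch below is not taken from the paper: the function names and the choice of 4 signal phases per intersection are illustrative assumptions. It relies only on the general property that a flat Q-network needs one output per joint action, while a branching architecture needs one group of outputs per action dimension.

```python
def joint_action_count(num_intersections: int, phases_per_intersection: int) -> int:
    """Outputs needed by a flat Q-network: one per joint action of the region."""
    return phases_per_intersection ** num_intersections


def branching_output_count(num_intersections: int, phases_per_intersection: int) -> int:
    """Outputs needed by a branching Q-network: one branch per intersection."""
    return phases_per_intersection * num_intersections


if __name__ == "__main__":
    # Hypothetical region sizes; 4 phases per intersection is an assumption.
    for region_size in (1, 4, 9):
        flat = joint_action_count(region_size, 4)
        branched = branching_output_count(region_size, 4)
        print(f"{region_size} intersections: flat={flat}, branching={branched}")
```

The flat count grows exponentially with the region size, whereas the branching count grows linearly, which is why a branching design keeps larger regions tractable.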

End-to-end · state-of-the-art · Precision/accuracy · Directed · Attention

March 21, 2023

Motion Planning for Autonomous Driving: The State of the Art and Perspectives

Siyu Teng,Xuemin Hu,Peng Deng,Bai Li,Yuchen Li,Zhe Xuanyuan,Dongsheng Yang,Yunfeng Ai,Lingxi Li,Long Chen,Fenghua Zhu

from arxiv, 20 pages, 14 figures and 5 tables

Thanks to their augmented convenience, safety advantages, and potential commercial value, intelligent vehicles (IVs) have attracted wide attention throughout the world. Although a few autonomous driving unicorns assert that IVs will be commercially deployable by 2025, their implementation is still restricted to small-scale validation due to various issues, among which precise computation of control commands or trajectories by planning methods remains a prerequisite for IVs. This paper aims to review state-of-the-art planning methods, including pipeline planning and end-to-end planning methods. In terms of pipeline methods, a survey of selecting algorithms is provided along with a discussion of the expansion and optimization mechanisms, whereas in end-to-end methods, the training approaches and verification scenarios of driving tasks are points of concern. Experimental platforms are reviewed to help readers select suitable training and validation methods. Finally, the current challenges and future directions are discussed. The side-by-side comparison presented in this survey not only helps to gain insights into the strengths and limitations of the reviewed methods but also assists with system-level design choices.

Learning · Agent · SimPLe · Boosting (a technique for accelerating model training) · Performance

March 20, 2023

Imitating Graph-Based Planning with Goal-Conditioned Policies

Junsu Kim,Younggyo Seo,Sungsoo Ahn,Kyunghwan Son,Jinwoo Shin

from arxiv, Accepted to ICLR 2023

Recently, graph-based planning algorithms have gained much attention for solving goal-conditioned reinforcement learning (RL) tasks: they provide a sequence of subgoals to reach the target-goal, and the agents learn to execute subgoal-conditioned policies. However, the sample-efficiency of such RL schemes remains a challenge, particularly for long-horizon tasks. To address this issue, we present a simple yet effective self-imitation scheme which distills a subgoal-conditioned policy into the target-goal-conditioned policy. Our intuition here is that to reach a target-goal, an agent should pass through a subgoal, so target-goal-conditioned and subgoal-conditioned policies should be similar to each other. We also propose a novel scheme of stochastically skipping executed subgoals in a planned path, which further improves performance. Unlike prior methods that only utilize graph-based planning in an execution phase, our method transfers knowledge from a planner along with a graph into policy learning. We empirically show that our method can significantly boost the sample-efficiency of the existing goal-conditioned RL methods under various long-horizon control tasks.

Estimation/estimator · Sphering · Regularization term · Approximation · Analysis

March 20, 2023

On pointwise error estimates for Voronoï-based finite volume methods for the Poisson equation on the sphere

Leonardo A. Poveda,Pedro Peixoto

In this paper, we give pointwise estimates of a Voronoï-based finite volume approximation of the Laplace-Beltrami operator on Voronoï-Delaunay decompositions of the sphere. These estimates are the basis for a local error analysis, in the maximum norm, of the approximate solution of the Poisson equation and its gradient. Here, we consider the Voronoï-based finite volume method as a perturbation of the finite element method. Finally, using regularized Green's functions, we derive quasi-optimal convergence order in the maximum norm with minimal regularity requirements. Numerical examples show that the convergence is at least as good as predicted.

Automator · Learning · Performer · Round · Federated learning

March 19, 2023

A Survey of Federated Learning for Connected and Automated Vehicles

Vishnu Pandi Chellapandi,Liangqi Yuan,Stanislaw H. Żak,Ziran Wang

from arxiv, 8 pages, 1 figure

Connected and Automated Vehicles (CAVs) are one of the emerging technologies in the automotive domain with the potential to alleviate the issues of accidents, traffic congestion, and pollutant emissions, leading to a safe, efficient, and sustainable transportation system. Machine learning-based methods are widely used in CAVs for crucial tasks like perception, motion planning, and motion control, where the models are trained solely on local vehicle data and their performance is not certain when exposed to new environments or unseen conditions. Federated learning (FL) is an effective solution for CAVs that enables collaborative model development with multiple vehicles in a distributed learning framework. FL enables CAVs to learn from a wide range of driving environments and improve their overall performance while ensuring the privacy and security of local vehicle data. In this paper, we review the progress accomplished by researchers in applying FL to CAVs. A broader view of the various data modalities and algorithms that have been implemented on CAVs is provided. Specific applications of FL are reviewed in detail, and an analysis of the challenges and future scope of research is presented.
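
As a rough illustration of how several vehicles can contribute to one model without sharing raw data, the sketch below shows a server-side aggregation step. The abstract does not name a specific aggregation rule; sample-weighted federated averaging is assumed here as a common example, and the function name, the flat parameter lists, and the sample counts are all hypothetical.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample counts.

    Only model parameters are passed in; the local training data of each
    client (vehicle) never leaves the client.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, value in enumerate(weights):
            averaged[i] += value * size / total
    return averaged


if __name__ == "__main__":
    # Two hypothetical vehicles with 100 and 300 local samples.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [100, 300]))
```

In a full training loop the server would send the averaged parameters back to the vehicles for the next round of local training; that loop is omitted here.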

Guidance · Agent · Learning · Model evaluation · Deep reinforcement learning

March 17, 2023

Robust Adversarial Attacks Detection based on Explainable Deep Reinforcement Learning For UAV Guidance and Planning

Thomas Hickling,Nabil Aouf,Phillippa Spencer

from arxiv, 13 pages, 18 figures

The dangers of adversarial attacks on Uncrewed Aerial Vehicle (UAV) agents operating in public are increasing. Adopting AI-based techniques and, more specifically, Deep Learning (DL) approaches to control and guide these UAVs can be beneficial in terms of performance but can add concerns regarding the safety of those techniques and their vulnerability to adversarial attacks. Confusion in the agent's decision-making process caused by these attacks can seriously affect the safety of the UAV. This paper proposes an innovative approach based on the explainability of DL methods to build an efficient detector that will protect these DL schemes and the UAVs adopting them from attacks. The agent adopts a Deep Reinforcement Learning (DRL) scheme for guidance and planning. The agent is trained with a Deep Deterministic Policy Gradient (DDPG) with Prioritised Experience Replay (PER) DRL scheme that utilises Artificial Potential Field (APF) to improve training times and obstacle avoidance performance. A simulated environment for UAV explainable DRL-based planning and guidance, including obstacles and adversarial attacks, is built. The adversarial attacks are generated by the Basic Iterative Method (BIM) algorithm and reduce obstacle course completion rates from 97% to 35%. Two adversarial attack detectors are proposed to counter this reduction. The first one is a Convolutional Neural Network Adversarial Detector (CNN-AD), which achieves a detection accuracy of 80%. The second detector utilises a Long Short Term Memory (LSTM) network. It achieves an accuracy of 91% with faster computing times compared to the CNN-AD, allowing for real-time adversarial detection.
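
The Basic Iterative Method named in this abstract repeatedly nudges an input in the direction of the sign of the loss gradient and clips the result so that it stays within a small distance of the original input. The sketch below shows that loop on a toy logistic-regression model, which is an assumption standing in for the paper's DRL agent; the function names, weights, step size, and bound are illustrative only, and clipping to a valid input range (as is usual for images) is omitted.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def loss_gradient(x: Sequence[float], w: Sequence[float], b: float, y: int) -> List[float]:
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = predict(x, w, b)
    return [(p - y) * wi for wi in w]


def sign(value: float) -> int:
    return (value > 0) - (value < 0)


def basic_iterative_method(x: Sequence[float], w: Sequence[float], b: float, y: int,
                           epsilon: float, alpha: float, steps: int) -> List[float]:
    """Iterated signed-gradient ascent on the loss, clipped to an L-infinity ball."""
    adversarial = list(x)
    for _ in range(steps):
        grad = loss_gradient(adversarial, w, b, y)
        # Step in the direction that increases the loss for the true label y.
        adversarial = [a + alpha * sign(g) for a, g in zip(adversarial, grad)]
        # Keep every coordinate within epsilon of the original input.
        adversarial = [min(max(a, xi - epsilon), xi + epsilon)
                       for a, xi in zip(adversarial, x)]
    return adversarial


if __name__ == "__main__":
    x, w, b, y = [1.0, -1.0], [2.0, -1.0], 0.0, 1
    x_adv = basic_iterative_method(x, w, b, y, epsilon=0.25, alpha=0.1, steps=5)
    print(predict(x, w, b), predict(x_adv, w, b))
```

The perturbation is bounded by `epsilon` in every coordinate, yet the model's confidence in the true label drops, which is the effect an attack detector has to pick up.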

Policy iteration · Agent · Controller · Learning · Value iteration

March 17, 2023

A Policy Iteration Approach for Flock Motion Control

Shuzheng Qu,Mohammed Abouheaf,Wail Gueaieb,Davide Spinello

from arxiv, 7 pages, 3 figures

Flocking motion control is concerned with managing the possible conflicts between local and team objectives of multi-agent systems. The overall control process guides the agents while monitoring the flock cohesiveness and localization. The underlying mechanisms may degrade when the unmodeled uncertainties associated with the flock dynamics and formation are overlooked. On the other hand, the efficiency of the various control designs relies on how quickly they can adapt to different dynamic situations in real time. An online model-free policy iteration mechanism is developed here to guide a flock of agents to follow an independent command generator over a time-varying graph topology. The strength of connectivity between any two agents, or the graph edge weight, is decided using a position-adjacency-dependent function. An online recursive least squares approach is adopted to tune the guidance strategies without knowing the dynamics of the agents or those of the command generator. It is compared with another reinforcement learning approach from the literature which is based on a value iteration technique. The simulation results of the policy iteration mechanism revealed fast learning and convergence behaviors with less computational effort.
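
The online recursive least squares step mentioned in this abstract can be sketched in its generic textbook form: parameters of a linear model are refined one sample at a time, without storing past data. The abstract does not say how the estimate feeds into the policy iteration, so the class below only shows the estimator itself; the class name, the forgetting factor, and the initial covariance scale are assumptions.

```python
from typing import List, Sequence


class RecursiveLeastSquares:
    """Textbook RLS estimator for y = theta . phi, updated sample by sample."""

    def __init__(self, dim: int, forgetting: float = 1.0, init_cov: float = 1e3):
        self.theta = [0.0] * dim
        # A large initial covariance expresses low confidence in theta = 0.
        self.cov = [[init_cov if i == j else 0.0 for j in range(dim)]
                    for i in range(dim)]
        self.forgetting = forgetting

    def update(self, phi: Sequence[float], y: float) -> List[float]:
        n = len(phi)
        cov_phi = [sum(self.cov[i][j] * phi[j] for j in range(n)) for i in range(n)]
        denom = self.forgetting + sum(phi[i] * cov_phi[i] for i in range(n))
        gain = [value / denom for value in cov_phi]

        # Correct the estimate in proportion to the prediction error.
        error = y - sum(t * p for t, p in zip(self.theta, phi))
        self.theta = [t + k * error for t, k in zip(self.theta, gain)]

        # The covariance stays symmetric, so phi^T P equals (P phi)^T.
        self.cov = [[(self.cov[i][j] - gain[i] * cov_phi[j]) / self.forgetting
                     for j in range(n)] for i in range(n)]
        return self.theta


if __name__ == "__main__":
    # Hypothetical noise-free data generated by y = 2*x1 - 3*x2.
    rls = RecursiveLeastSquares(dim=2)
    for phi, y in [([1.0, 0.0], 2.0), ([0.0, 1.0], -3.0),
                   ([1.0, 1.0], -1.0), ([2.0, 1.0], 1.0)]:
        rls.update(phi, y)
    print(rls.theta)
```

Each update costs a fixed amount of work per sample, which is what makes this kind of estimator attractive for online tuning.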

End-to-end · Path · state-of-the-art · Precision/accuracy · Generalization theory

March 17, 2023

Path Planning for Autonomous Driving: The State of the Art and Perspectives

Siyu Teng,Peng Deng,Yuchen Li,Bai Li,Xuemin Hu,Zhe Xuanyuan,Long Chen,Yunfeng Ai,Lingxi Li,Fei-Yue Wang

from arxiv, 20 pages, 14 figures and 5 tables

Intelligent vehicles (IVs) have attracted wide attention thanks to their augmented convenience, safety advantages, and potential commercial value. Although a few autonomous driving unicorns assert that IVs will be commercially deployable by 2025, their deployment is still restricted to small-scale validation due to various issues, among which safety, reliability, and generalization of planning methods are prominent concerns. Precise computation of control commands or trajectories by planning methods remains a prerequisite for IVs, owing to perceptual imperfections under complex environments, which pose an obstacle to the successful commercialization of IVs. This paper aims to review state-of-the-art planning methods, including pipeline planning and end-to-end planning methods. In terms of pipeline methods, a survey of selecting algorithms is provided along with a discussion of the expansion and optimization mechanisms, whereas in end-to-end methods, the training approaches and verification scenarios of driving tasks are points of concern. Experimental platforms are reviewed to help readers select suitable training and validation methods. Finally, the current challenges and future directions are discussed. The side-by-side comparison presented in this survey helps to gain insights into the strengths and limitations of the reviewed methods and also assists with system-level design choices.

Learning · MIMO · Performance · Controller · Agent

March 16, 2023

Energy Management of Multi-mode Plug-in Hybrid Electric Vehicle using Multi-agent Deep Reinforcement Learning

Min Hua,Cetengfei Zhang,Fanggang Zhang,Zhi Li,Xiaoli Yu,Hongming Xu,Quan Zhou

The recently emerging multi-mode plug-in hybrid electric vehicle (PHEV) technology is one of the pathways contributing to decarbonization, and its energy management requires multiple-input and multiple-output (MIMO) control. At present, the existing methods usually decouple the MIMO control into multiple-input and single-output (MISO) control and can only achieve locally optimal performance. To optimize the multi-mode vehicle globally, this paper studies a MIMO control method for energy management of the multi-mode PHEV based on multi-agent deep reinforcement learning (MADRL). By introducing a relevance ratio, a hand-shaking strategy is proposed to enable two learning agents to work collaboratively under the MADRL framework using the deep deterministic policy gradient (DDPG) algorithm. Unified settings for the DDPG agents are obtained through a sensitivity analysis of the factors influencing the learning performance. The optimal working mode for the hand-shaking strategy is attained through a parametric study on the relevance ratio. The advantage of the proposed energy management method is demonstrated on a software-in-the-loop testing platform. The result of the study indicates that the learning rate of the DDPG agents is the greatest factor in learning performance. Using the unified DDPG settings and a relevance ratio of 0.2, the proposed MADRL method can save up to 4% energy compared to the single-agent method.

Optimizer · Performer · Better · MoDELS · Optimization

June 8, 2021

Multi-Agent Cooperative Bidding Games for Multi-Objective Optimization in e-Commercial Sponsored Search

Ziyu Guan,Hongchang Wu,Qingyu Cao,Hao Liu,Wei Zhao,Sheng Li,Cai Xu,Guang Qiu,Jian Xu,Bo Zheng

Bid optimization for online advertising from a single advertiser's perspective has been thoroughly investigated in both academic research and industrial practice. However, existing work typically assumes competitors do not change their bids, i.e., the winning price is fixed, leading to poor performance of the derived solution. Although a few studies use multi-agent reinforcement learning to set up a cooperative game, they still suffer from the following drawbacks: (1) They fail to avoid collusion solutions where all the advertisers involved in an auction collude to bid an extremely low price on purpose. (2) Previous works cannot handle the underlying complex bidding environment well, leading to poor model convergence. This problem could be amplified when handling multiple objectives of advertisers, which are practical demands but are not considered by previous work. In this paper, we propose a novel multi-objective cooperative bid optimization formulation called Multi-Agent Cooperative bidding Games (MACG). MACG sets up a carefully designed multi-objective optimization framework where different objectives of advertisers are incorporated. A global objective to maximize the overall profit of all advertisements is added in order to encourage better cooperation and also to protect self-bidding advertisers. To avoid collusion, we also introduce an extra platform revenue constraint. We analyze the optimal functional form of the bidding formula theoretically and design a policy network accordingly to generate auction-level bids. Then we design an efficient multi-agent evolutionary strategy for model optimization. Offline experiments and online A/B tests conducted on the Taobao platform indicate that both the single advertiser's objective and the global profit have been significantly improved compared to state-of-the-art methods.