Drift control is critical to the safety of autonomous vehicles when there is a sudden loss of traction due to external conditions such as rain or snow. It is a challenging control problem because of the significant sideslip and nearly full saturation of the tires. In this paper, we focus on the control of drift maneuvers following circular paths with either fixed or moving centers, subject to changes in the tire-ground interaction; these are common training tasks for drift enthusiasts and can therefore serve as benchmarks for drift control performance. To achieve these tasks, we propose a novel hierarchical control architecture that decouples the curvature and center control of the trajectory. In particular, an outer loop stabilizes the center by tuning the target curvature, and an inner loop tracks the curvature using a feedforward/feedback controller enhanced by an $\mathcal{L}_1$ adaptive component. The hierarchical architecture is flexible because the inner loop is task-agnostic and adaptive to changes in the tire-road interaction, which allows the outer loop to be designed independently of the low-level dynamics and opens up the possibility of incorporating sophisticated planning algorithms. We implement our control strategy on a simulation platform as well as on a 1/10-scale radio-controlled (RC) car, and both the simulation and experimental results illustrate the effectiveness of our strategy in achieving the drift maneuvering tasks described above.
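To make the two-loop structure concrete, below is a minimal Python sketch of the decoupled architecture. The proportional outer-loop law, the gains, and the first-order adaptation are illustrative placeholders, not the controller derived in the paper.

```python
def outer_loop(kappa_ref, center_err, kp=0.5):
    """Outer loop: perturb the target curvature to pull the circle's
    center back toward its reference (hypothetical proportional law)."""
    return kappa_ref + kp * center_err

def inner_loop(kappa_target, kappa_meas, sigma_hat, kff=1.0, kfb=2.0):
    """Inner loop: feedforward on the target curvature plus feedback on
    the curvature error, minus the adaptive estimate of the uncertainty."""
    return kff * kappa_target + kfb * (kappa_target - kappa_meas) - sigma_hat

def l1_adaptation(sigma_hat, kappa_err, dt, gamma=10.0, tau=0.1):
    """L1-style component: fast estimation of the lumped uncertainty
    (e.g., a changed tire-ground interaction), low-pass filtered before
    it enters the control signal."""
    sigma_raw = sigma_hat + gamma * kappa_err * dt   # fast adaptation step
    alpha = dt / (tau + dt)                          # discrete low-pass filter
    return sigma_hat + alpha * (sigma_raw - sigma_hat)
```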
This paper addresses the problem of learning abstractions that boost robot planning performance while providing strong guarantees of reliability. Although state-of-the-art hierarchical robot planning algorithms allow robots to efficiently compute long-horizon motion plans for achieving user-desired tasks, these methods typically rely upon environment-dependent state and action abstractions that need to be hand-designed by experts. We present a new approach that bootstraps the entire hierarchical planning process: abstract states and actions for new environments are computed automatically from the critical regions predicted by a deep neural network with an automatically generated, robot-specific architecture. The learned abstractions are then used in a novel multi-source bi-directional hierarchical robot planning algorithm that is sound and probabilistically complete. An extensive empirical evaluation on twenty different settings using holonomic and non-holonomic robots shows that (a) the learned abstractions provide the information necessary for efficient multi-source hierarchical planning, and (b) this approach of learning abstractions and planning outperforms state-of-the-art baselines by nearly a factor of ten in planning time on test environments not seen during training.
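As a loose illustration of the multi-source idea (not the authors' algorithm), the sketch below samples search-tree roots from a predicted critical-region mask and merges trees with union-find until the start and goal components meet. The names and data layout are assumptions.

```python
import numpy as np

def sample_sources(critical_mask, n_sources=5, rng=None):
    """Pick search-tree roots from grid cells that a learned model
    marked as critical (the mask would come from the trained network)."""
    rng = rng or np.random.default_rng(0)
    cells = np.argwhere(critical_mask)
    idx = rng.choice(len(cells), size=min(n_sources, len(cells)), replace=False)
    return [tuple(c) for c in cells[idx]]

def multi_source_plan(start, goal, sources, connections):
    """Grow trees from the start, the goal, and every critical-region
    source; planning succeeds once the start and goal trees merge.
    `connections` is an iterable of root pairs that low-level motion
    planning managed to connect, processed with union-find."""
    parent = {r: r for r in [start, goal, *sources]}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in connections:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
        if find(start) == find(goal):
            return True
    return False
```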
Building autonomous vehicles (AVs) is a complex problem, but enabling them to operate in the real world, where they will be surrounded by human-driven vehicles (HVs), is extremely challenging. Prior work has shown the possibility of creating inter-agent cooperation between a group of AVs that follow a social utility. Such altruistic AVs can form alliances and affect the behavior of HVs to achieve socially desirable outcomes. We identify two major challenges in the co-existence of AVs and HVs. First, the social preferences and individual traits of a given human driver, e.g., selflessness and aggressiveness, are unknown to an AV and are almost impossible to infer in real time during a short AV-HV interaction. Second, contrary to AVs, which are expected to follow a policy, HVs do not necessarily follow a stationary policy and are therefore extremely hard to predict. To alleviate these challenges, we formulate the mixed-autonomy problem as a multi-agent reinforcement learning (MARL) problem and propose a decentralized framework and reward function for training cooperative AVs. Our approach enables AVs to learn the decision-making of HVs implicitly from experience and optimizes for a social utility while prioritizing safety and allowing adaptability; this makes altruistic AVs robust to different human behaviors and constrains them to a safe action space. Finally, we investigate the robustness, safety, and sensitivity of AVs to various HV behavioral traits, and present the settings in which the AVs can learn cooperative policies that are adaptable to different situations.
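The decentralized reward described above can be sketched as a social-value mixture of egoistic and altruistic terms, with safety enforced by masking unsafe actions. The weights and the masking mechanism below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def social_reward(ego_reward, others_rewards, svo=0.5):
    """Mix egoistic and altruistic terms with a social-value weight
    (svo=0 is purely selfish, svo=1 purely altruistic)."""
    altruism = sum(others_rewards) / max(len(others_rewards), 1)
    return (1.0 - svo) * ego_reward + svo * altruism

def safe_action(q_values, safe_mask):
    """Constrain the learned policy to a safe action space by masking
    unsafe actions before selection (one common safety mechanism)."""
    q = np.where(safe_mask, q_values, -np.inf)
    return int(np.argmax(q))
```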
Hierarchical semantic structures naturally exist in an image dataset, in which several semantically relevant image clusters can be further integrated into a larger cluster with coarser-grained semantics. Capturing such structures in image representations can greatly benefit semantic understanding on various downstream tasks. Existing contrastive representation learning methods lack this important capability. In addition, the negative pairs used in these methods are not guaranteed to be semantically distinct, which can further harm the structural correctness of the learned image representations. To tackle these limitations, we propose a novel contrastive learning framework called Hierarchical Contrastive Selective Coding (HCSC). In this framework, a set of hierarchical prototypes is constructed, and dynamically updated, to represent the hierarchical semantic structures underlying the data in the latent space. To make image representations better fit such semantic structures, we employ and further improve conventional instance-wise and prototypical contrastive learning via an elaborate pair selection scheme. This scheme selects more diverse positive pairs with similar semantics and more precise negative pairs with truly distinct semantics. On extensive downstream tasks, we verify the superior performance of HCSC over state-of-the-art contrastive methods, and the effectiveness of the major model components is demonstrated by extensive analytical studies. Our source code and model weights are available at https://github.com/gyfastas/HCSC
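A single-level sketch of the selective contrastive objective is shown below: candidate negatives that fall in the same prototype cluster as the query are treated as likely false negatives and masked out of the InfoNCE denominator. HCSC operates over multiple prototype levels with a probabilistic selection rule; this minimal PyTorch version only conveys the core idea.

```python
import torch
import torch.nn.functional as F

def selective_info_nce(query, positive, negatives, prototypes, temp=0.2):
    """Instance-wise InfoNCE with negative selection. query/positive:
    (d,) embeddings; negatives: (N, d); prototypes: (K, d) centers."""
    q = F.normalize(query, dim=-1)
    pos = F.normalize(positive, dim=-1)
    negs = F.normalize(negatives, dim=-1)
    protos = F.normalize(prototypes, dim=-1)

    q_assign = (q @ protos.T).argmax(-1)      # query's nearest prototype
    n_assign = (negs @ protos.T).argmax(-1)   # each negative's prototype
    keep = n_assign != q_assign               # drop same-cluster negatives

    l_pos = (q * pos).sum(-1, keepdim=True) / temp            # (1,)
    l_neg = (negs[keep] @ q) / temp                           # (M,)
    logits = torch.cat([l_pos, l_neg]).unsqueeze(0)           # (1, 1+M)
    target = torch.zeros(1, dtype=torch.long)                 # positive at index 0
    return F.cross_entropy(logits, target)
```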
Navigating a large-scale robot in unknown and cluttered height-constrained environments is challenging. Not only is a fast and reliable planning algorithm required to steer around obstacles, but the robot should also be able to change its intrinsic dimension by crouching in order to travel underneath height-constrained regions. Few mobile robots are capable of handling such a challenge; bipedal robots provide a solution. However, because bipedal robots have nonlinear and hybrid dynamics, trajectory planning that ensures dynamic feasibility and safety on these robots is difficult. This paper presents an end-to-end, vision-aided autonomous navigation framework that leverages three layers of planners and a variable walking height controller to enable bipedal robots to safely explore height-constrained environments. A vertically actuated Spring-Loaded Inverted Pendulum (vSLIP) model is introduced to capture the robot's coupled dynamics of planar walking and vertical walking height. This reduced-order model is used to optimize long-term and short-term safe trajectory plans. A variable walking height controller enables the bipedal robot to maintain stable periodic walking gaits while following the planned trajectory. The entire framework is tested and experimentally validated on the bipedal robot Cassie, demonstrating reliable autonomy: the robot safely avoids obstacles while walking to the goal location in various kinds of height-constrained cluttered environments.
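To convey what the reduced-order model captures, here is a toy vSLIP integration step: a planar point mass on a massless spring leg whose rest length is actuated so the model can crouch. All parameters are placeholders rather than Cassie's identified values.

```python
import numpy as np

G = 9.81  # gravity [m/s^2]

def vslip_step(state, foot, u, dt, k=2000.0, l0=1.0, m=32.0):
    """One explicit-Euler step of a vertically actuated SLIP.
    state = [x, z, xdot, zdot]; foot = stance foot position;
    u = leg-length actuation force (lets the model 'crouch')."""
    x, z, xd, zd = state
    leg = np.array([x - foot[0], z - foot[1]])
    l = np.linalg.norm(leg)
    f = (k * (l0 - l) + u) * leg / l    # spring + actuator force along the leg
    xdd = f[0] / m
    zdd = f[1] / m - G
    return np.array([x + xd * dt, z + zd * dt, xd + xdd * dt, zd + zdd * dt])
```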
Reducing travel time alone is insufficient to support the development of future smart transportation systems. To align with the United Nations Sustainable Development Goals (UN-SDG), further reductions in fuel consumption and emissions, improvements in traffic safety, and ease of infrastructure deployment and maintenance should also be considered. Unlike existing work that optimizes the control of either traffic light signals (to improve intersection throughput) or vehicle speed (to stabilize traffic), this paper presents a multi-agent deep reinforcement learning (DRL) system called CoTV, which Cooperatively controls both Traffic light signals and connected autonomous Vehicles (CAV). CoTV can therefore balance reductions in travel time, fuel consumption, and emissions. CoTV is also easy to deploy, as each traffic light controller cooperates with only the single CAV nearest to it on each incoming road. This enables more efficient coordination between traffic light controllers and CAVs, leading to convergent training of CoTV in large-scale multi-agent scenarios where convergence is traditionally difficult. We give the detailed system design of CoTV and demonstrate its effectiveness in a simulation study using SUMO on various grid maps and realistic urban scenarios with mixed-autonomy traffic.
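The cooperative design can be sketched with per-agent-type rewards and the nearest-CAV selection rule; the terms and weights below are illustrative stand-ins for CoTV's actual reward definitions.

```python
def tl_reward(waiting_times):
    """Traffic-light agent: reward shorter queues on incoming roads
    (an illustrative stand-in for CoTV's objective terms)."""
    return -sum(waiting_times)

def cav_reward(speed, target_speed, accel, w_speed=1.0, w_accel=0.5):
    """CAV agent: track a target speed while penalizing harsh
    acceleration as a proxy for fuel consumption and emissions."""
    return -w_speed * abs(speed - target_speed) - w_accel * abs(accel)

def nearest_cav(cav_positions, tl_position):
    """CoTV pairs each traffic light with only the nearest CAV on each
    incoming road, which keeps the multi-agent problem tractable."""
    return min(cav_positions, key=lambda p: abs(p - tl_position))
```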
In this paper, the problem of coordinated transportation of a heavy payload by a team of UAVs in a cluttered environment is addressed. The payload is modeled as a rigid body and is assumed to track a pre-computed global flight trajectory from a start point to a goal point. Due to the presence of local dynamic obstacles in the environment, the UAVs must ensure that there is no collision between the payload and these obstacles while keeping payload oscillations to a minimum. An Integrated Decision Controller (IDC) is proposed that integrates the optimal tracking control law given by a centralized Model Predictive Controller with safety-critical constraints provided by Exponential Control Barrier Functions. The entire payload-UAV system is enclosed in a safe convex-hull boundary, and the IDC ensures that no obstacle enters this boundary. To evaluate the performance of the IDC, results from a numerical simulation as well as a high-fidelity Gazebo simulation are presented. An ablation study analyzes the robustness of the proposed IDC against practical uncertainties such as noisy state values, the relative obstacle safety margin, and payload mass uncertainty. The results clearly show that the IDC achieves both trajectory tracking and obstacle avoidance while restricting payload oscillations within a safe limit.
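For a double-integrator abstraction of the payload, the exponential CBF safety condition that an IDC-style controller would enforce alongside the MPC tracking law can be sketched as follows; the barrier choice and gains are illustrative assumptions.

```python
import numpy as np

def ecbf_margin(p, v, u, p_obs, r_safe, k0=4.0, k1=4.0):
    """Exponential CBF condition for a double integrator with barrier
    h(x) = ||p - p_obs||^2 - r_safe^2 (relative degree 2). Safety
    requires  h_ddot + k1*h_dot + k0*h >= 0; the controller keeps only
    commanded accelerations u for which this margin is non-negative."""
    d = p - p_obs
    h = d @ d - r_safe ** 2
    h_dot = 2.0 * d @ v
    h_ddot = 2.0 * (v @ v + d @ u)
    return h_ddot + k1 * h_dot + k0 * h
```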
Modularity is a central principle throughout the design process for cyber-physical systems. Modularity reduces complexity and increases reuse of behavior. In this paper we pose and answer the following question: how can we identify independent `modules' within the structure of reactive control architectures? To this end, we propose a graph-structured control architecture we call a decision structure, and show how it generalises several reactive control architectures which are popular in Artificial Intelligence (AI) and robotics, specifically Teleo-Reactive programs (TRs), Decision Trees (DTs), Behavior Trees (BTs) and Generalised Behavior Trees ($k$-BTs). Inspired by the definition of a module in graph theory, we define modules in decision structures and show how each decision structure possesses a canonical decomposition into its modules. We can naturally characterise each of the BTs, $k$-BTs, DTs and TRs by properties of their module decomposition. This allows us to recognise which decision structures are equivalent to each of these architectures in quadratic time. Our proposed concept of modules extends to formal verification, under any verification scheme capable of verifying a decision structure. Namely, we prove that a modification to a module within a decision structure has no greater flow-on effects than a modification to an individual action within that structure. This enables verification of modules to be done locally and hierarchically, where structures can be verified and then repeatedly locally modified, with modules replaced by other modules while preserving correctness. To illustrate the findings, we present an example of a solar-powered drone controlled by a decision structure. We use a Linear Temporal Logic-based verification scheme to verify the correctness of this structure, and then show how one can modify modules while preserving its correctness.
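The graph-theoretic notion of a module that inspires the paper, a vertex set that every outside vertex relates to uniformly, can be checked directly; the sketch below ignores the labels and priorities that decision structures additionally carry.

```python
def is_module(adj, M):
    """Check whether vertex set M is a module of a directed graph given
    as successor sets adj[u]: every vertex outside M must relate to all
    of M identically, both by its edges into M and by M's edges to it."""
    M = set(M)
    for u in set(adj) - M:
        into_M = len(adj[u] & M)
        if 0 < into_M < len(M):       # u sees only part of M
            return False
        from_M = sum(1 for m in M if u in adj[m])
        if 0 < from_M < len(M):       # only part of M sees u
            return False
    return True

# Example: {1, 2} is a module here because vertex 0 points to both.
adj = {0: {1, 2}, 1: {3}, 2: {3}, 3: set()}
assert is_module(adj, {1, 2})
```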
We introduce ApolloRL, an open platform for research in reinforcement learning for autonomous driving. The platform provides a complete closed-loop pipeline with training, simulation, and evaluation components. It comes with 300 hours of real-world data in driving scenarios and popular baselines such as Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) agents. We elaborate in this paper on the architecture and the environment defined in the platform. In addition, we discuss the performance of the baseline agents in the ApolloRL environment.
Although deep reinforcement learning (deep RL) methods have many strengths that make them attractive for autonomous driving, real deep RL applications in autonomous driving have been slowed by the modeling gap between the source (training) domain and the target (deployment) domain. Unlike current policy transfer approaches, which are generally limited to transferring uninterpretable neural network representations as features, we propose to transfer concrete kinematic quantities in autonomous driving. The proposed robust-control-based (RC) generic transfer architecture, which we call RL-RC, incorporates a transferable hierarchical RL trajectory planner and a robust tracking controller based on a disturbance observer (DOB). The deep RL policies trained with a known nominal dynamics model are transferred directly to the target domain, and DOB-based robust tracking control is applied to tackle the modeling gap, including vehicle dynamics errors and external disturbances such as side forces. We provide simulations validating the capability of the proposed method to achieve zero-shot transfer across multiple driving scenarios such as lane keeping, lane changing, and obstacle avoidance.
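A minimal sketch of the DOB idea, assuming acceleration-level signals: estimate the lumped disturbance as the low-pass-filtered mismatch between the measured response and the nominal model's prediction, then cancel it in the tracking command. The filter constant and interfaces are placeholders, not the paper's design.

```python
class DisturbanceObserver:
    """First-order DOB sketch with a discrete low-pass (Q-filter) step."""

    def __init__(self, tau=0.05):
        self.tau = tau      # Q-filter time constant
        self.d_hat = 0.0    # disturbance estimate

    def update(self, a_measured, a_nominal, dt):
        d_raw = a_measured - a_nominal      # raw disturbance estimate
        alpha = dt / (self.tau + dt)        # discrete low-pass filter
        self.d_hat += alpha * (d_raw - self.d_hat)
        return self.d_hat

def robust_command(u_nominal, d_hat):
    """Tracking command: nominal input minus the estimated disturbance."""
    return u_nominal - d_hat
```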
Matter evolved under the influence of gravity from minuscule density fluctuations. Non-perturbative structure formed hierarchically over all scales and developed non-Gaussian features in the Universe, known as the Cosmic Web. Fully understanding the structure formation of the Universe is one of the holy grails of modern astrophysics. Astrophysicists survey large volumes of the Universe and employ a large ensemble of computer simulations to compare with the observed data in order to extract the full information of our own Universe. However, evolving trillions of galaxies over billions of years, even with the simplest physics, is a daunting task. We build a deep neural network, the Deep Density Displacement Model (hereafter D$^3$M), to predict the non-linear structure formation of the Universe from simple linear perturbation theory. Our extensive analysis demonstrates that D$^3$M outperforms second-order perturbation theory (hereafter 2LPT), the commonly used fast approximate simulation method, in point-wise comparison, the 2-point correlation, and the 3-point correlation. We also show that D$^3$M is able to accurately extrapolate far beyond its training data and predict structure formation for significantly different cosmological parameters. Our study demonstrates, for the first time, that deep learning is a practical and accurate alternative to approximate simulations of the gravitational structure formation of the Universe.
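As a toy illustration of the learning task (not the D$^3$M architecture), the PyTorch sketch below maps a linear-theory displacement field on a voxel grid to a predicted nonlinear displacement field, learning a residual correction to linear perturbation theory.

```python
import torch
import torch.nn as nn

class TinyDisplacementNet(nn.Module):
    """Toy stand-in for D^3M: map the 3-component linear-theory
    (Zel'dovich) displacement field on an N^3 voxel grid to the full
    nonlinear displacement field. The real model is a much deeper
    U-Net-style 3D CNN; the depth and width here are illustrative."""

    def __init__(self, ch=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv3d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv3d(ch, 3, 3, padding=1),
        )

    def forward(self, za_displacement):          # (B, 3, N, N, N)
        # predict a residual correction to linear perturbation theory
        return za_displacement + self.net(za_displacement)
```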