This paper describes the design and control of a support and recovery system for use with planar legged robots. The system operates in three modes. First, it can be operated in a fully transparent mode in which no forces are applied to the robot; in this mode, the system follows the robot closely so that it can quickly catch the robot if needed. Second, it can provide a vertical supportive force to assist a robot during operation. Third, it can catch the robot and pull it away from the ground after a failure to avoid falls and the associated damage. In this mode, the system automatically resets the robot after a trial, allowing multiple consecutive trials to be run without manual intervention. The supportive forces are applied to the robot through an actuated cable and pulley system that uses series elastic actuation with a unidirectional spring to enable truly transparent operation. The nonlinear nature of this system necessitates careful controller design to ensure predictable, safe behavior. In this paper we introduce the mechatronic design of the recovery system, develop suitable controllers, and evaluate the system's performance on the bipedal robot RAMone.
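To make the unidirectional-spring idea concrete, here is a minimal sketch (not the authors' implementation; the stiffness value and the way deflection is measured are assumptions): the cable can only pull, so the series elastic force is clamped to zero whenever the spring would have to push.

```python
def cable_tension(spool_position, robot_position, stiffness=500.0):
    """Series elastic force through a unidirectional (tension-only) spring.

    spool_position, robot_position: cable lengths in metres on either side of
    the spring; values and stiffness are illustrative, not from the paper.
    """
    deflection = spool_position - robot_position  # positive when cable is taut
    return stiffness * max(0.0, deflection)       # a slack cable transmits no force
```

The clamp is what makes transparent operation possible: as long as the spool keeps the cable slightly slack, the robot feels no force at all.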
Navigating a large-scale robot in unknown, cluttered, height-constrained environments is challenging. Not only is a fast and reliable planning algorithm required to maneuver around obstacles, but the robot must also be able to change its intrinsic dimension by crouching in order to travel underneath height-constrained regions. Few mobile robots can handle such a challenge; bipedal robots provide a solution. However, because bipedal robots have nonlinear, hybrid dynamics, trajectory planning that ensures dynamic feasibility and safety on these robots is difficult. This paper presents an end-to-end, vision-aided autonomous navigation framework that leverages three layers of planners and a variable walking height controller to enable bipedal robots to safely explore height-constrained environments. A vertically actuated Spring-Loaded Inverted Pendulum (vSLIP) model is introduced to capture the robot's coupled dynamics of planar walking and vertical walking height. This reduced-order model is used to optimize long-term and short-term safe trajectory plans. A variable walking height controller enables the bipedal robot to maintain stable periodic walking gaits while following the planned trajectory. The entire framework is tested and experimentally validated on the bipedal robot Cassie, demonstrating reliable autonomy in safely avoiding obstacles while walking to the goal location in various height-constrained, cluttered environments.
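As a schematic illustration of what an actuated SLIP-type stance phase looks like (a generic sketch, not the paper's exact vSLIP formulation; the mass, stiffness, and state layout are assumptions), the controller varies the leg rest length to change walking height:

```python
import math

def slip_stance_step(x, z, vx, vz, foot_x, rest_length, dt,
                     m=31.0, k=2000.0, g=9.81):
    """One Euler step of stance dynamics for an actuated SLIP-type model.

    rest_length is the actuated leg rest length, which a higher-level
    controller can lower to crouch. All parameter values are illustrative.
    """
    dx, dz = x - foot_x, z
    r = math.hypot(dx, dz)            # current leg length
    f = k * (rest_length - r)         # spring force along the leg
    ax = f * dx / (r * m)
    az = f * dz / (r * m) - g
    return x + vx * dt, z + vz * dt, vx + ax * dt, vz + az * dt
```

Coupling the planner's height commands into `rest_length` is what lets the reduced-order model reason jointly about planar progress and crouching.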
This paper proposes a unified robust motion controller for the position and force control problems of compliant robot manipulators driven by Series Elastic Actuators (SEAs). It is shown that the dynamic model of the compliant robot includes not only matched but also mismatched disturbances that act on the system through a different channel from the control input. To tackle this complex robust control problem, the unified robust motion controller is synthesised by employing a second-order Disturbance Observer (DOb), which allows us to estimate not only disturbances but also their first- and second-order derivatives, and a novel controller design approach in state space. By using the Brunovsky canonical form transformation and the estimates of disturbances and their first- and second-order derivatives, the dynamic model of the robot is reconstructed so that a new system model that includes only matched disturbances is obtained for compliant robots driven by SEAs. The robust position and force controllers are then designed by eliminating the matched disturbances of the reconstructed system model via the conventional DOb-based robust control method. The stability and performance of the proposed robust motion controllers are verified by simulations.
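As a schematic illustration of the matched/mismatched distinction (a generic state-space form, not the paper's exact derivation):

```latex
\dot{x} = A x + B\,(u + d_m) + E\,d_u, \qquad E \neq B,
```

where the matched disturbance $d_m$ enters through the same channel $B$ as the control input $u$ and can be cancelled directly, while the mismatched disturbance $d_u$ enters through $E$. Transforming to the Brunovsky canonical form and differentiating the state equations folds $d_u$ and its first and second derivatives into a single lumped disturbance acting through the input channel, which is why a second-order DOb, estimating the disturbance and its first two derivatives, suffices to reconstruct a matched-only model.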
Reducing travel time alone is insufficient to support the development of future smart transportation systems. To align with the United Nations Sustainable Development Goals (UN-SDG), further reductions of fuel consumption and emissions, improvements in traffic safety, and ease of infrastructure deployment and maintenance should also be considered. Unlike existing work that optimizes control of either traffic light signals (to improve intersection throughput) or vehicle speeds (to stabilize traffic), this paper presents a multi-agent deep reinforcement learning (DRL) system called CoTV, which Cooperatively controls both Traffic light signals and connected autonomous Vehicles (CAV). CoTV can therefore balance reductions in travel time, fuel consumption, and emissions. Meanwhile, CoTV is easy to deploy because it cooperates with only one CAV, the one nearest to the traffic light controller, on each incoming road. This enables more efficient coordination between traffic light controllers and CAVs, allowing CoTV's training to converge in large-scale multi-agent scenarios where convergence is traditionally difficult. We give the detailed system design of CoTV and demonstrate its effectiveness in a simulation study using SUMO under various grid maps and realistic urban scenarios with mixed-autonomy traffic.
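A minimal sketch of the one-CAV-per-road selection step (the data layout is hypothetical; the paper's interface to SUMO will differ):

```python
def nearest_cav_per_road(cavs):
    """Pick, for each incoming road, the CAV closest to the traffic light.

    cavs: iterable of (road_id, cav_id, distance_to_stop_line) tuples;
    returns {road_id: cav_id}. The tuple layout is an assumption.
    """
    best = {}
    for road_id, cav_id, dist in cavs:
        if road_id not in best or dist < best[road_id][1]:
            best[road_id] = (cav_id, dist)
    return {road: cav for road, (cav, _) in best.items()}
```

Restricting cooperation to one vehicle per road keeps the number of learning agents, and hence the joint action space, small enough for training to converge.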
A profound challenge for A-Life is to construct agents whose behavior is 'life-like' in a deep way. We propose an architecture and approach for constructing the networks driving artificial agents, using processes analogous to those that construct and sculpt the brains of animals. Furthermore, the instantiation of action is dynamic: the whole network responds in real time to sensory inputs to activate effectors, rather than computing a representation of the optimal behavior and sending an encoded representation to effector controllers. The architecture has many parameters, and we use an evolutionary algorithm to select them in the context of a specific prey-capture task. We believe this architecture may be useful for controlling small autonomous robots or drones, because it allows a rapid response to changes in sensor inputs.
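A minimal sketch of the kind of evolutionary parameter search described above (the selection scheme, population size, and mutation scale are assumptions; the fitness function would score prey-capture performance):

```python
import random

def evolve(fitness, n_params, pop_size=50, generations=100, sigma=0.1):
    """Truncation-selection evolutionary search over network parameters.

    fitness: maps a parameter vector to a score (e.g., prey captured
    per episode). All hyperparameters here are illustrative.
    """
    pop = [[random.gauss(0, 1) for _ in range(n_params)]
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[:pop_size // 5]          # keep the top 20%
        pop = [[p + random.gauss(0, sigma) for p in random.choice(parents)]
               for _ in range(pop_size)]          # mutate to refill
    return max(pop, key=fitness)
```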
While artificial-intelligence-based methods suffer from a lack of transparency, rule-based methods dominate in safety-critical systems. Yet the latter cannot compete with the former in robustness to multiple requirements, for instance, simultaneously addressing safety, comfort, and efficiency. Hence, to benefit from both approaches, they must be joined in a single system. This paper proposes a decision-making and control framework that profits from the advantages of both rule-based and machine-learning-based techniques while compensating for their disadvantages. The proposed method comprises two controllers operating in parallel, called Safety and Learned. A rule-based switching logic selects one of the actions transmitted by the two controllers. The Safety controller takes priority whenever the Learned one does not meet the safety constraint, and it also directly participates in training the safe Learned controller. Decision making and control in autonomous driving is chosen as the case study, in which an autonomous vehicle learns a multi-task policy to safely cross an unprotected intersection. Multiple requirements (i.e., safety, efficiency, and comfort) are set for vehicle operation. A numerical simulation is performed to validate the proposed framework, successfully demonstrating its ability to satisfy the requirements and its robustness to a changing environment.
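The switching logic reduces to a simple rule; a minimal sketch (the constraint predicate is assumed here, e.g. a time-to-collision or distance check, and is not the paper's exact formulation):

```python
def select_action(learned_action, safety_action, satisfies_constraint):
    """Rule-based switch between the Learned and Safety controllers.

    satisfies_constraint: predicate that checks the safety constraint for
    a candidate action; its exact form is an assumption.
    """
    if satisfies_constraint(learned_action):
        return learned_action   # learned policy is safe: use it
    return safety_action        # otherwise fall back to the Safety controller
```

Because the Safety controller's overrides are observed during training, the Learned controller is shaped toward actions that pass the constraint on their own.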
This study presents the design of a six-wheeled outdoor autonomous mobile robot. The main design goal is to increase the robot's adaptability and flexibility when moving outdoors. The six-wheeled platform was equipped with sensors including a global positioning system (GPS), a high-definition (HD) webcam, light detection and ranging (LiDAR), and rotary encoders. A personal mobile computer and an 86Duino ONE microcontroller were used as the computing platform. For control, the robot's lateral offset and heading angle offset were calculated using differential GPS or a camera that detects structured and unstructured road boundaries. These offsets were fed to a fuzzy controller, and the control input, the differential speed between the left and right wheels, was obtained via Q-learning. This allowed the robot to track a reference route so that it could stay in its own lane. A 2D LiDAR was also used to measure the relative distance to obstacles ahead; the robot stopped immediately to avoid a collision whenever the distance between the robot and an obstacle fell below a specified safety distance. A custom-designed rocker arm gave the robot the ability to climb a low step, and body balance was maintained by controlling the angle of the rocker arm when the robot changed its pose. By integrating the above functionality, the autonomous mobile robot has been used for delivery service on our campus roads.
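A minimal sketch of the low-level command structure implied above, combining the differential-speed steering correction with the LiDAR safety stop (function names, units, and thresholds are assumptions, not the paper's values):

```python
def wheel_speeds(base_speed, correction, front_distance, safety_distance=0.8):
    """Differential-drive command with a LiDAR safety stop.

    correction: differential speed (from the fuzzy / Q-learned controller)
    steering the robot back toward the lane centre. Values are illustrative.
    """
    if front_distance < safety_distance:
        return 0.0, 0.0                             # obstacle too close: stop
    return base_speed + correction, base_speed - correction  # left, right
```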
Driven by the fast development of Internet of Things (IoT) applications, tremendous amounts of data need to be collected by sensors and passed to servers for further processing. As a promising solution, mobile crowd sensing (MCS) enables controllable sensing and transmission processes for multiple types of data on a single device. To achieve energy-efficient MCS, data sensing and transmission over a long time horizon should be designed to account for the differentiated requirements of IoT tasks, including data size and delay tolerance. This design is achieved by jointly optimizing the sensing and transmission rates, which leads to a complex optimization problem due to the coupling between the control variables and the existence of busy time intervals during which no data can be sensed. To deal with this problem, a key concept called the height is introduced, based on which classical string-pulling algorithms can be applied to obtain the corresponding optimal sensing and transmission rates. The original rate optimization problem can thus be converted into a search for the optimal height. Based on the properties of the objective function, upper and lower bounds on the region in which the optimal height lies are derived. The search region is further divided into a series of sub-areas because the form of the objective function changes with the height. Finally, the optimal height in each sub-area is obtained from the convexity of the objective function, and the globally optimal height is determined by comparing the local optima. This solution approach is further extended to the case of limited data buffer capacity at the server. Simulations are conducted to evaluate the performance of the proposed design.
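A minimal sketch of the per-sub-area convex search and the comparison of local optima (ternary search is a stand-in here; the paper's objective, bounds, and sub-area boundaries are assumed to be supplied):

```python
def argmin_convex(f, lo, hi, tol=1e-6):
    """Ternary search for the minimiser of a convex function on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

def optimal_height(f, boundaries):
    """boundaries: sorted heights splitting the search region into sub-areas
    on which f is convex; returns the globally optimal height."""
    candidates = [argmin_convex(f, a, b)
                  for a, b in zip(boundaries, boundaries[1:])]
    return min(candidates, key=f)   # compare the local optima
```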
Convolutions on monocular dash cam videos capture spatial invariances in the image plane but do not explicitly reason about distances and depth. We propose a simple transformation of observations into a bird's eye view, also known as plan view, for end-to-end control. We detect vehicles and pedestrians in the first-person view and project them into an overhead plan view. This representation provides an abstraction of the environment from which a deep network can easily deduce the positions and directions of entities. Additionally, the plan view enables us to leverage advances in 3D object detection in conjunction with deep policy learning. We evaluate our monocular plan view network on the photo-realistic Grand Theft Auto V simulator. A network using both a plan view and front view causes less than half as many collisions as previous detection-based methods and an order of magnitude fewer collisions than pure pixel-based policies.
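A minimal sketch of the projection step, rasterising detections into an ego-centred overhead grid (the detection layout, grid size, and resolution are assumptions, not the paper's configuration):

```python
def to_plan_view(detections, grid_size=64, meters_per_cell=0.5):
    """Rasterise 3D detections into an ego-centred overhead occupancy grid.

    detections: (x_forward, y_left, class_id) per detected vehicle or
    pedestrian, in metres relative to the ego camera. Layout is assumed.
    """
    grid = [[0] * grid_size for _ in range(grid_size)]
    for x, y, cls in detections:
        row = grid_size - 1 - int(x / meters_per_cell)   # forward maps to up
        col = grid_size // 2 - int(y / meters_per_cell)  # left maps to left
        if 0 <= row < grid_size and 0 <= col < grid_size:
            grid[row][col] = cls
    return grid
```

In this view, metric distances between entities become pixel distances, which is exactly the abstraction the policy network exploits.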
Autonomous urban driving navigation with complex multi-agent dynamics is under-explored due to the difficulty of learning an optimal driving policy. Traditional modular pipelines rely heavily on hand-designed rules and a pre-processing perception system, while supervised learning-based models are limited by the availability of extensive human experience. We present a general and principled Controllable Imitative Reinforcement Learning (CIRL) approach with which the driving agent achieves higher success rates based only on vision inputs in a high-fidelity car simulator. To alleviate the low exploration efficiency of large continuous action spaces, which often prohibits the use of classical RL on challenging real tasks, CIRL explores over a reasonably constrained action space guided by encoded experiences that imitate human demonstrations, building on Deep Deterministic Policy Gradient (DDPG). Moreover, we specialize adaptive policies and steering-angle reward designs for different control signals (i.e., follow, straight, turn right, turn left) based on shared representations, improving the model's ability to tackle diverse cases. Extensive experiments on the CARLA driving benchmark demonstrate that CIRL substantially outperforms all previous methods in the percentage of successfully completed episodes on a variety of goal-directed driving tasks. We also show its superior generalization capability in unseen environments. To our knowledge, this is the first case of a driving policy successfully learned through reinforcement learning in a high-fidelity simulator that performs better than supervised imitation learning.
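One simple way to picture constrained exploration around an imitation prior is box-clipping the actor's action (a schematic stand-in, not CIRL's exact guidance scheme; the radius and per-dimension clipping are assumptions):

```python
def constrained_action(actor_action, imitation_action, radius=0.2):
    """Clamp the RL actor's action to a box around the imitation prior.

    actor_action, imitation_action: per-dimension continuous controls
    (e.g., steering, throttle). The box radius is illustrative.
    """
    return [max(a_im - radius, min(a_im + radius, a))
            for a, a_im in zip(actor_action, imitation_action)]
```

Keeping exploration near demonstrated behavior avoids the wasted trials that make classical DDPG impractical in large continuous action spaces.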
This paper introduces a novel neural network-based reinforcement learning approach for robot gaze control. Our approach enables a robot to learn and adapt its gaze control strategy for human-robot interaction without the use of external sensors or human supervision. The robot learns to focus its attention on groups of people from its own audio-visual experiences, independently of the number of people, their positions, and their physical appearances. In particular, we use a recurrent neural network architecture in combination with Q-learning to find an optimal action-selection policy; we pre-train the network in a simulated environment that mimics realistic scenarios involving speaking and silent participants, thus avoiding the need for tedious sessions of a robot interacting with people. Our experimental evaluation suggests that the proposed method is robust with respect to parameter estimation, i.e., the parameter values yielded by the method do not have a decisive impact on performance. The best results are obtained when audio and visual information are used jointly. Experiments with the Nao robot indicate that our framework is a step toward the autonomous learning of socially acceptable gaze behavior.
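A minimal sketch of the Q-learning core (a tabular stand-in for the recurrent network the paper actually trains; state and action encodings, learning rate, and discount are assumptions):

```python
import random
from collections import defaultdict

Q = defaultdict(float)   # keyed by (state, action); stands in for the RNN

def q_update(s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One temporal-difference update; the paper's recurrent network is
    trained toward the same target rather than updating a table."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])

def select_gaze_action(s, actions, eps=0.1):
    """Epsilon-greedy selection over gaze actions (e.g., pan/tilt moves)."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(s, a)])
```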