This paper draws upon three themes in the bipedal control literature to achieve highly agile, terrain-aware locomotion. By terrain-aware, we mean that the robot can use information on terrain slope and friction cone supplied by state-of-the-art mapping and trajectory planning algorithms. The process starts by abstracting, from the full dynamics of the Cassie 3D bipedal robot, an exact low-dimensional representation of its centroidal dynamics, parameterized by angular momentum. Under a piecewise planar terrain assumption, and after eliminating the terms for the angular momentum about the robot's center of mass, the centroidal dynamics becomes linear and has dimension four. Four-step-horizon model predictive control (MPC) of the centroidal dynamics provides step-to-step foot placement commands. Importantly, we also include the intra-step dynamics at 10 ms intervals so that realistic terrain-aware constraints on the robot's evolution can be imposed in the MPC formulation. The output of the MPC is directly implemented on Cassie through the method of virtual constraints. In experiments, we validate the performance of our control strategy for the robot on inclined and stationary terrain, both indoors on a treadmill and outdoors on a hill.
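As a rough illustration of the step-to-step layer described above, the sketch below sets up a four-step-horizon quadratic program over a generic four-dimensional linear model with foot placements as decision variables, solved with cvxpy. The matrices, reference state, and reachability bound are placeholders rather than the paper's exact centroidal model, and the 10 ms intra-step constraints are omitted.

```python
# Hedged sketch: a 4-step-horizon QP over a generic 4D linear step-to-step model.
# The matrices A, B and the bounds below are illustrative placeholders, not the
# exact centroidal model or terrain constraints used in the paper.
import numpy as np
import cvxpy as cp

n, m, N = 4, 2, 4                     # state dim, foot-placement dim, steps in horizon
A = np.array([[1.0, 0.4, 0.0, 0.0],   # placeholder step-to-step state transition
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.4],
              [0.0, 0.0, 0.0, 1.0]])
B = np.array([[-1.0, 0.0],            # placeholder effect of foot placement
              [ 0.0, 0.0],
              [ 0.0, -1.0],
              [ 0.0, 0.0]])
x0 = np.array([0.05, 0.2, 0.0, 0.1])   # current reduced (centroidal) state
x_ref = np.array([0.0, 0.5, 0.0, 0.0]) # desired state, e.g. walking-speed target

X = cp.Variable((N + 1, n))
U = cp.Variable((N, m))               # foot placements relative to the stance foot
cost, constr = 0, [X[0] == x0]
for k in range(N):
    cost += cp.sum_squares(X[k + 1] - x_ref) + 0.1 * cp.sum_squares(U[k])
    constr += [X[k + 1] == A @ X[k] + B @ U[k],
               cp.abs(U[k]) <= 0.4]   # stand-in for kinematic/terrain reachability
cp.Problem(cp.Minimize(cost), constr).solve()
print("next foot placement:", U.value[0])
```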
This work addresses the coordinated motion control and obstacle-crossing problem for four wheel-leg, independently motor-driven robotic systems via a model predictive control (MPC) approach based on an event-triggering mechanism. A model of the wheel-leg robotic control system with a dynamic supporting polygon is developed. The system dynamic model has 3 degrees of freedom (DOF), ignoring the pitch, roll, and vertical motions. The single-wheel dynamics are analyzed, taking into account the characteristics of the motor drive and the Burckhardt nonlinear tire model. As a result, an over-actuated predictive model is proposed with the motor torques as inputs and the system states as outputs. Because the supporting polygon is only adjusted under certain conditions, an event-based triggering mechanism is designed to save hardware resources and energy. The MPC controller is evaluated on a virtual prototype as well as a physical prototype. The simulation results guide the parameter tuning for the controller implementation on the physical prototype. The experimental results on these two prototypes verify the efficiency of the proposed approach.
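A minimal sketch of the event-triggering idea follows, assuming a simple threshold rule on the deviation since the last trigger; the plant, the MPC solve, and the threshold value are placeholders rather than the paper's design.

```python
import numpy as np

def support_polygon_event(state_error, last_trigger_state, threshold=0.15):
    """Illustrative event trigger: fire only when the deviation since the last
    trigger exceeds a threshold, so the supporting polygon (and the MPC
    reconfiguration) is updated only when actually needed."""
    return np.linalg.norm(state_error - last_trigger_state) > threshold

# toy closed-loop skeleton (the plant and the MPC solve are placeholders)
last = np.zeros(3)
for t, err in enumerate(np.random.randn(50, 3) * 0.05):
    if support_polygon_event(err, last):
        last = err.copy()
        print(f"t={t}: event triggered -> adjust supporting polygon / re-solve MPC")
    # otherwise: reuse the previously computed control sequence
```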
We focus on the problem of planning the motion of a robot in a dynamic multiagent environment such as a pedestrian scene. Enabling the robot to navigate safely and in a socially compliant fashion in such scenes requires a representation that accounts for the unfolding multiagent dynamics. Existing approaches to this problem tend to employ microscopic models of motion prediction that reason about the individual behavior of other agents. While such models may achieve high tracking accuracy in trajectory prediction benchmarks, they often lack an understanding of the group structures unfolding in crowded scenes. Inspired by Gestalt theory from psychology, we build a Model Predictive Control framework (G-MPC) that leverages group-based prediction for robot motion planning. We conduct an extensive simulation study involving a series of challenging navigation tasks in scenes extracted from two real-world pedestrian datasets. We illustrate that G-MPC enables a robot to achieve statistically significantly higher safety and fewer group intrusions than a series of baselines featuring individual pedestrian motion prediction models. Finally, we show that G-MPC can handle noisy lidar-scan estimates without significant performance loss.
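The toy sampling-based planner below illustrates how a group-level prediction can enter an MPC-style cost; the group discs, weights, and constant-velocity rollouts are assumptions for illustration and are not the G-MPC formulation itself.

```python
import numpy as np

def group_intrusion_cost(traj, group_centers, group_radii, w=50.0):
    """Penalty for entering a disc around each predicted group (a simple
    stand-in for group-level prediction; radii and weights are assumptions)."""
    cost = 0.0
    for c, r in zip(group_centers, group_radii):
        d = np.linalg.norm(traj - c, axis=1)
        cost += w * np.sum(np.maximum(0.0, r - d))
    return cost

def sample_mpc(x0, goal, groups, radii, horizon=10, dt=0.2, n_samples=256):
    best_u, best_cost = None, np.inf
    for _ in range(n_samples):
        u = np.random.uniform([-1, -1], [1, 1])          # constant-velocity sample
        traj = x0 + dt * np.outer(np.arange(1, horizon + 1), u)
        c = np.linalg.norm(traj[-1] - goal) + group_intrusion_cost(traj, groups, radii)
        if c < best_cost:
            best_u, best_cost = u, c
    return best_u

u = sample_mpc(np.zeros(2), np.array([5.0, 0.0]),
               groups=[np.array([2.0, 0.2])], radii=[1.0])
print("chosen velocity command:", u)
```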
Prediction+optimization is a common real-world paradigm in which problem parameters must be predicted before the optimization problem is solved. However, the criteria by which the prediction model is trained are often inconsistent with the goal of the downstream optimization problem. Recently, decision-focused prediction approaches, such as SPO+ and direct optimization, have been proposed to fill this gap. However, they cannot directly handle the soft constraints with the $max$ operator required in many real-world objectives. This paper proposes a novel analytically differentiable surrogate objective framework for real-world linear and semi-definite negative quadratic programming problems with soft linear and non-negative hard constraints. The framework gives theoretical bounds on the constraints' multipliers and derives closed-form solutions with respect to the predictive parameters, and thus gradients for any variable in the problem. We evaluate our method in three applications extended with soft constraints: synthetic linear programming, portfolio optimization, and resource provisioning, and demonstrate that it outperforms traditional two-stage methods and other decision-focused approaches.
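The toy example below illustrates the kind of soft-constrained objective with the $max$ operator that such problems involve, and why decision quality must be judged under the true parameters rather than the prediction loss. The candidate decisions and penalty weight are illustrative assumptions, and none of this is the paper's closed-form surrogate.

```python
import numpy as np

def soft_constrained_objective(x, c, A, b, penalty=10.0):
    """Toy objective of the targeted form: a linear term plus soft linear
    constraints enforced through the max operator (violations are penalized,
    not forbidden). The penalty weight is an illustrative assumption."""
    violation = np.maximum(0.0, A @ x - b)
    return c @ x + penalty * np.sum(violation)

# Two-stage and decision-focused training differ in how c_hat is learned;
# here we only show that decision quality is measured under the TRUE c.
c_true = np.array([1.0, 2.0])
c_hat  = np.array([2.0, 1.0])          # a (poor) prediction of the cost vector
A, b = np.array([[1.0, 1.0]]), np.array([1.5])
candidates = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.75, 0.75])]
x_star = min(candidates, key=lambda x: soft_constrained_objective(x, c_hat, A, b))
print("decision chosen from the prediction:", x_star)
print("realized objective under true parameters:",
      soft_constrained_objective(x_star, c_true, A, b))
```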
We focus on the problem of developing energy-efficient controllers for quadrupedal robots. Animals can actively switch gaits at different speeds to lower their energy consumption. In this paper, we devise a hierarchical learning framework, in which distinctive locomotion gaits and natural gait transitions emerge automatically with a simple reward of energy minimization. We use evolutionary strategies (ES) to train a high-level gait policy that specifies the gait pattern of each foot, while the low-level convex MPC controller optimizes the motor commands so that the robot can walk at a desired velocity using that gait pattern. We test our learning framework on a quadruped robot and demonstrate automatic gait transitions, from walking to trotting and to fly-trotting, as the robot increases its speed. We show that the learned hierarchical controller consumes much less energy across a wide range of locomotion speeds than baseline controllers.
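A minimal sketch of the high-level loop is given below, assuming an OpenAI-style evolution strategies update over a small vector of gait parameters; the energy-cost function is a synthetic placeholder standing in for rolling out the low-level convex MPC controller in simulation.

```python
import numpy as np

def energy_cost(gait_params, speed):
    """Placeholder for rolling out the low-level convex MPC controller and
    measuring cost of transport; here a synthetic bowl whose minimum shifts
    with speed, just to exercise the ES loop."""
    target = np.array([0.5, 0.5, 0.0, 0.5]) * speed
    return np.sum((gait_params - target) ** 2)

def es_step(theta, speed, sigma=0.1, lr=0.05, pop=32):
    """One evolution strategies update on the gait parameters."""
    eps = np.random.randn(pop, theta.size)
    rewards = np.array([-energy_cost(theta + sigma * e, speed) for e in eps])
    rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    return theta + lr / (pop * sigma) * eps.T @ rewards

theta = np.zeros(4)                     # e.g. per-foot phase offsets / duty cycle
for it in range(200):
    theta = es_step(theta, speed=1.0)
print("learned gait parameters at 1.0 m/s:", theta)
```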
A new approach is presented to the problem of compensating the beam squint effect arising in wideband terahertz (THz) hybrid massive multiple-input multiple-output (MIMO) systems, based on the joint optimization of the phase shifter (PS) and true-time delay (TTD) values under per-TTD-device time delay constraints. Unlike prior approaches, the new approach does not require the unbounded time delay assumption; the range of time delay values that a TTD device can produce is strictly limited in our approach. Instead of focusing on the design of the TTD values alone, we jointly optimize both the TTD and PS values to effectively cope with the practical time delay constraint. Simulation results illustrating the performance benefits of the new method for beam squint compensation are presented. Through simulations and analysis, we show that our approach is a generalization of prior TTD-based precoding approaches.
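The sketch below illustrates the underlying effect for a uniform linear array: true-time delays are clipped to a per-device limit and phase shifters absorb the remainder at the carrier frequency, so the beamforming gain degrades away from the carrier. The geometry, carrier, bandwidth, and the simple clipping rule are assumptions for illustration, not the proposed joint optimization.

```python
import numpy as np

# Hedged sketch: per-subcarrier array gain of a ULA steered with per-antenna
# phase shifters plus bounded true-time delays. All numbers are illustrative.
N_ant, fc, bw, K = 64, 300e9, 30e9, 129          # antennas, carrier, bandwidth, subcarriers
c = 3e8
d = c / fc / 2                                   # half-wavelength spacing
theta = np.deg2rad(30.0)                         # target direction
t_max = 20e-12                                   # per-TTD-device delay limit
freqs = fc + np.linspace(-bw / 2, bw / 2, K)

n = np.arange(N_ant)
tau_ideal = n * d * np.sin(theta) / c            # delay that tracks frequency perfectly
tau = np.clip(tau_ideal, 0.0, t_max)             # bounded TTD values
psi = 2 * np.pi * fc * (tau_ideal - tau)         # PS picks up the remainder at fc only

gain = []
for f in freqs:
    a = np.exp(-1j * 2 * np.pi * f * n * d * np.sin(theta) / c)   # channel steering vector
    w = np.exp(1j * (2 * np.pi * f * tau + psi)) / np.sqrt(N_ant) # TTD + PS precoder
    gain.append(np.abs(a @ w) ** 2 / N_ant)
print("min / max normalized gain over the band:", min(gain), max(gain))
```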
Online Controlled Experiments (OCE) are the gold standard for measuring impact and guiding decisions for digital products and services. Despite many methodological advances in this area, the scarcity of public datasets and the lack of a systematic review and categorization hinder its development. We present the first survey and taxonomy for OCE datasets, which highlights the lack of a public dataset to support the design and running of experiments with adaptive stopping, an increasingly popular approach to enable quickly deploying improvements or rolling back degrading changes. We release the first such dataset, containing daily checkpoints of decision metrics from multiple real experiments run on a global e-commerce platform. The dataset design is guided by a broader discussion of data requirements for common statistical tests used in digital experimentation. We demonstrate how to use the dataset in the adaptive stopping scenario using sequential and Bayesian hypothesis tests and how to learn the relevant parameters for each approach.
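As a concrete, if simplified, example of adaptive stopping on daily checkpoints, the sketch below runs Wald's sequential probability ratio test on synthetic treatment-control differences; the Gaussian model and the parameters are assumptions, and this is not the specific battery of tests demonstrated with the dataset.

```python
import numpy as np

def sprt_adaptive_stopping(daily_diffs, sigma, delta1, alpha=0.05, beta=0.2):
    """Wald's SPRT on daily treatment-control differences assumed ~ N(delta, sigma^2):
    H0: delta = 0 vs H1: delta = delta1. A generic illustration of adaptive
    stopping on checkpointed metrics."""
    upper, lower = np.log((1 - beta) / alpha), np.log(beta / (1 - alpha))
    llr = 0.0
    for day, x in enumerate(daily_diffs, start=1):
        llr += (delta1 * x - delta1 ** 2 / 2) / sigma ** 2
        if llr >= upper:
            return day, "reject H0 (ship the change)"
        if llr <= lower:
            return day, "accept H0 (roll back / keep control)"
    return len(daily_diffs), "no decision yet"

rng = np.random.default_rng(0)
diffs = rng.normal(0.3, 1.0, size=30)      # synthetic daily checkpoints
print(sprt_adaptive_stopping(diffs, sigma=1.0, delta1=0.3))
```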
Mechanizing the manual harvesting of fresh market fruits constitutes one of the biggest challenges to the sustainability of the fruit industry. During manual harvesting of some fresh-market crops like strawberries and table grapes, pickers spend significant amounts of time walking to carry full trays to a collection station at the edge of the field. A step toward increasing harvest automation for such crops is to deploy harvest-aid collaborative robots (co-robots) that transport the empty and full trays, thus increasing harvest efficiency by reducing pickers' non-productive walking times. This work presents the development of a co-robotic harvest-aid system and its evaluation during commercial strawberry harvesting. At the heart of the system lies a predictive stochastic scheduling algorithm that minimizes the expected non-picking time, thus maximizing the harvest efficiency. During the evaluation experiments, the co-robots improved the mean harvesting efficiency by around 10% and reduced the mean non-productive time by 60% when the robot-to-picker ratio was 1:3. The concepts developed in this work can be applied to robotic harvest-aids for other manually harvested crops that involve walking for crop transportation.
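A simplified stand-in for the dispatching decision is sketched below: sample each picker's predicted tray-full time and send the robot where the expected waiting (non-picking) time would otherwise be largest. The Gaussian predictions and travel times are illustrative assumptions; the paper's scheduler is richer than this greedy rule.

```python
import numpy as np

def dispatch(pred_full_times, travel_times, now=0.0, n_draws=2000):
    """Greedy illustration of predictive stochastic dispatching: for each picker,
    sample the tray-full time from a predictive (mean, std) distribution and
    estimate the expected wait if the robot were sent there now."""
    rng = np.random.default_rng(1)
    expected_wait = []
    for (mu, sd), travel in zip(pred_full_times, travel_times):
        full = rng.normal(mu, sd, n_draws)
        wait = np.maximum(0.0, (now + travel) - full)   # picker idle until robot arrives
        expected_wait.append(wait.mean())
    return int(np.argmax(expected_wait)), expected_wait

picker = dispatch(pred_full_times=[(60, 10), (90, 20), (45, 5)],
                  travel_times=[30, 20, 50])
print("dispatch robot to picker:", picker[0], "expected waits:", picker[1])
```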
The objective of this work is to design a mechatronic bipedal robot with mobility in 3D environments. The designed robot has a total of six actuated degrees of freedom (DoF): each leg has two DoF located at the hip, one for abduction/adduction and another for thigh flexion/extension, and a third DoF at the knee for shin flexion/extension. The robot is designed with point-feet legs to achieve dynamic underactuated walking. Each actuator in the robot includes a DC gear motor, an encoder for position measurement, a flexible joint to form a series flexible actuator, and a feedback controller to ensure trajectory tracking. To reduce the total mass of the robot, the shin is designed using topology optimization. The resulting design is fabricated using 3D-printed parts, which allows a prototype of the robot to be built to validate the selection of actuators. Preliminary experiments confirm the robot's ability to maintain a standing position and let us outline future work on dynamic control and trajectory generation for periodic stable walking.
We study constrained reinforcement learning (CRL) from a novel perspective by setting constraints directly on state density functions, rather than on the value functions considered by previous works. State density has a clear physical and mathematical interpretation and can express a wide variety of constraints, such as resource limits and safety requirements. Density constraints can also avoid the time-consuming process of designing and tuning the cost functions required by value-function-based constraints to encode system specifications. We leverage the duality between density functions and Q functions to develop an effective algorithm that solves the density-constrained RL problem optimally while guaranteeing that the constraints are satisfied. We prove that the proposed algorithm converges to a near-optimal solution with a bounded error even when the policy update is imperfect. We use a set of comprehensive experiments to demonstrate the advantages of our approach over state-of-the-art CRL methods, on a wide range of density-constrained tasks as well as standard CRL benchmarks such as Safety-Gym.
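A minimal sketch of the dual/primal alternation suggested by such a duality follows: states whose visitation density exceeds its limit accumulate a Lagrange multiplier that is subtracted from the reward, and the policy is re-optimized against the shaped reward. The density estimator below is a placeholder, and this is not the paper's algorithm or its guarantees.

```python
import numpy as np

n_states = 5
rho_max = np.array([0.4, 0.4, 0.1, 0.4, 0.4])   # per-state density limits
lam = np.zeros(n_states)                         # Lagrange multipliers
eta = 1.0                                        # dual step size

def estimate_density(lam):
    """Placeholder: in practice, roll out the policy optimized under the shaped
    reward r(s, a) - lam[s] and estimate its stationary state density."""
    rho = np.ones(n_states) / n_states
    rho[2] = max(0.05, 0.3 - 0.2 * lam[2])       # pressure on state 2 lowers its density
    return rho / rho.sum()

for it in range(200):
    rho = estimate_density(lam)
    lam = np.maximum(0.0, lam + eta * (rho - rho_max))   # projected dual ascent
print("final density:", np.round(estimate_density(lam), 3))
print("multipliers:  ", np.round(lam, 3))
```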
Machine learning algorithms have found several applications in the field of robotics and control systems. The control systems community has started to show interest in several machine learning algorithms from sub-domains such as supervised learning, imitation learning, and reinforcement learning to achieve autonomous control and intelligent decision making. Among many complex control problems, stable bipedal walking has been one of the most challenging. In this paper, we present an architecture to design and simulate a planar bipedal walking robot (BWR) using a realistic robotics simulator, Gazebo. The robot demonstrates successful walking behaviour by learning through trial and error, without any prior knowledge of itself or the world dynamics. The autonomous walking of the BWR is achieved using a reinforcement learning algorithm called Deep Deterministic Policy Gradient (DDPG), one of the algorithms for learning controls in continuous action spaces. After training the model in simulation, we observed that, with a properly shaped reward function, the robot achieved faster walking and even a running gait, with an average speed of 0.83 m/s. The gait pattern of the bipedal walker was compared with an actual human walking pattern. The results show that the bipedal walking pattern had characteristics similar to those of a human walking pattern.
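For reference, the sketch below shows the core DDPG update (critic TD regression, deterministic policy gradient through the critic, and Polyak-averaged target networks) in PyTorch; the dimensions, network sizes, and hyperparameters are illustrative and are not the configuration used for the Gazebo biped.

```python
import torch
import torch.nn as nn

# Hedged sketch of one DDPG update step; all sizes and hyperparameters are assumptions.
obs_dim, act_dim, gamma, tau = 12, 6, 0.99, 0.005

def mlp(i, o, act=nn.Tanh):
    return nn.Sequential(nn.Linear(i, 256), nn.ReLU(), nn.Linear(256, o), act())

actor,  actor_t  = mlp(obs_dim, act_dim), mlp(obs_dim, act_dim)
critic, critic_t = mlp(obs_dim + act_dim, 1, nn.Identity), mlp(obs_dim + act_dim, 1, nn.Identity)
actor_t.load_state_dict(actor.state_dict()); critic_t.load_state_dict(critic.state_dict())
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-4)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(obs, act, rew, next_obs, done):
    # critic: TD regression toward the target networks' Bellman backup
    with torch.no_grad():
        target_q = rew + gamma * (1 - done) * critic_t(
            torch.cat([next_obs, actor_t(next_obs)], dim=1)).squeeze(-1)
    q = critic(torch.cat([obs, act], dim=1)).squeeze(-1)
    critic_loss = ((q - target_q) ** 2).mean()
    opt_c.zero_grad(); critic_loss.backward(); opt_c.step()

    # actor: deterministic policy gradient through the critic
    actor_loss = -critic(torch.cat([obs, actor(obs)], dim=1)).mean()
    opt_a.zero_grad(); actor_loss.backward(); opt_a.step()

    # Polyak-averaged target networks
    for net, net_t in ((actor, actor_t), (critic, critic_t)):
        for p, p_t in zip(net.parameters(), net_t.parameters()):
            p_t.data.mul_(1 - tau).add_(tau * p.data)

# one update on a random batch standing in for replay-buffer samples
B = 64
ddpg_update(torch.randn(B, obs_dim), torch.rand(B, act_dim) * 2 - 1,
            torch.randn(B), torch.randn(B, obs_dim), torch.zeros(B))
```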