For humanoids to be deployed in demanding situations, such as search and rescue, highly intelligent decision making and proficient sensorimotor skills are expected. A promising solution is to leverage human prowess by interconnecting robot and human via teleoperation. Towards creating seamless operation, this paper presents a dynamic telelocomotion framework that synchronizes the gait of a human pilot with the walking of a bipedal robot. First, we introduce a method to generate a virtual human walking model from the stepping behavior of a human pilot, which serves as a walking reference for the robot. Second, the dynamics of the walking reference and the robot's walking are synchronized by applying forces to the human pilot and the robot to achieve dynamic similarity between the two systems. This enables the human pilot to continuously perceive and cancel any asynchrony between the walking reference and the robot. A consistent step placement strategy for the robot is derived to maintain dynamic similarity through step transitions. Using our human-machine interface, we demonstrate that the human pilot can achieve stable and synchronous teleoperation of a simulated robot through stepping-in-place, walking, and disturbance rejection experiments. This work provides a fundamental step towards transferring human intelligence and reflexes to humanoid robots.

Related content

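Robots include all machines that imitate human behavior or thought, or that imitate other living creatures (such as robot dogs and robot cats). In the narrow sense, the definition of a robot is subject to many classification schemes and disputes, and some computer programs are even called robots. In contemporary industry, a robot is a man-made machine that can carry out tasks automatically and is used to replace or assist human work; it is generally an electromechanical device controlled by a computer program or electronic circuitry.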

Compared to on-policy policy gradient techniques, off-policy model-free deep reinforcement learning (RL) that uses previously gathered data can improve sampling efficiency. However, off-policy learning becomes challenging when the discrepancy between the distributions of the policy of interest and the policies that collected the data increases. Although the well-studied importance sampling and off-policy policy gradient techniques were proposed to compensate for this discrepancy, they usually require a collection of long trajectories, which increases the computational complexity and induces additional problems such as vanishing or exploding gradients and the discarding of many useful experiences. Moreover, their generalization to continuous action domains is severely limited because they require action probabilities, which makes them unsuitable for deterministic policies. To overcome these limitations, we introduce a novel policy similarity measure to mitigate the effects of such discrepancy. Our method offers an adequate single-step off-policy correction without any probability estimates, and theoretical results show that it can achieve a contraction mapping with a unique fixed point, which allows "safe" off-policy learning. An extensive set of empirical results indicates that our algorithm substantially improves on the state of the art and attains higher returns in fewer steps than the competing methods by efficiently scheduling the learning rate in Q-learning and policy optimization.
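The trajectory-length problem mentioned in this abstract can be illustrated with a minimal sketch. This is not the paper's method (the abstract does not define the proposed similarity measure). It only shows ordinary trajectory-level importance sampling, assuming two hypothetical state-independent Gaussian policies with the same standard deviation. The function names and all parameter values are illustrative assumptions.

```python
import math
import random


def log_importance_weight(actions, target_mean, behavior_mean, std):
    """Log of the product of per-step ratios pi_target(a) / pi_behavior(a).

    Both policies are assumed to be state-independent Gaussians with the
    same standard deviation, so the normalizing constants cancel.
    """
    log_w = 0.0
    for a in actions:
        log_w += ((a - behavior_mean) ** 2 - (a - target_mean) ** 2) / (2.0 * std ** 2)
    return log_w


def weight_spread(horizon, n_trajectories=2000, seed=0):
    """Smallest and largest trajectory weight observed for a given horizon."""
    rng = random.Random(seed)
    weights = []
    for _ in range(n_trajectories):
        # Actions are drawn from the (assumed) behavior policy N(0, 1).
        actions = [rng.gauss(0.0, 1.0) for _ in range(horizon)]
        # The (assumed) target policy is N(0.5, 1).
        weights.append(math.exp(log_importance_weight(actions, 0.5, 0.0, 1.0)))
    return min(weights), max(weights)


if __name__ == "__main__":
    for horizon in (1, 10, 100):
        low, high = weight_spread(horizon)
        print(f"horizon={horizon:3d}  min weight={low:.3e}  max weight={high:.3e}")
```

Because the weight is a product of per-step ratios, its spread grows quickly with the horizon, which is the kind of instability a single-step correction is meant to avoid.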

An intuitive control method for the flying trot, which combines offline trajectory planning with real-time balance control, is presented. The motion features of running animals in the vertical direction were analysed using the spring-loaded inverted pendulum (SLIP) model, and the foot trajectory of the robot was planned according to the given height and speed of the trunk, so that the robot could run similarly to an animal capable of vertical flight. To improve the robustness of running, a posture control method based on foot acceleration adjustment is proposed. Novel kinematics-based CoM observation and CoM regulation methods are presented to enhance the stability of locomotion. To reduce the impact force when the robot interacts with the environment, the virtual model control method is used in the control of the foot trajectory to achieve active compliance. By selecting proper parameters for the virtual model, the oscillation of the virtual model and the planned motion of the support foot are synchronized, which avoids the large disturbance that the oscillation of the virtual model would otherwise impose on the robot motion. Simulations and experiments with the quadruped robot Billy are reported. In the experiments, the robot reached a maximum speed of 4.73 body lengths per second, which verified the feasibility of the control method.

Controller design for bipedal walking on dynamic rigid surfaces (DRSes), which are rigid surfaces moving in the inertial frame (e.g., ships and airplanes), remains largely uninvestigated. This paper introduces a hierarchical control approach that achieves stable underactuated bipedal robot walking on a horizontally oscillating DRS. The highest layer of our approach is a real-time motion planner that generates desired global behaviors (i.e., the center of mass trajectories and footstep locations) by stabilizing a reduced-order robot model. One key novelty of this layer is the derivation of the reduced-order model by analytically extending the angular-momentum-based linear inverted pendulum (ALIP) model from stationary to horizontally moving surfaces. The other novelty is the development of a discrete-time foot-placement controller that exponentially stabilizes the hybrid, linear, time-varying ALIP model. The middle layer of the proposed approach is a walking pattern generator that translates the desired global behaviors into the robot's full-body reference trajectories for all directly actuated degrees of freedom. The lowest layer is an input-output linearizing controller that exponentially tracks those full-body reference trajectories based on the full-order, hybrid, nonlinear robot dynamics. Simulations of planar underactuated bipedal walking on a swaying DRS confirm that the proposed framework ensures walking stability under different DRS motions and gait types.
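For orientation, the stationary-surface ALIP model that this abstract takes as its starting point can be sketched as follows. This is only the commonly used planar form of the model on a fixed surface, with a simple one-step foot-placement rule derived from its closed-form solution. It does not reproduce the paper's moving-surface extension or its controller. The function names, the assumption that angular momentum about the contact point is unchanged by the foot switch, and all numerical values are illustrative assumptions.

```python
import math


def alip_propagate(x0, L0, t, mass, height, g=9.81):
    """Closed-form solution of the stationary-surface ALIP over time t.

    State: x is the horizontal CoM position relative to the stance foot and
    L is the angular momentum about the contact point. The assumed dynamics
    are x' = L / (mass * height) and L' = mass * g * x.
    """
    w = math.sqrt(g / height)
    c, s = math.cosh(w * t), math.sinh(w * t)
    x = c * x0 + s * L0 / (mass * height * w)
    L = mass * height * w * s * x0 + c * L0
    return x, L


def foot_placement(L_touchdown, L_desired, step_time, mass, height, g=9.81):
    """CoM position relative to the new stance foot that yields L_desired
    at the end of the next step (assumes L is unchanged by the foot switch)."""
    w = math.sqrt(g / height)
    return (L_desired - math.cosh(w * step_time) * L_touchdown) / (
        mass * height * w * math.sinh(w * step_time)
    )


if __name__ == "__main__":
    mass, height, step_time = 30.0, 0.8, 0.4   # hypothetical robot parameters
    L_desired = mass * height * 0.5            # roughly 0.5 m/s CoM speed
    x, L = 0.02, 0.0                           # small initial forward lean
    for step in range(5):
        x, L = alip_propagate(x, L, step_time, mass, height)
        print(f"step {step}: end-of-step L = {L:.3f}")
        # Place the next stance foot so that the following step ends at L_desired.
        x = foot_placement(L, L_desired, step_time, mass, height)
```

The placement rule simply inverts the second line of the closed-form solution for the initial CoM offset, so under these idealized assumptions every step after the first ends at the desired angular momentum.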

Numerical optimization has become a popular approach to planning smooth motion trajectories for robots. However, when robots share space with humans, properly balancing safety, comfort, and efficiency remains challenging. This is notably the case because humans adapt their behavior to that of the robot, raising the need for intricate planning and prediction. In this paper, we propose a novel optimization-based motion planning algorithm that generates robot motions while simultaneously maximizing the human trajectory likelihood under a data-driven predictive model. Considering planning and prediction together allows us to formulate objective and constraint functions in the joint human-robot state space. Key to the approach are latent space modifiers added to a differentiable human predictive model based on a dedicated recurrent neural network. These modifiers make it possible to change the human prediction within motion optimization. We empirically evaluate our method using the publicly available MoGaze dataset. Our results indicate that the proposed framework outperforms current baselines for planning handover trajectories and avoiding collisions between a robot and a human. Our experiments demonstrate collaborative motion trajectories in which both the human prediction and the robot plan adapt to each other.

This study presents whole-body model predictive control (MPC) of robotic systems with rigid contacts, under a given contact sequence, using online switching time optimization (STO). We treat robot dynamics with rigid contacts as a switched system and formulate an optimal control problem of switched systems to implement the MPC. We utilize an efficient solution algorithm for the MPC problem that optimizes the switching times and the trajectory simultaneously. Unlike existing, less efficient methods, this algorithm enables online optimization of the switching times as well as the trajectory. The proposed MPC with online STO is compared with the conventional MPC with fixed switching times through numerical simulations of dynamic jumping motions of a quadruped robot. In the simulation comparison, the proposed MPC successfully controls the dynamic jumping motions in twice as many cases as the conventional MPC, which indicates that the proposed method extends the ability of whole-body MPC. We further conduct hardware experiments on the quadrupedal robot Unitree A1 and demonstrate that the proposed method achieves dynamic motions on the real robot.

We propose coordinating guiding vector fields to achieve two tasks simultaneously with a team of robots: first, the guidance and navigation of multiple robots to possibly different paths or surfaces typically embedded in 2D or 3D; second, their motion coordination while tracking their prescribed paths or surfaces. The motion coordination is defined by desired parametric displacements between robots on the path or surface. Such a desired displacement is achieved by controlling the virtual coordinates, which correspond to the path or surface's parameters, between guiding vector fields. Rigorous mathematical guarantees underpinned by dynamical systems theory and Lyapunov theory are provided for the effective distributed motion coordination and navigation of robots on paths or surfaces from all initial positions. As an example for practical robotic applications, we derive a control algorithm from the proposed coordinating guiding vector fields for a Dubins-car-like model with actuation saturation. Our proposed algorithm is distributed and scalable to an arbitrary number of robots. Furthermore, extensive illustrative simulations and fixed-wing aircraft outdoor experiments validate the effectiveness and robustness of our algorithm.
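As background for this abstract, a basic single-robot guiding vector field can be sketched as below. It is the standard construction for following a planar path given as a level set, here a circle, and it does not include the virtual coordinates or the multi-robot coordination that the paper proposes. The single-integrator robot model, the function names, and the gain and step values are illustrative assumptions.

```python
import math


def circle_guiding_field(x, y, radius=1.0, gain=1.0):
    """Unit guiding vector for the circle phi(x, y) = x^2 + y^2 - radius^2 = 0.

    The field adds a tangential term (the gradient rotated by 90 degrees)
    to a convergence term (-gain * phi * gradient).
    """
    phi = x * x + y * y - radius * radius
    gx, gy = 2.0 * x, 2.0 * y           # gradient of phi
    vx = -gy - gain * phi * gx          # rotated gradient is (-gy, gx)
    vy = gx - gain * phi * gy
    norm = math.hypot(vx, vy)
    if norm < 1e-12:                    # the field vanishes at the circle's center
        return 0.0, 0.0
    return vx / norm, vy / norm


def follow(x, y, steps=2000, dt=0.01, speed=1.0):
    """Integrate a single-integrator robot along the field with Euler steps."""
    for _ in range(steps):
        vx, vy = circle_guiding_field(x, y)
        x += speed * dt * vx
        y += speed * dt * vy
    return x, y


if __name__ == "__main__":
    for start in ((2.0, 0.0), (0.2, 0.1)):
        x, y = follow(*start)
        print(f"start={start}  final distance from center={math.hypot(x, y):.3f}")
```

The tangential term moves the robot along the path and the convergence term pulls it onto the path, so trajectories starting inside or outside the circle end up circulating on it.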

Robotic manipulation remains a largely unsolved problem despite significant advances in robotics and machine learning in recent years. One of the key challenges in manipulation is the exploration of the dynamics of the environment when there is continuous contact between the objects being manipulated. This paper proposes a model-based active exploration approach that enables efficient learning in sparse-reward robotic manipulation tasks. The proposed method estimates an information gain objective using an ensemble of probabilistic models and deploys model predictive control (MPC) to plan actions online that maximize the expected reward while also performing directed exploration. We evaluate our proposed algorithm in simulation and on a real robot, trained from scratch with our method, on a challenging ball pushing task on tilted tables, where the target ball position is not known to the agent a priori. Our real-world robot experiment serves as a fundamental application of active exploration in model-based reinforcement learning of complex robotic manipulation tasks.

Sim-to-real is a mainstream method to cope with the large number of trials needed by typical deep reinforcement learning. However, transferring a policy trained in simulation to actual hardware remains challenging due to the reality gap. In particular, the characteristics of actuators in legged robots have a considerable influence on sim-to-real transfer. High-reduction-ratio gears are widely used in actuators, and the reality gap becomes especially pronounced when backdrivability is exploited to control joints compliantly. We propose a new simulation model of gears to address this gap. Additionally, because stable bipedal locomotion is difficult to achieve, typical methods fail to tune the physical parameters of the simulation from the behavior of the transferred policy. Thus, we propose a method for system identification that can utilize failed attempts. The method's effectiveness is verified using a biped robot, the ROBOTIS-OP3, and the sim-to-real transferred policy can stabilize the robot under severe disturbances and walk on uneven surfaces without force and torque sensors.

Understanding the impact of the most effective policies or treatments on a response variable of interest is desirable in much empirical work in economics, statistics, and other disciplines. Due to the widespread winner's curse phenomenon, conventional statistical inference that assumes the top policies are chosen independently of the random sample may lead to overly optimistic evaluations of the best policies. In recent years, given the increased availability of large datasets, this issue can be further complicated when researchers include many covariates to estimate the policy or treatment effects in an attempt to control for potential confounders. In this manuscript, to address these issues simultaneously, we propose a resampling-based procedure that not only lifts the winner's curse in evaluating the best policies observed in a random sample, but is also robust to the presence of many covariates. The proposed inference procedure yields accurate point estimates and valid frequentist confidence intervals that achieve the exact nominal level as the sample size goes to infinity for multiple best policy effect sizes. We illustrate the finite-sample performance of our approach through Monte Carlo experiments and two empirical studies, evaluating the most effective policies in charitable giving and the most beneficial group of workers in the National Supported Work program.
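The winner's curse described in this abstract can be made concrete with a small Monte Carlo sketch. It is not the paper's resampling procedure. It only shows the selection bias of the naive estimate, under the assumption of several hypothetical policies whose true effects are all exactly zero and whose outcomes are standard normal. The function names and sample sizes are illustrative assumptions.

```python
import random
import statistics


def naive_best_estimate(n_policies, n_samples, rng):
    """Sample mean of the policy that looks best in one simulated study.

    Every policy has a true effect of exactly zero, so any positive value
    returned here is selection bias, not a real effect.
    """
    sample_means = []
    for _ in range(n_policies):
        outcomes = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        sample_means.append(statistics.fmean(outcomes))
    return max(sample_means)


def average_selection_bias(n_policies=10, n_samples=50, n_studies=1000, seed=0):
    """Average naive estimate of the 'winner' across repeated studies."""
    rng = random.Random(seed)
    return statistics.fmean(
        naive_best_estimate(n_policies, n_samples, rng) for _ in range(n_studies)
    )


if __name__ == "__main__":
    print("one policy, no selection:", round(average_selection_bias(n_policies=1), 3))
    print("best of ten policies:    ", round(average_selection_bias(n_policies=10), 3))
```

With a single policy the average estimate stays near the true value of zero, while reporting the best of ten inflates it, even though no policy has any effect.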

This paper focuses on the expected difference in a borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook confounding effects, and hence the estimation error can be substantial. We therefore propose another approach to constructing the estimators so that the error is greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of the classical estimators and the proposed estimators to estimate the causal quantities. The comparison is carried out across a wide range of models, including linear regression models, tree-based models, and neural network-based models, on simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction in estimation error is striking when the causal effects are accounted for correctly.
