Learning-based control approaches have shown great promise in performing complex tasks directly from high-dimensional perception data for real robotic systems. Nonetheless, the learned controllers can behave unexpectedly if the trajectories of the system deviate from the training data distribution, which can compromise safety. In this work, we propose a control filter that wraps any reference policy and effectively encourages the system to stay in-distribution with respect to offline-collected safe demonstrations. Our methodology is inspired by Control Barrier Functions (CBFs), model-based tools from the nonlinear control literature that can be used to construct minimally invasive safe policy filters. While existing CBF-based methods require a known low-dimensional state representation, our approach is directly applicable to systems that rely solely on high-dimensional visual observations, as it learns in a latent state space. We demonstrate that our method is effective for two different visuomotor control tasks in simulation environments, covering both top-down and egocentric view settings.
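The abstract does not include code, but the CBF machinery it builds on is concrete. Below is a minimal sketch of the classical state-space CBF safety filter that inspires the approach (the paper itself learns the barrier in a latent space from visual observations); the dynamics, the barrier function, and all names are illustrative assumptions.

```python
import numpy as np

def cbf_filter(u_ref, a, b):
    """Minimally invasive filter: project u_ref onto the half-space {u : a @ u >= b}.

    For control-affine dynamics x' = f(x) + g(x) u and a barrier h(x) >= 0,
    the CBF condition dh/dt >= -alpha * h(x) is linear in u, with
    a = grad_h(x) @ g(x) and b = -alpha * h(x) - grad_h(x) @ f(x).
    """
    slack = a @ u_ref - b
    if slack >= 0.0:                     # reference control is already safe
        return u_ref
    return u_ref - slack * a / (a @ a)   # closed-form solution of the QP

# Toy example: single integrator x' = u, keep x inside the unit disk.
# h(x) = 1 - ||x||^2 so grad_h = -2x, giving a = -2x and b = -alpha * h(x).
alpha = 1.0
x = np.array([0.9, 0.0])
u_ref = np.array([1.0, 0.0])             # reference pushes toward the boundary
a, b = -2.0 * x, -alpha * (1.0 - x @ x)
print(cbf_filter(u_ref, a, b))           # filtered control backs off: ~[0.106, 0.]
```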
Although federated learning has made impressive advances, most studies assume that clients' data are fully labeled. In real-world scenarios, however, every client may hold a significant amount of unlabeled instances. Among the various approaches to utilizing unlabeled data, federated active learning (FAL) has emerged as a promising solution. In the decentralized setting, two types of query selector models are available, namely the 'global' and 'local-only' models, but little literature discusses which performs better and why. In this work, we first demonstrate that the relative superiority of the two selector models depends on the global and local inter-class diversity. Furthermore, we observe that the global and local-only models are each key to resolving the imbalance on their respective side. Based on these findings, we propose LoGo, a FAL sampling strategy that is robust to varying local heterogeneity levels and global imbalance ratios and integrates both models through a two-step active selection scheme. LoGo consistently outperforms six active learning strategies across a total of 38 experimental settings.
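The abstract only names the two-step selection scheme, so the following is a hypothetical sketch of how a global and a local-only selector might be combined: one model shortlists informative candidates, the other makes the final pick. The entropy criterion and the ordering of the two steps are assumptions for illustration, not LoGo's exact procedure.

```python
import torch

def entropy(model, x):
    """Predictive entropy as a simple informativeness score."""
    with torch.no_grad():
        p = torch.softmax(model(x), dim=1)
    return -(p * p.clamp_min(1e-12).log()).sum(dim=1)

def two_step_query(global_model, local_model, pool_x, n_query, n_candidates):
    """Step 1: the global model shortlists candidates from the unlabeled pool.
    Step 2: the local-only model makes the final selection among them."""
    cand = entropy(global_model, pool_x).topk(n_candidates).indices
    final = entropy(local_model, pool_x[cand]).topk(n_query).indices
    return cand[final]

# Toy usage: two linear classifiers over 20-dim features, 1000-sample pool.
gm, lm = torch.nn.Linear(20, 5), torch.nn.Linear(20, 5)
pool = torch.randn(1000, 20)
print(two_step_query(gm, lm, pool, n_query=10, n_candidates=100))
```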
Data assimilation provides algorithms for widespread applications in various fields. It is of practical use for handling the large amounts of information in complex systems that are hard to estimate directly. Weather forecasting is one such application, where predictions of meteorological data are corrected using observations. Data assimilation comprises numerous approaches; one specific sequential method is the Kalman Filter, whose core idea is to estimate unknown information by combining newly measured data with previously predicted data. Several improved variants of the Kalman Filter exist. In this project, the Ensemble Kalman Filter with perturbed observations is considered, implemented via Monte Carlo simulation: an ensemble of states enters the calculation in place of a single state vector, and perturbed measurements are used as the observations. Compared with the Linear Kalman Filter, these changes remove the restriction to linear systems and reduce computation time. This thesis develops the Ensemble Kalman Filter with perturbed observations step by step. After mathematical preliminaries, including an introduction to dynamical systems, the Linear Kalman Filter is built and its prediction and analysis processes are derived. We then extend by analogy to the nonlinear Ensemble Kalman Filter with perturbed observations. Finally, the classic Lorenz 63 model is illustrated in MATLAB. In this example, we vary the number of ensemble members and investigate the relationship between ensemble size and the error of variance. We reach the conclusion that, up to a limited scale, a larger number of ensemble members yields a smaller prediction error.
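The thesis implements the filter in MATLAB; as a hedged illustration, here is a compact Python version of the EnKF analysis step with perturbed observations applied to the Lorenz 63 model described above. The Euler integrator, step sizes, and noise levels are assumptions made for the sketch.

```python
import numpy as np

def lorenz63(x, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One explicit-Euler step of the Lorenz 63 system."""
    dx = np.array([sigma * (x[1] - x[0]),
                   x[0] * (rho - x[2]) - x[1],
                   x[0] * x[1] - beta * x[2]])
    return x + dt * dx

def enkf_step(ensemble, y, H, R, rng):
    """EnKF analysis with perturbed observations.

    ensemble: (N, n) forecast ensemble; y: (m,) observation;
    H: (m, n) observation operator; R: (m, m) observation noise covariance.
    """
    N = ensemble.shape[0]
    X = ensemble - ensemble.mean(axis=0)            # ensemble anomalies
    P = X.T @ X / (N - 1)                           # ensemble covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)    # Kalman gain
    # each member assimilates a perturbed copy of the observation
    Y = y + rng.multivariate_normal(np.zeros(len(y)), R, size=N)
    return ensemble + (Y - ensemble @ H.T) @ K.T

rng = np.random.default_rng(0)
truth = np.array([1.0, 1.0, 1.0])
ens = truth + rng.normal(0, 1, size=(50, 3))        # 50 ensemble members
H, R = np.eye(3)[:1], np.eye(1) * 0.5               # observe x-component only
for _ in range(200):
    truth = lorenz63(truth)
    ens = np.array([lorenz63(m) for m in ens])      # forecast step
    y = H @ truth + rng.normal(0, np.sqrt(0.5), 1)
    ens = enkf_step(ens, y, H, R, rng)              # analysis step
print("mean error:", np.linalg.norm(ens.mean(axis=0) - truth))
```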
This paper develops a Distributed Differentiable Dynamic Game (D3G) framework that enables learning multi-robot coordination from demonstrations. We represent multi-robot coordination as a dynamic game, in which a robot's behavior is dictated by its own dynamics and an objective that also depends on the behavior of others. Coordination can thus be adapted by tuning the objective and dynamics of each robot. D3G enables each robot to automatically tune its individual dynamics and objective in a distributed manner by minimizing the mismatch between its trajectory and the demonstrations. This learning framework features a new design, including a forward pass, in which all robots collaboratively seek a Nash equilibrium of the game, and a backward pass, in which gradients are propagated via the communication graph. We test D3G in simulation with two types of robots given different task configurations. The results validate the capability of D3G to learn multi-robot coordination from demonstrations.
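As a rough, centralized stand-in for D3G's two passes, the sketch below uses a one-shot quadratic game: the forward pass finds a Nash equilibrium via iterated best responses, and the backward pass tunes each robot's objective weight by finite differences to match a demonstration. The game, the weights, and the finite-difference gradients are illustrative simplifications of the paper's trajectory-level, communication-graph-based design.

```python
import numpy as np

def best_response(theta_i, u_j, goal_i, w_sep=0.5):
    """Robot i's closed-form best response in a toy one-shot quadratic game:
    minimize theta_i * ||u_i - goal_i||^2 + w_sep * ||u_i - u_j||^2."""
    return (theta_i * goal_i + w_sep * u_j) / (theta_i + w_sep)

def forward_nash(thetas, goals, iters=50):
    """Forward pass: iterated best responses converge to the Nash equilibrium."""
    u = [g.copy() for g in goals]
    for _ in range(iters):
        u = [best_response(thetas[i], u[1 - i], goals[i]) for i in (0, 1)]
    return u

def learn_from_demo(demo, goals, lr=0.5, steps=100, eps=1e-4):
    """Backward pass (finite differences here): each robot tunes its own
    objective weight to minimize the mismatch with its demonstrated action."""
    thetas = np.array([1.0, 1.0])
    for _ in range(steps):
        for i in (0, 1):
            def loss(th):
                t = thetas.copy()
                t[i] = th
                return np.sum((forward_nash(t, goals)[i] - demo[i]) ** 2)
            grad = (loss(thetas[i] + eps) - loss(thetas[i] - eps)) / (2 * eps)
            thetas[i] = np.clip(thetas[i] - lr * grad, 0.1, 10.0)
    return thetas

goals = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]    # individual goals
demo = [np.array([0.8, 0.1]), np.array([0.1, 0.8])]     # demonstrated actions
print(learn_from_demo(demo, goals))                      # tuned objective weights
```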
With the rapid progress of machine learning, reinforcement learning (RL) has been used to automate human tasks in different areas. However, training such agents is difficult and restricted to expert users. Moreover, it is mostly limited to simulation environments due to the high cost and safety concerns of interactions in the real world. Demonstration learning is a paradigm in which an agent learns to perform a task by imitating the behavior of an expert shown in demonstrations. It is a relatively recent area in machine learning, but it is gaining significant traction due to its tremendous potential for learning complex behaviors from demonstrations. Learning from demonstration accelerates the learning process by improving sample efficiency, while also reducing the effort of the programmer. Because it can learn without interacting with the environment, demonstration learning could enable the automation of a wide range of real-world applications such as robotics and healthcare. This paper provides a survey of demonstration learning, where we formally introduce the demonstration problem along with its main challenges and give a comprehensive overview of the process of learning from demonstrations, from the creation of the demonstration dataset, through methods for learning from demonstrations, to optimization by combining demonstration learning with other machine learning methods. We also review the existing benchmarks and identify their strengths and limitations. Additionally, we discuss the advantages and disadvantages of the paradigm as well as its main applications. Lastly, we discuss our perspective on open problems and research directions for this rapidly growing field.
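For concreteness, the simplest instantiation of the paradigm surveyed here is behavior cloning: supervised regression from demonstrated states to expert actions, with no environment interaction. The sketch below is generic and assumes a synthetic demonstration set; every dimension and hyperparameter is an illustrative choice.

```python
import torch
import torch.nn as nn

# Hypothetical demonstration set: 512 (state, expert_action) pairs.
states = torch.randn(512, 8)
actions = torch.randn(512, 2)

policy = nn.Sequential(nn.Linear(8, 64), nn.Tanh(), nn.Linear(64, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for epoch in range(200):      # supervised regression onto the expert actions
    loss = nn.functional.mse_loss(policy(states), actions)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(float(loss))            # final training loss on the demonstrations
```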
It is well known that zero-shot learning (ZSL) can suffer severely from the problem of domain shift, where the true and learned data distributions for the unseen classes do not match. Although transductive ZSL (TZSL) attempts to mitigate this by allowing the use of unlabelled examples from the unseen classes, a high level of distribution shift remains. We propose a novel TZSL model, named Bi-VAEGAN, which largely reduces the shift through a strengthened distribution alignment between the visual and auxiliary spaces. Its key design elements include (1) a bi-directional distribution alignment, (2) a simple but effective L_2-norm based feature normalization approach, and (3) a more sophisticated unseen-class prior estimation approach. In benchmark evaluations on four datasets, Bi-VAEGAN establishes a new state of the art under both the standard and generalized TZSL settings. Code can be found at //github.com/Zhicaiwww/Bi-VAEGAN.
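Of the three design elements, the L_2-norm feature normalization is simple enough to sketch directly; the snippet below shows the generic operation (where exactly it is applied inside the Bi-VAEGAN pipeline is not specified by the abstract, so the feature shapes are assumptions).

```python
import torch

def l2_normalize(features, eps=1e-8):
    """Row-wise L2 normalization: project each feature onto the unit hypersphere."""
    return features / features.norm(dim=1, keepdim=True).clamp_min(eps)

x = torch.randn(4, 2048)                 # e.g., visual features from a CNN backbone
print(l2_normalize(x).norm(dim=1))       # all norms ~= 1.0
```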
This paper is dedicated to achieving scalable relative state estimation using inter-robot Euclidean distance measurements. We consider equipping robots with distance sensors and focus on the optimization problem underlying relative state estimation in this setup. We reveal the commonality between this problem and the coordinate realization problem of a sensor network. Based on this insight, we propose an effective unconstrained optimization model to infer the relative states among robots. To solve this model in a distributed manner, we propose an efficient and scalable optimization algorithm built on the classical block coordinate descent method. The algorithm solves each block-update subproblem exactly with a closed-form solution while ensuring convergence. Our results pave the way for distance-measurement-based relative state estimation in large-scale multi-robot systems.
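A toy version of the distance-based formulation can illustrate the block coordinate descent structure: each robot's position is one block, updated in closed form while the others are held fixed. The update below is a SMACOF-style majorization step, used here as a hedged stand-in for the paper's exact block solve; the stress objective and all names are illustrative.

```python
import numpy as np

def bcd_localize(edges, dists, n, dim=2, iters=500, seed=0):
    """Infer relative positions from pairwise distances by block coordinate
    descent on the stress sum_(i,j) (||x_i - x_j|| - d_ij)^2. Each block
    (one robot's position) gets a closed-form SMACOF-style majorization update."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, dim))
    nbrs = {i: [] for i in range(n)}
    for (i, j), d in zip(edges, dists):
        nbrs[i].append((j, d))
        nbrs[j].append((i, d))
    for _ in range(iters):
        for i in range(n):                          # one block per robot
            acc = np.zeros(dim)
            for j, d in nbrs[i]:
                diff = x[i] - x[j]
                acc += x[j] + d * diff / max(np.linalg.norm(diff), 1e-9)
            x[i] = acc / len(nbrs[i])
    return x

# Toy: 4 robots on a unit square, fully connected distance graph.
truth = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
edges = [(i, j) for i in range(4) for j in range(i + 1, 4)]
dists = [np.linalg.norm(truth[i] - truth[j]) for i, j in edges]
print(bcd_localize(edges, dists, n=4))  # square up to rotation/translation/reflection
```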
Safety is critical in robotic tasks, and energy-function-based methods have been introduced to address the problem. To ensure safety in the presence of control limits, we need to design an energy function that yields persistently feasible safe control at all system states. However, designing such an energy function for high-dimensional nonlinear systems remains challenging. Observing that high-dimensional systems contain dynamics that are redundant with respect to the safety specifications, this paper proposes a novel approach called abstract safe control. We propose a system abstraction method that enables the design of energy functions on a low-dimensional model; the energy function can then be synthesized with respect to this low-dimensional model to ensure persistent feasibility. The resulting safe controller can be directly transferred to other systems with the same abstraction, e.g., when a robot arm holds different tools. The proposed approach is demonstrated on a 7-DoF robot arm (14 states) both in simulation and in the real world. Our method always finds feasible control and achieves zero safety violations across 500 trials on 5 different systems.
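To make the abstraction idea concrete, the toy sketch below defines an energy (safety index) function on a planar arm's end-effector position, the low-dimensional abstract model, and maps the resulting constraint back to joint velocities through the Jacobian. The two-link arm, the wall constraint, and the velocity-level filter are illustrative assumptions, not the paper's synthesis procedure.

```python
import numpy as np

def fk(q, l=(1.0, 1.0)):
    """End-effector position of a planar 2-link arm (the low-dim abstraction)."""
    return np.array([l[0] * np.cos(q[0]) + l[1] * np.cos(q[0] + q[1]),
                     l[0] * np.sin(q[0]) + l[1] * np.sin(q[0] + q[1])])

def jacobian(q, l=(1.0, 1.0)):
    s1, s12 = np.sin(q[0]), np.sin(q[0] + q[1])
    c1, c12 = np.cos(q[0]), np.cos(q[0] + q[1])
    return np.array([[-l[0] * s1 - l[1] * s12, -l[1] * s12],
                     [ l[0] * c1 + l[1] * c12,  l[1] * c12]])

def abstract_safe_control(q, dq_ref, wall_x=2.0, eta=0.5, margin=0.1):
    """Filter joint velocities with an energy function phi = p_x - wall_x
    defined on the abstract model p = fk(q). Near the boundary we enforce
    phi_dot <= -eta, a constraint linear in dq through the Jacobian."""
    phi = fk(q)[0] - wall_x
    if phi < -margin:
        return dq_ref                      # far from the boundary: no filtering
    a = jacobian(q)[0]                     # phi_dot = a @ dq
    if a @ dq_ref <= -eta:
        return dq_ref
    return dq_ref - (a @ dq_ref + eta) * a / (a @ a)

q = np.array([0.3, -0.2])                  # end-effector close to the wall
dq = abstract_safe_control(q, dq_ref=np.array([0.5, 0.5]))
print(fk(q)[0], dq)                        # filtered velocity retreats from wall
```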
Flocking motion control is concerned with managing the possible conflicts between the local and team objectives of multi-agent systems. The overall control process guides the agents while monitoring flock cohesiveness and localization. The underlying mechanisms may degrade when unmodeled uncertainties in the flock dynamics and formation are overlooked. Moreover, the efficiency of a control design depends on how quickly it can adapt to different dynamic situations in real time. An online model-free policy iteration mechanism is developed here to guide a flock of agents to follow an independent command generator over a time-varying graph topology. The strength of connectivity between any two agents, i.e., the graph edge weight, is decided by a position-adjacency-dependent function. An online recursive least squares approach is adopted to tune the guidance strategies without knowing the dynamics of the agents or those of the command generator. The mechanism is compared with a reinforcement learning approach from the literature based on a value iteration technique. The simulation results of the policy iteration mechanism reveal fast learning and convergence behaviors with less computational effort.
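The recursive least squares estimator at the core of the tuning mechanism is standard; a generic sketch follows, with the flocking-specific features and targets replaced by a synthetic linear regression problem (the forgetting factor and dimensions are assumed values, not the paper's).

```python
import numpy as np

class RLS:
    """Recursive least squares: online fit of y ~ w @ phi with forgetting
    factor lam, the kind of estimator used to tune strategy weights online
    without a model of the agents."""
    def __init__(self, dim, lam=0.99, p0=1e3):
        self.w = np.zeros(dim)
        self.P = np.eye(dim) * p0
        self.lam = lam

    def update(self, phi, y):
        Pphi = self.P @ phi
        k = Pphi / (self.lam + phi @ Pphi)          # gain vector
        self.w += k * (y - self.w @ phi)            # innovation correction
        self.P = (self.P - np.outer(k, Pphi)) / self.lam

rng = np.random.default_rng(1)
w_true = np.array([2.0, -1.0, 0.5])
est = RLS(dim=3)
for _ in range(300):                                # streaming samples
    phi = rng.normal(size=3)
    est.update(phi, w_true @ phi + 0.01 * rng.normal())
print(est.w)                                        # ~= w_true
```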
Deep learning-based recommender systems have achieved remarkable success in recent years. However, these methods usually heavily rely on labeled data (i.e., user-item interactions), suffering from problems such as data sparsity and cold-start. Self-supervised learning, an emerging paradigm that extracts information from unlabeled data, provides insights into addressing these problems. Specifically, contrastive self-supervised learning, due to its flexibility and promising performance, has attracted considerable interest and recently become a dominant branch in self-supervised learning-based recommendation methods. In this survey, we provide an up-to-date and comprehensive review of current contrastive self-supervised learning-based recommendation methods. Firstly, we propose a unified framework for these methods. We then introduce a taxonomy based on the key components of the framework, including view generation strategy, contrastive task, and contrastive objective. For each component, we provide detailed descriptions and discussions to guide the choice of the appropriate method. Finally, we outline open issues and promising directions for future research.
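As an example of the "contrastive objective" component of the taxonomy, the snippet below implements InfoNCE, the objective most commonly used by the surveyed methods; the view generation step is replaced here by random tensors, and the batch size, embedding dimension, and temperature are assumed values.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.2):
    """InfoNCE: two views z1[i], z2[i] of the same user/item are a positive
    pair; all other rows in the batch serve as negatives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / tau                        # (B, B) similarity matrix
    labels = torch.arange(z1.size(0))               # positives on the diagonal
    return F.cross_entropy(logits, labels)

z1, z2 = torch.randn(32, 64), torch.randn(32, 64)   # two generated views
print(info_nce(z1, z2))
```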
The light and soft characteristics of Buoyancy Assisted Lightweight Legged Unit (BALLU) robots give them great potential to provide intrinsically safe interaction in environments involving humans, unlike many heavy and rigid robots. However, their unique and sensitive dynamics make it challenging to obtain robust control policies in the real world. In this work, we demonstrate robust sim-to-real transfer of control policies on BALLU robots via system identification and our novel residual physics learning method, Environment Mimic (EnvMimic). First, we model the nonlinear dynamics of the actuators by collecting hardware data and optimizing the simulation parameters. Rather than relying on standard supervised learning formulations, we use deep reinforcement learning to train an external force policy that matches real-world trajectories, which lets us model the residual physics with higher fidelity. We analyze the improved simulation fidelity by comparing simulated trajectories against real-world ones. Finally, we demonstrate that the improved simulator allows us to learn better walking and turning policies that can be successfully deployed on the BALLU hardware.
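A minimal sketch of the residual physics idea, under toy assumptions: a nominal simulator step is augmented by an external force supplied by a learned policy. Here the dynamics are a point mass and the "policy" is a hand-written placeholder compensating an unmodeled drag term; in EnvMimic the force policy is trained with deep RL to match real trajectories.

```python
import numpy as np

def sim_step(x, u, dt=0.01):
    """Nominal (imperfect) simulator: toy point-mass dynamics, state = [pos, vel]."""
    pos, vel = x[:2], x[2:]
    return np.concatenate([pos + dt * vel, vel + dt * u])

def residual_sim_step(x, u, force_policy, dt=0.01):
    """Residual physics: a learned policy injects an external force so that
    simulated rollouts better match real-world trajectories."""
    f = force_policy(x, u)                  # residual force from the policy
    return sim_step(x, u + f, dt)

# Placeholder policy: compensates an unmodeled constant drag on velocity.
force_policy = lambda x, u: -0.3 * x[2:]
x = np.array([0.0, 0.0, 1.0, 0.0])
for _ in range(100):
    x = residual_sim_step(x, u=np.zeros(2), force_policy=force_policy)
print(x)                                    # rollout under the corrected simulator
```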