亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Learning a single universal policy that can perform a diverse set of manipulation tasks is a promising new direction in robotics. However, existing techniques are limited to learning policies that can only perform tasks that are encountered during training, and require a large number of demonstrations to learn new tasks. Humans, on the other hand, often can learn a new task from a single unannotated demonstration. In this work, we propose the Invariance-Matching One-shot Policy Learning (IMOP) algorithm. In contrast to the standard practice of learning the end-effector's pose directly, IMOP first learns invariant regions of the state space for a given task, and then computes the end-effector's pose through matching the invariant regions between demonstrations and test scenes. Trained on the 18 RLBench tasks, IMOP achieves a success rate that outperforms the state-of-the-art consistently, by 4.5% on average over the 18 tasks. More importantly, IMOP can learn a novel task from a single unannotated demonstration, and without any fine-tuning, and achieves an average success rate improvement of $11.5\%$ over the state-of-the-art on 22 novel tasks selected across nine categories. IMOP can also generalize to new shapes and learn to manipulate objects that are different from those in the demonstration. Further, IMOP can perform one-shot sim-to-real transfer using a single real-robot demonstration.

相關內容

To efficiently find an optimum parameter combination in a large-scale problem, it is a key to convert the parameters into available variables in actual machines. Specifically, quadratic unconstrained binary optimization problems are solved with the help of machine learning, e.g., factorization machines with annealing, which convert a raw parameter to binary variables. This work investigates the dependence of the convergence speed and the accuracy on binary labeling method, which can influence the cost function shape and thus the probability of being captured at a local minimum solution. By exemplifying traveling salesman problem, we propose and evaluate Gray labeling, which correlates the Hamming distance in binary labels with the traveling distance. Through numerical simulation of traveling salesman problem up to 15 cities at a limited number of iterations, the Gray labeling shows less local minima percentages and shorter traveling distances compared with natural labeling.

This paper presents a framework that enables robots to automatically recover from assumption violations of high-level specifications during task execution. In contrast to previous methods relying on user intervention to impose additional assumptions for failure recovery, our approach leverages synthesis-based repair to suggest new robot skills that, when implemented, repair the task. Our approach detects violations of environment safety assumptions during the task execution, relaxes the assumptions to admit observed environment behaviors, and acquires new robot skills for task completion. We demonstrate our approach with a Hello Robot Stretch in a factory-like scenario.

With the aim of further enabling the exploitation of intentional impacts in robotic manipulation, a control framework is presented that directly tackles the challenges posed by tracking control of robotic manipulators that are tasked to perform nominally simultaneous impacts. This framework is an extension of the reference spreading control framework, in which overlapping ante- and post-impact references that are consistent with impact dynamics are defined. In this work, such a reference is constructed starting from a teleoperation-based approach. By using the corresponding ante- and post-impact control modes in the scope of a quadratic programming control approach, peaking of the velocity error and control inputs due to impacts is avoided while maintaining high tracking performance. With the inclusion of a novel interim mode, we aim to also avoid input peaks and steps when uncertainty in the environment causes a series of unplanned single impacts to occur rather than the planned simultaneous impact. This work in particular presents for the first time an experimental evaluation of reference spreading control on a robotic setup, showcasing its robustness against uncertainty in the environment compared to three baseline control approaches.

Recently, it has been shown that transformers pre-trained on diverse datasets with multi-episode contexts can generalize to new reinforcement learning tasks in-context. A key limitation of previously proposed models is their reliance on a predefined action space size and structure. The introduction of a new action space often requires data re-collection and model re-training, which can be costly for some applications. In our work, we show that it is possible to mitigate this issue by proposing the Headless-AD model that, despite being trained only once, is capable of generalizing to discrete action spaces of variable size, semantic content and order. By experimenting with Bernoulli and contextual bandits, as well as a gridworld environment, we show that Headless-AD exhibits significant capability to generalize to action spaces it has never encountered, even outperforming specialized models trained for a specific set of actions on several environment configurations. Implementation is available at: //github.com/corl-team/headless-ad.

This paper considers the collaborative graph exploration problem in GPS-denied environments, where a group of robots are required to cover a graph environment while maintaining reliable pose estimations in collaborative simultaneous localization and mapping (SLAM). Considering both objectives presents challenges for multi-robot pathfinding, as it involves the expensive covariance inference for SLAM uncertainty evaluation, especially considering various combinations of robots' paths. To reduce the computational complexity, we propose an efficient two-stage strategy where exploration paths are first generated for quick coverage, and then enhanced by adding informative and distance-efficient loop-closing actions, called loop edges, along the paths for reliable pose estimation. We formulate the latter problem as a non-monotone submodular maximization problem by relating SLAM uncertainty with pose graph topology, which (1) facilitates more efficient evaluation of SLAM uncertainty than covariance inference, and (2) allows the application of approximation algorithms in submodular optimization to provide optimality guarantees. We further introduce the ordering heuristics to improve objective values while preserving the optimality bound. Simulation experiments over randomly generated graph environments verify the efficiency of our methods in finding paths for quick coverage and enhanced pose graph reliability, and benchmark the performance of the approximation algorithms and the greedy-based algorithm in the loop edge selection problem. Our implementations will be open-source at //github.com/bairuofei/CGE.

Collaborative filtering (CF) is an essential technique in recommender systems that provides personalized recommendations by only leveraging user-item interactions. However, most CF methods represent users and items as fixed points in the latent space, lacking the ability to capture uncertainty. While probabilistic embedding is proposed to intergrate uncertainty, they suffer from several limitations when introduced to graph-based recommender systems. Graph convolutional network framework would confuse the semantic of uncertainty in the nodes, and similarity measured by Kullback-Leibler (KL) divergence suffers from degradation problem and demands an exponential number of samples. To address these challenges, we propose a novel approach, called the Wasserstein dependent Graph Attention network (W-GAT), for collaborative filtering with uncertainty. We utilize graph attention network and Wasserstein distance to learn Gaussian embedding for each user and item. Additionally, our method incorporates Wasserstein-dependent mutual information further to increase the similarity between positive pairs. Experimental results on three benchmark datasets show the superiority of W-GAT compared to several representative baselines. Extensive experimental analysis validates the effectiveness of W-GAT in capturing uncertainty by modeling the range of user preferences and categories associated with items.

Measuring an overall autonomy score for a robotic system requires the combination of a set of relevant aspects and features of the system that might be measured in different units, qualitative, and/or discordant. In this paper, we build upon an existing non-contextual autonomy framework that measures and combines the Autonomy Level and the Component Performance of a system as overall autonomy score. We examine several methods of combining features, showing how some methods find different rankings of the same data, and we employ the weighted product method to resolve this issue. Furthermore, we introduce the non-contextual autonomy coordinate and represent the overall autonomy of a system with an autonomy distance. We apply our method to a set of seven Unmanned Aerial Systems (UAS) and obtain their absolute autonomy score as well as their relative score with respect to the best system.

Modern high-stakes systems, such as healthcare or robotics, often generate vast streaming event sequences. Our goal is to design an efficient, plug-and-play tool to elicit logic tree-based explanations from Large Language Models (LLMs) to provide customized insights into each observed event sequence. Built on the temporal point process model for events, our method employs the likelihood function as a score to evaluate generated logic trees. We propose an amortized Expectation-Maximization (EM) learning framework and treat the logic tree as latent variables. In the E-step, we evaluate the posterior distribution over the latent logic trees using an LLM prior and the likelihood of the observed event sequences. LLM provides a high-quality prior for the latent logic trees, however, since the posterior is built over a discrete combinatorial space, we cannot get the closed-form solution. We propose to generate logic tree samples from the posterior using a learnable GFlowNet, which is a diversity-seeking generator for structured discrete variables. The M-step employs the generated logic rules to approximate marginalization over the posterior, facilitating the learning of model parameters and refining the tunable LLM prior parameters. In the online setting, our locally built, lightweight model will iteratively extract the most relevant rules from LLMs for each sequence using only a few iterations. Empirical demonstrations showcase the promising performance and adaptability of our framework.

With the rise of powerful pre-trained vision-language models like CLIP, it becomes essential to investigate ways to adapt these models to downstream datasets. A recently proposed method named Context Optimization (CoOp) introduces the concept of prompt learning -- a recent trend in NLP -- to the vision domain for adapting pre-trained vision-language models. Specifically, CoOp turns context words in a prompt into a set of learnable vectors and, with only a few labeled images for learning, can achieve huge improvements over intensively-tuned manual prompts. In our study we identify a critical problem of CoOp: the learned context is not generalizable to wider unseen classes within the same dataset, suggesting that CoOp overfits base classes observed during training. To address the problem, we propose Conditional Context Optimization (CoCoOp), which extends CoOp by further learning a lightweight neural network to generate for each image an input-conditional token (vector). Compared to CoOp's static prompts, our dynamic prompts adapt to each instance and are thus less sensitive to class shift. Extensive experiments show that CoCoOp generalizes much better than CoOp to unseen classes, even showing promising transferability beyond a single dataset; and yields stronger domain generalization performance as well. Code is available at //github.com/KaiyangZhou/CoOp.

Temporal relational modeling in video is essential for human action understanding, such as action recognition and action segmentation. Although Graph Convolution Networks (GCNs) have shown promising advantages in relation reasoning on many tasks, it is still a challenge to apply graph convolution networks on long video sequences effectively. The main reason is that large number of nodes (i.e., video frames) makes GCNs hard to capture and model temporal relations in videos. To tackle this problem, in this paper, we introduce an effective GCN module, Dilated Temporal Graph Reasoning Module (DTGRM), designed to model temporal relations and dependencies between video frames at various time spans. In particular, we capture and model temporal relations via constructing multi-level dilated temporal graphs where the nodes represent frames from different moments in video. Moreover, to enhance temporal reasoning ability of the proposed model, an auxiliary self-supervised task is proposed to encourage the dilated temporal graph reasoning module to find and correct wrong temporal relations in videos. Our DTGRM model outperforms state-of-the-art action segmentation models on three challenging datasets: 50Salads, Georgia Tech Egocentric Activities (GTEA), and the Breakfast dataset. The code is available at //github.com/redwang/DTGRM.

北京阿比特科技有限公司