亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Developing autonomous agents that can strategize and cooperate with humans under information asymmetry is challenging without effective communication in natural language. We introduce a shared-control game, where two players collectively control a token in alternating turns to achieve a common objective under incomplete information. We formulate a policy synthesis problem for an autonomous agent in this game with a human as the other player. To solve this problem, we propose a communication-based approach comprising a language module and a planning module. The language module translates natural language messages into and from a finite set of flags, a compact representation defined to capture player intents. The planning module leverages these flags to compute a policy using an asymmetric information-set Monte Carlo tree search with flag exchange algorithm we present. We evaluate the effectiveness of this approach in a testbed based on Gnomes at Night, a search-and-find maze board game. Results of human subject experiments show that communication narrows the information gap between players and enhances human-agent cooperation efficiency with fewer turns.

相關內容

《計算機信息》雜志發表高質量的論文,擴大了運籌學和計算的范圍,尋求有關理論、方法、實驗、系統和應用方面的原創研究論文、新穎的調查和教程論文,以及描述新的和有用的軟件工具的論文。官網鏈接: · Performer · Learning · Integration · 回合 ·
2024 年 7 月 2 日

In the realm of autonomous agents, ensuring safety and reliability in complex and dynamic environments remains a paramount challenge. Safe reinforcement learning addresses these concerns by introducing safety constraints, but still faces challenges in navigating intricate environments such as complex driving situations. To overcome these challenges, we present the safe constraint reward (Safe CoR) framework, a novel method that utilizes two types of expert demonstrations$\unicode{x2013}$reward expert demonstrations focusing on performance optimization and safe expert demonstrations prioritizing safety. By exploiting a constraint reward (CoR), our framework guides the agent to balance performance goals of reward sum with safety constraints. We test the proposed framework in diverse environments, including the safety gym, metadrive, and the real$\unicode{x2013}$world Jackal platform. Our proposed framework enhances the performance of algorithms by $39\%$ and reduces constraint violations by $88\%$ on the real-world Jackal platform, demonstrating the framework's efficacy. Through this innovative approach, we expect significant advancements in real-world performance, leading to transformative effects in the realm of safe and reliable autonomous agents.

To efficiently find an optimum parameter combination in a large-scale problem, it is a key to convert the parameters into available variables in actual machines. Specifically, quadratic unconstrained binary optimization problems are solved with the help of machine learning, e.g., factorization machines with annealing, which convert a raw parameter to binary variables. This work investigates the dependence of the convergence speed and the accuracy on binary labeling method, which can influence the cost function shape and thus the probability of being captured at a local minimum solution. By exemplifying traveling salesman problem, we propose and evaluate Gray labeling, which correlates the Hamming distance in binary labels with the traveling distance. Through numerical simulation of traveling salesman problem up to 15 cities at a limited number of iterations, the Gray labeling shows less local minima percentages and shorter traveling distances compared with natural labeling.

Generating safe behaviors for autonomous systems is important as they continue to be deployed in the real world, especially around people. In this work, we focus on developing a novel safe controller for systems where there are multiple sources of uncertainty. We formulate a novel multimodal safe control method, called the Multimodal Safe Set Algorithm (MMSSA) for the case where the agent has uncertainty over which discrete mode the system is in, and each mode itself contains additional uncertainty. To our knowledge, this is the first energy-function-based safe control method applied to systems with multimodal uncertainty. We apply our controller to a simulated human-robot interaction where the robot is uncertain of the human's true intention and each potential intention has its own additional uncertainty associated with it, since the human is not a perfectly rational actor. We compare our proposed safe controller to existing safe control methods and find that it does not impede the system performance (i.e. efficiency) while also improving the safety of the system.

As robots are deployed in human spaces, it is important that they are able to coordinate their actions with the people around them. Part of such coordination involves ensuring that people have a good understanding of how a robot will act in the environment. This can be achieved through explanations of the robot's policy. Much prior work in explainable AI and RL focuses on generating explanations for single-agent policies, but little has been explored in generating explanations for collaborative policies. In this work, we investigate how to generate multi-agent strategy explanations for human-robot collaboration. We formulate the problem using a generic multi-agent planner, show how to generate visual explanations through strategy-conditioned landmark states and generate textual explanations by giving the landmarks to an LLM. Through a user study, we find that when presented with explanations from our proposed framework, users are able to better explore the full space of strategies and collaborate more efficiently with new robot partners.

Learning representative, robust and discriminative information from images is essential for effective person re-identification (Re-Id). In this paper, we propose a compound approach for end-to-end discriminative deep feature learning for person Re-Id based on both body and hand images. We carefully design the Local-Aware Global Attention Network (LAGA-Net), a multi-branch deep network architecture consisting of one branch for spatial attention, one branch for channel attention, one branch for global feature representations and another branch for local feature representations. The attention branches focus on the relevant features of the image while suppressing the irrelevant backgrounds. In order to overcome the weakness of the attention mechanisms, equivariant to pixel shuffling, we integrate relative positional encodings into the spatial attention module to capture the spatial positions of pixels. The global branch intends to preserve the global context or structural information. For the the local branch, which intends to capture the fine-grained information, we perform uniform partitioning to generate stripes on the conv-layer horizontally. We retrieve the parts by conducting a soft partition without explicitly partitioning the images or requiring external cues such as pose estimation. A set of ablation study shows that each component contributes to the increased performance of the LAGA-Net. Extensive evaluations on four popular body-based person Re-Id benchmarks and two publicly available hand datasets demonstrate that our proposed method consistently outperforms existing state-of-the-art methods.

UWB ranging systems have been adopted in many critical and security sensitive applications due to its precise positioning and secure ranging capabilities. We present a practical jamming attack, namely UWBAD, against commercial UWB ranging systems, which exploits the vulnerability of the adoption of the normalized cross-correlation process in UWB ranging and can selectively and quickly block ranging sessions without prior knowledge of the configurations of the victim devices, potentially leading to severe consequences such as property loss, unauthorized access, or vehicle theft. UWBAD achieves more effective and less imperceptible jamming due to: (i) it efficiently blocks every ranging session by leveraging the field-level jamming, thereby exerting a tangible impact on commercial UWB ranging systems, and (ii) the compact, reactive, and selective system design based on COTS UWB chips, making it affordable and less imperceptible. We successfully conducted real attacks against commercial UWB ranging systems from the three largest UWB chip vendors on the market, e.g., Apple, NXP, and Qorvo. We reported our findings to Apple, related Original Equipment Manufacturers (OEM), and the Automotive Security Research Group, triggering internal security incident response procedures at Volkswagen, Audi, Bosch, and NXP. As of the writing of this paper, the related OEM has acknowledged this vulnerability in their automotive systems and has offered a $5,000 reward as a bounty.

The tactile sensation of stroking soft fur, known for its comfort and emotional benefits, has numerous applications in virtual reality, animal-assisted therapy, and household products. Previous studies have primarily utilized actual fur to present a voluminous fur experience that poses challenges concerning versatility and flexibility. In this study, we develop a system that integrates a head-mounted display with an ultrasound haptic display to provide visual and haptic feedback. Measurements taken using an artificial skin sheet reveal directional differences in tactile and visual responses to voluminous fur. Based on observations and measurements, we propose interactive models that dynamically adjust to hand movements, simulating fur-stroking sensations. Our experiments demonstrate that the proposed model using visual and haptic modalities significantly enhances the realism of a fur-stroking experience. Our findings suggest that the interactive visuo-haptic model offers a promising fur-stroking experience in virtual reality, potentially enhancing the user experience in therapeutic, entertainment, and retail applications.

Graphs are important data representations for describing objects and their relationships, which appear in a wide diversity of real-world scenarios. As one of a critical problem in this area, graph generation considers learning the distributions of given graphs and generating more novel graphs. Owing to their wide range of applications, generative models for graphs, which have a rich history, however, are traditionally hand-crafted and only capable of modeling a few statistical properties of graphs. Recent advances in deep generative models for graph generation is an important step towards improving the fidelity of generated graphs and paves the way for new kinds of applications. This article provides an extensive overview of the literature in the field of deep generative models for graph generation. Firstly, the formal definition of deep generative models for the graph generation and the preliminary knowledge are provided. Secondly, taxonomies of deep generative models for both unconditional and conditional graph generation are proposed respectively; the existing works of each are compared and analyzed. After that, an overview of the evaluation metrics in this specific domain is provided. Finally, the applications that deep graph generation enables are summarized and five promising future research directions are highlighted.

Inspired by the human cognitive system, attention is a mechanism that imitates the human cognitive awareness about specific information, amplifying critical details to focus more on the essential aspects of data. Deep learning has employed attention to boost performance for many applications. Interestingly, the same attention design can suit processing different data modalities and can easily be incorporated into large networks. Furthermore, multiple complementary attention mechanisms can be incorporated in one network. Hence, attention techniques have become extremely attractive. However, the literature lacks a comprehensive survey specific to attention techniques to guide researchers in employing attention in their deep models. Note that, besides being demanding in terms of training data and computational resources, transformers only cover a single category in self-attention out of the many categories available. We fill this gap and provide an in-depth survey of 50 attention techniques categorizing them by their most prominent features. We initiate our discussion by introducing the fundamental concepts behind the success of attention mechanism. Next, we furnish some essentials such as the strengths and limitations of each attention category, describe their fundamental building blocks, basic formulations with primary usage, and applications specifically for computer vision. We also discuss the challenges and open questions related to attention mechanism in general. Finally, we recommend possible future research directions for deep attention.

Hyperproperties are commonly used in computer security to define information-flow policies and other requirements that reason about the relationship between multiple computations. In this paper, we study a novel class of hyperproperties where the individual computation paths are chosen by the strategic choices of a coalition of agents in a multi-agent system. We introduce HyperATL*, an extension of computation tree logic with path variables and strategy quantifiers. Our logic can express strategic hyperproperties, such as that the scheduler in a concurrent system has a strategy to avoid information leakage. HyperATL* is particularly useful to specify asynchronous hyperproperties, i.e., hyperproperties where the speed of the execution on the different computation paths depends on the choices of the scheduler. Unlike other recent logics for the specification of asynchronous hyperproperties, our logic is the first to admit decidable model checking for the full logic. We present a model checking algorithm for HyperATL* based on alternating automata, and show that our algorithm is asymptotically optimal by providing a matching lower bound. We have implemented a prototype model checker for a fragment of HyperATL*, able to check various security properties on small programs.

北京阿比特科技有限公司