欧美狂野视频一区国产精品_精品国产免费第一区二区三区麻豆_精品人妻视频一区二区三区_国产AV无码专区亚洲AV制服_一区二区三区免费观看在线视频播放_黑人巨大VS苍井空_欧美一级牲交在线

Autonomous robots operating in complex environments face the critical challenge of identifying and utilizing environmental cover for covert navigation to minimize exposure to potential threats. We propose EnCoMP, an enhanced navigation framework that integrates offline reinforcement learning and our novel Adaptive Threat-Aware Visibility Estimation (ATAVE) algorithm to enable robots to navigate covertly and efficiently in diverse outdoor settings. ATAVE is a dynamic probabilistic threat modeling technique that we designed to continuously assess and mitigate potential threats in real-time, enhancing the robot's ability to navigate covertly by adapting to evolving environmental and threat conditions. Moreover, our approach generates high-fidelity multi-map representations, including cover maps, potential threat maps, height maps, and goal maps from LiDAR point clouds, providing a comprehensive understanding of the environment. These multi-maps offer detailed environmental insights, helping in strategic navigation decisions. The goal map encodes the relative distance and direction to the target location, guiding the robot's navigation. We train a Conservative Q-Learning (CQL) model on a large-scale dataset collected from real-world environments, learning a robust policy that maximizes cover utilization, minimizes threat exposure, and maintains efficient navigation. We demonstrate our method's capabilities on a physical Jackal robot, showing extensive experiments across diverse terrains. These experiments demonstrate EnCoMP's superior performance compared to state-of-the-art methods, achieving a 95% success rate, 85% cover utilization, and reducing threat exposure to 10.5%, while significantly outperforming baselines in navigation efficiency and robustness.

相關內容

回合

關注 3

機器人 · 離散化 · MoDELS · Continuity · 操作 ·

2024 年 7 月 9 日

Computational Power of Mobile Robots in Synchronous Environment: Discrete Version

Avisek Sharma,Pritam Goswami,Buddhadeb Sau

In distributed computing by mobile robots, robots are deployed over a region, continuous or discrete, operating through a sequence of \textit{look-compute-move} cycles. An extensive study has been carried out to understand the computational powers of different robot models. The models vary on the ability to 1)~remember constant size information and 2)~communicate constant size message. Depending on the abilities the different models are 1)~$\mathcal{OBLOT}$ (robots are oblivious and silent), 2)~$\mathcal{FSTA}$ (robots have finite states but silent), 3)~$\mathcal{FCOM}$ (robots are oblivious but can communicate constant size information) and, 4)~$\mathcal{LUMI}$ (robots have finite states and can communicate constant size information). Another factor that affects computational ability is the scheduler that decides the activation time of the robots. The main three schedulers are \textit{fully-synchronous}, \textit{semi-synchronous} and \textit{asynchronous}. Combining the models ($M$) with schedulers ($K$), we have twelve combinations $M^K$. In the euclidean domain, the comparisons between these twelve variants have been done in different works for transparent robots, opaque robots, and robots with limited visibility. There is a vacant space for similar works when robots are operating on discrete regions like networks. It demands separate research attention because there have been a series of works where robots operate on different networks, and there is a fundamental difference when robots are operating on a continuous domain versus a discrete domain in terms of robots' movement. This work contributes to filling the space by giving a full comparison table for all models with two synchronous schedulers: fully-synchronous and semi-synchronous.

控制器 · 機器人 · Performer · 泛函 · 約束 ·

2024 年 7 月 8 日

Occlusion-Free Image Based Visual Servoing using Probabilistic Control Barrier Certificates

Yanze Zhang,Yupeng Yang,Wenhao Luo

from arxiv, 7 pages, accepted to IFAC 2023 World Congress

Image-based visual servoing (IBVS) is a widely-used approach in robotics that employs visual information to guide robots towards desired positions. However, occlusions in this approach can lead to visual servoing failure and degrade the control performance due to the obstructed vision feature points that are essential for providing visual feedback. In this paper, we propose a Control Barrier Function (CBF) based controller that enables occlusion-free IBVS tasks by automatically adjusting the robot's configuration to keep the feature points in the field of view and away from obstacles. In particular, to account for measurement noise of the feature points, we develop the Probabilistic Control Barrier Certificates (PrCBC) using control barrier functions that encode the chance-constrained occlusion avoidance constraints under uncertainty into deterministic admissible control space for the robot, from which the resulting configuration of robot ensures that the feature points stay occlusion free from obstacles with a satisfying predefined probability. By integrating such constraints with a Model Predictive Control (MPC) framework, the sequence of optimized control inputs can be derived to achieve the primary IBVS task while enforcing the occlusion avoidance during robot movements. Simulation results are provided to validate the performance of our proposed method.

TR · 語言模型化 · MoDELS · 大語言模型 · 數據集 ·

2024 年 7 月 7 日

LTLBench: Towards Benchmarks for Evaluating Temporal Logic Reasoning in Large Language Models

Weizhi Tang,Vaishak Belle

Temporal reasoning (TR) is a critical component of artificial intelligence, encompassing understanding and processing temporal information and relationships between events. To discover and study the TR ability in Large Language Models (LLMs), various datasets have been constructed in different ways for evaluating various aspects of TR ability. Our work proposes a novel approach to design and develop a pipeline for constructing datasets to evaluate the TR ability of LLMs by leveraging random directed graph generation, LTL formula, and the NuSMV model checker. Based on the pipeline, we have also constructed a dataset as a benchmark, namely LTLBench, consisting of 2,000 TR challenges and evaluated six LLMs with it. Furthermore, we have conducted additional experiments to discover the impact of increasing the number of events and formula operators on the complexity of TR problems and the performance of LLMs. We have demonstrated that although LLMs exhibit some promise in handling TR challenges, they still struggle with complex TR. We expect this work can offer insights into TR ability in LLMs while also providing a valuable tool for future TR evaluations.

2024 年 7 月 7 日

Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis

Stefan Horoi,Albert Manuel Orozco Camacho,Eugene Belilovsky,Guy Wolf

from arxiv, Proceedings of the Forty-first International Conference on Machine Learning (ICML 2024)

Combining the predictions of multiple trained models through ensembling is generally a good way to improve accuracy by leveraging the different learned features of the models, however it comes with high computational and storage costs. Model fusion, the act of merging multiple models into one by combining their parameters reduces these costs but doesn't work as well in practice. Indeed, neural network loss landscapes are high-dimensional and non-convex and the minima found through learning are typically separated by high loss barriers. Numerous recent works have been focused on finding permutations matching one network features to the features of a second one, lowering the loss barrier on the linear path between them in parameter space. However, permutations are restrictive since they assume a one-to-one mapping between the different models' neurons exists. We propose a new model merging algorithm, CCA Merge, which is based on Canonical Correlation Analysis and aims to maximize the correlations between linear combinations of the model features. We show that our alignment method leads to better performances than past methods when averaging models trained on the same, or differing data splits. We also extend this analysis into the harder setting where more than 2 models are merged, and we find that CCA Merge works significantly better than past methods. Our code is publicly available at //github.com/shoroi/align-n-merge

可穿戴設備 · 機器人 · INTERACT · Pattern Recognition · 模型評估 ·

2024 年 7 月 5 日

MoveTouch: Robotic Motion Capturing System with Wearable Tactile Display to Achieve Safe HRI

Ali Alabbas,Miguel Altamirano Cabrera,Mohamed Sayed,Oussama Alyounes,Qian Liu,Dzmitry Tsetserukou

from arxiv, 14 pages, Eurohaptics 2024

The collaborative robot market is flourishing as there is a trend towards simplification, modularity, and increased flexibility on the production line. But when humans and robots are collaborating in a shared environment, the safety of humans should be a priority. We introduce a novel wearable robotic system to enhance safety during Human-Robot Interaction (HRI). The proposed wearable robot is designed to hold a fiducial marker and maintain its visibility to a motion capture system, which, in turn, localizes the user's hand with good accuracy and low latency and provides vibrotactile feedback to the user's wrist. The vibrotactile feedback guides the user's hand movement during collaborative tasks in order to increase safety and enhance collaboration efficiency. A user study was conducted to assess the recognition and discriminability of ten designed vibration patterns applied to the upper (dorsal) and the down (volar) parts of the user's wrist. The results show that the pattern recognition rate on the volar side was higher, with an average of 75.64% among all users. Four patterns with a high recognition rate were chosen to be incorporated into our system. A second experiment was carried out to evaluate users' response to the chosen patterns in real-world collaborative tasks. Results show that all participants responded to the patterns correctly, and the average response time for the patterns was between 0.24 and 2.41 seconds.

任務對話系統 · INTERACT · Performer · Processing（編程語言） · Chatbot ·

2024 年 7 月 4 日

Stephanie: Step-by-Step Dialogues for Mimicking Human Interactions in Social Conversations

Hao Yang,Hongyuan Lu,Xinhua Zeng,Yang Liu,Xiang Zhang,Haoran Yang,Yumeng Zhang,Yiran Wei,Wai Lam

In the rapidly evolving field of natural language processing, dialogue systems primarily employ a single-step dialogue paradigm. Although this paradigm is efficient, it lacks the depth and fluidity of human interactions and does not appear natural. We introduce a novel \textbf{Step}-by-Step Dialogue Paradigm (Stephanie), designed to mimic the ongoing dynamic nature of human conversations. By employing a dual learning strategy and a further-split post-editing method, we generated and utilized a high-quality step-by-step dialogue dataset to fine-tune existing large language models, enabling them to perform step-by-step dialogues. We thoroughly present Stephanie. Tailored automatic and human evaluations are conducted to assess its effectiveness compared to the traditional single-step dialogue paradigm. We will release code, Stephanie datasets, and Stephanie LLMs to facilitate the future of chatbot eras.

Performer · 中央處理器 (CPU) · 預測器/決策函數 · 數據集 · MoDELS ·

2024 年 7 月 3 日

NCPP: Nova CPU Performance Predictor on a Novel Dataset

Xiaoman Liu

from arxiv, 27 pages with 10 figures

CPU performance prediction, which involves forecasting the performance scores of a CPU based on its hardware characteristics during its operation, is a critical technology for computational system design and resource management in the big data era. However, this research field currently faces two significant challenges. First, collecting real-world data is challenging due to the wide variety of CPU products on the market and the highly specialized nature of relevant hardware characteristics. In the research process, this field lacks a standard dataset with unified hardware characteristics, wide data coverage, and comprehensive benchmarks. Second, existing methods based on hardware simulation models or machine learning exhibit notable shortcomings, such as lengthy simulation test cycles and low prediction accuracy. To bridge these gaps, we first collect, preprocess, and standardize historical data from the 4th Generation Intel Xeon Scalable Processors across multiple benchmark suites to create a new dataset, named PerfCastDB. Subsequently, we design a deep learning based model called Nova CPU Performance Predictor (NCPP) as the baseline for this new dataset. The NCPP network is designed based on group attention mechanism. It effectively quantifies the implicit relationships between hardware characteristics within and across groups and comprehensively models the impact of various hardware characteristics on CPU performance prediction. We conduct comparative experiments using the proposed PerfCastDB dataset. Compared to existing approaches, NCPP achieves superior evaluation results, demonstrating its effectiveness. Furthermore, we have open-sourced part of the dataset and the NCPP network code to facilitate subsequent research. The resources can be accessed at //github.com/xiaoman-liu/NCPP.

控制器 · MoDELS · 去噪 · AIM · HTTPS ·

2024 年 3 月 7 日

Controllable Generation with Text-to-Image Diffusion Models: A Survey

Pu Cao,Feng Zhou,Qing Song,Lu Yang

from arxiv, A collection of resources on controllable generation with text-to-image diffusion models: //github.com/PRIV-Creation/Awesome-Controllable-T2I-Diffusion-Models

In the rapidly advancing realm of visual generation, diffusion models have revolutionized the landscape, marking a significant shift in capabilities with their impressive text-guided generative functions. However, relying solely on text for conditioning these models does not fully cater to the varied and complex requirements of different applications and scenarios. Acknowledging this shortfall, a variety of studies aim to control pre-trained text-to-image (T2I) models to support novel conditions. In this survey, we undertake a thorough review of the literature on controllable generation with T2I diffusion models, covering both the theoretical foundations and practical advancements in this domain. Our review begins with a brief introduction to the basics of denoising diffusion probabilistic models (DDPMs) and widely used T2I diffusion models. We then reveal the controlling mechanisms of diffusion models, theoretically analyzing how novel conditions are introduced into the denoising process for conditional generation. Additionally, we offer a detailed overview of research in this area, organizing it into distinct categories from the condition perspective: generation with specific conditions, generation with multiple conditions, and universal controllable generation. For an exhaustive list of the controllable generation literature surveyed, please refer to our curated repository at \url{//github.com/PRIV-Creation/Awesome-Controllable-T2I-Diffusion-Models}.

可理解性 · MoDELS · HTTPS · 語言模型化 · TOOLS ·

2023 年 12 月 29 日

Video Understanding with Large Language Models: A Survey

Yunlong Tang,Jing Bi,Siting Xu,Luchuan Song,Susan Liang,Teng Wang,Daoan Zhang,Jie An,Jingyang Lin,Rongyi Zhu,Ali Vosoughi,Chao Huang,Zeliang Zhang,Feng Zheng,Jianguo Zhang,Ping Luo,Jiebo Luo,Chenliang Xu

With the burgeoning growth of online video platforms and the escalating volume of video content, the demand for proficient video understanding tools has intensified markedly. With Large Language Models (LLMs) showcasing remarkable capabilities in key language tasks, this survey provides a detailed overview of the recent advancements in video understanding harnessing the power of LLMs (Vid-LLMs). The emergent capabilities of Vid-LLMs are surprisingly advanced, particularly their ability for open-ended spatial-temporal reasoning combined with commonsense knowledge, suggesting a promising path for future video understanding. We examine the unique characteristics and capabilities of Vid-LLMs, categorizing the approaches into four main types: LLM-based Video Agents, Vid-LLMs Pretraining, Vid-LLMs Instruction Tuning, and Hybrid Methods. Furthermore, this survey also presents a comprehensive study of the tasks and datasets for Vid-LLMs, along with the methodologies employed for evaluation. Additionally, the survey explores the expansive applications of Vid-LLMs across various domains, thereby showcasing their remarkable scalability and versatility in addressing challenges in real-world video understanding. Finally, the survey summarizes the limitations of existing Vid-LLMs and the directions for future research. For more information, we recommend readers visit the repository at //github.com/yunlong10/Awesome-LLMs-for-Video-Understanding.

Cognition · Performer · Agent · 知識 (knowledge) · MoDELS ·

2023 年 7 月 14 日

Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration

Zhenhailong Wang,Shaoguang Mao,Wenshan Wu,Tao Ge,Furu Wei,Heng Ji

from arxiv, work in progress

Human intelligence thrives on the concept of cognitive synergy, where collaboration and information integration among different cognitive processes yield superior outcomes compared to individual cognitive processes in isolation. Although Large Language Models (LLMs) have demonstrated promising performance as general task-solving agents, they still struggle with tasks that require intensive domain knowledge and complex reasoning. In this work, we propose Solo Performance Prompting (SPP), which transforms a single LLM into a cognitive synergist by engaging in multi-turn self-collaboration with multiple personas. A cognitive synergist refers to an intelligent agent that collaborates with multiple minds, combining their individual strengths and knowledge, to enhance problem-solving and overall performance in complex tasks. By dynamically identifying and simulating different personas based on task inputs, SPP unleashes the potential of cognitive synergy in LLMs. We have discovered that assigning multiple, fine-grained personas in LLMs elicits better problem-solving abilities compared to using a single or fixed number of personas. We evaluate SPP on three challenging tasks: Trivia Creative Writing, Codenames Collaborative, and Logic Grid Puzzle, encompassing both knowledge-intensive and reasoning-intensive types. Unlike previous works, such as Chain-of-Thought, that solely enhance the reasoning abilities in LLMs, SPP effectively elicits internal knowledge acquisition abilities, reduces hallucination, and maintains strong reasoning capabilities. Code, data, and prompts can be found at: //github.com/MikeWangWZHL/Solo-Performance-Prompting.git.