From industrial to space robotics, safe landing is an essential component of flight operations. With the growing interest in artificial intelligence, we direct our attention to learning-based safe landing approaches. This paper extends our previous work, DOVESEI, which focused on a reactive UAV system that harnesses the capabilities of open vocabulary image segmentation. Prompt-based safe landing zone segmentation using an open vocabulary model is no longer just an idea; the work on DOVESEI has shown it to be feasible. However, a heuristic selection of words for the prompt is not a reliable solution, since it cannot take the changing environment into consideration, and detrimental consequences can occur if the observed environment is not well represented by the given prompt. Therefore, we introduce PEACE (Prompt Engineering Automation for CLIPSeg Enhancement), which enables DOVESEI to automate prompt generation and engineering in order to adapt to data distribution shifts. Our system is capable of performing safe landing operations with collision avoidance at altitudes as low as 20 meters using only monocular cameras and image segmentation. We take advantage of DOVESEI's dynamic focus to circumvent abrupt fluctuations in the terrain segmentation between frames in a video stream. PEACE shows promising improvements in prompt generation and engineering for aerial images compared to the standard prompt used for CLIP and CLIPSeg. Combining DOVESEI and PEACE, our system was able to improve successful safe landing zone selections by 58.62% compared to using only DOVESEI. All the source code is open source and available online.

Related content

Automator

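Automator is software developed by Apple for its Mac OS X system. Using simple operations such as clicking and dragging, you can combine a series of actions into a workflow that completes complex tasks automatically and repeatably. Automator also works across many kinds of applications, including Finder, the Safari web browser, iCal, Address Book, and others. It can also work with some third-party applications, such as Microsoft Office, Adobe Photoshop, and Pixelmator.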

Rank · Distillation · Ground truth · Integration · Model evaluation

January 29, 2024

Endo-4DGS: Distilling Depth Ranking for Endoscopic Monocular Scene Reconstruction with 4D Gaussian Splatting

Yiming Huang,Beilei Cui,Long Bai,Ziqi Guo,Mengya Xu,Hongliang Ren

In the realm of robot-assisted minimally invasive surgery, dynamic scene reconstruction can significantly enhance downstream tasks and improve surgical outcomes. Neural Radiance Fields (NeRF)-based methods have recently risen to prominence for their exceptional ability to reconstruct scenes. Nonetheless, these methods are hampered by slow inference, prolonged training, and substantial computational demands. Additionally, some rely on stereo depth estimation, which is often infeasible due to the high costs and logistical challenges associated with stereo cameras. Moreover, the monocular reconstruction quality for deformable scenes is currently inadequate. To overcome these obstacles, we present Endo-4DGS, an innovative, real-time endoscopic dynamic reconstruction approach that utilizes 4D Gaussian Splatting (GS) and requires no ground truth depth data. This method extends 3D GS by incorporating a temporal component and leverages a lightweight MLP to capture temporal Gaussian deformations. This effectively facilitates the reconstruction of dynamic surgical scenes with variable conditions. We also integrate Depth-Anything to generate pseudo-depth maps from monocular views, enhancing the depth-guided reconstruction process. Our approach has been validated on two surgical datasets, where it has proven to render in real-time, compute efficiently, and reconstruct with remarkable accuracy. These results underline the vast potential of Endo-4DGS to improve surgical assistance.

Networking · Estimation/Estimator · Point cloud · Decoding · Gated recurrent unit

January 29, 2024

DeFlow: Decoder of Scene Flow Network in Autonomous Driving

Qingwen Zhang,Yi Yang,Heng Fang,Ruoyu Geng,Patric Jensfelt

from arxiv, 7 pages, 4 figures, Code check //github.com/KTH-RPL/deflow, accepted by ICRA 2024

Scene flow estimation determines a scene's 3D motion field by predicting the motion of points in the scene, which is especially useful for aiding tasks in autonomous driving. Many networks that take large-scale point clouds as input use voxelization to create a pseudo-image so that they can run in real time. However, the voxelization process often results in the loss of point-specific features. This gives rise to a challenge in recovering those features for scene flow tasks. Our paper introduces DeFlow, which enables a transition from voxel-based features to point features using Gated Recurrent Unit (GRU) refinement. To further enhance scene flow estimation performance, we formulate a novel loss function that accounts for the data imbalance between static and dynamic points. Evaluations on the Argoverse 2 scene flow task reveal that DeFlow achieves state-of-the-art results on large-scale point cloud data, demonstrating that our network has better performance and efficiency compared to others. The code is open-sourced at //github.com/KTH-RPL/deflow.

Denoising · MoDELS · Noise · Processing (programming language) · Recommender systems

January 29, 2024

Plug-In Diffusion Model for Embedding Denoising in Recommendation System

Jujia Zhao,Wenjie Wang,Yiyan Xu,Teng Sun,Fuli Feng

In the realm of recommender systems, handling noisy implicit feedback is a prevalent challenge. While most research efforts focus on mitigating noise through data cleaning methods like resampling and reweighting, these approaches often rely on heuristic assumptions. Alternatively, model perspective denoising strategies actively incorporate noise into user-item interactions, aiming to bolster the model's inherent denoising capabilities. Nonetheless, this type of denoising method presents substantial challenges to the capacity of the recommender model to accurately identify and represent noise patterns. To overcome these hurdles, we introduce a plug-in diffusion model for embedding denoising in recommender systems, which employs a multi-step denoising approach based on diffusion models to foster robust representation learning of embeddings. Our model operates by introducing controlled Gaussian noise into user and item embeddings derived from various recommender systems during the forward phase. Subsequently, it iteratively eliminates this noise in the reverse denoising phase, thereby augmenting the embeddings' resilience to noisy feedback. The primary challenge in this process is determining the direction and an optimal starting point for the denoising process. To address this, we incorporate a specialized denoising module that utilizes collaborative data as a guide for the denoising process. Furthermore, during the inference phase, we employ the average of item embeddings previously favored by users as the starting point to facilitate ideal item generation. Our thorough evaluations across three datasets and in conjunction with three classic backend models confirm its superior performance.
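As a rough illustration of the forward phase and the inference starting point described in this abstract, the sketch below adds Gaussian noise to an embedding using the standard closed-form diffusion step, and averages a user's previously favored item embeddings. The function names, the `alpha_bar_t` value, and the toy embeddings are assumptions made for illustration; the abstract does not specify the noise schedule or the denoising module, so this is not the authors' implementation.

```python
import math
import random


def forward_noise(embedding, alpha_bar_t, rng):
    """Sample x_t ~ N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I).

    alpha_bar_t is the cumulative signal-retention factor of a standard
    diffusion forward process (assumed here, not taken from the paper).
    """
    scale = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [scale * x + sigma * rng.gauss(0.0, 1.0) for x in embedding]


def mean_embedding(embeddings):
    """Average equal-length embeddings, e.g. items a user favored before."""
    count = len(embeddings)
    return [sum(column) / count for column in zip(*embeddings)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical item embeddings a user interacted with.
favored_items = [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]

# Starting point for inference: the average of the favored item embeddings.
start = mean_embedding(favored_items)

# Forward phase: inject controlled Gaussian noise into the embedding.
noisy = forward_noise(start, alpha_bar_t=0.9, rng=rng)
```

A learned reverse process would then remove the noise step by step; that part depends on the trained denoising module and is omitted here.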

SOFT · Learning · Robots · Controller · INTERACT

January 29, 2024

DittoGym: Learning to Control Soft Shape-Shifting Robots

Suning Huang,Boyuan Chen,Huazhe Xu,Vincent Sitzmann

Robot co-design, where the morphology of a robot is optimized jointly with a learned policy to solve a specific task, is an emerging area of research. It holds particular promise for soft robots, which are amenable to novel manufacturing techniques that can realize learned morphologies and actuators. Inspired by nature and recent novel robot designs, we propose to go a step further and explore novel reconfigurable robots, defined as robots that can change their morphology within their lifetime. We formalize control of reconfigurable soft robots as a high-dimensional reinforcement learning (RL) problem. We unify morphology change, locomotion, and environment interaction in the same action space, and introduce an appropriate coarse-to-fine curriculum that enables us to discover policies that accomplish fine-grained control of the resulting robots. We also introduce DittoGym, a comprehensive RL benchmark for reconfigurable soft robots that require fine-grained morphology changes to accomplish the tasks. Finally, we evaluate our proposed coarse-to-fine algorithm on DittoGym and demonstrate robots that learn to change their morphology several times within a sequence, uniquely enabled by our RL algorithm. More results are available at //dittogym.github.io.

Performer · Similarity · Supervision · Networking · CASE

January 28, 2024

ARGOS: An Automaton Referencing Guided Overtake System for Head-to-Head Autonomous Racing

Varundev Sukhil,Madhur Behl

from arxiv, 15 pages

Autonomous overtaking at high speeds is a challenging multi-agent robotics research problem. The high-speed and close-proximity situations that arise in multi-agent autonomous racing require designing algorithms that balance aggressive overtaking maneuvers against the risk of collision with the opponent. In this paper, we study a special case of multi-agent autonomous racing, called the head-to-head autonomous race, that involves two racecars with similar performance envelopes. We present a mathematical formulation of an overtake and position defense in this head-to-head autonomous racing scenario, and we introduce the Automaton Referencing Guided Overtake System (ARGOS) framework that supervises the execution of an overtake or position defense maneuver depending on the current role of the racecar. The ARGOS framework works by decomposing complex overtake and position-defense maneuvers into sequential and temporal submaneuvers that are individually managed and supervised by a network of automatons. We verify the properties of the ARGOS framework using model-checking and demonstrate results from multiple simulations, which show that the framework meets the desired specifications. The ARGOS framework performs similarly to what can be observed in real-world human-driven motorsport racing.

Integration · Learning · Lecture notes · Layer · Optimizer

January 28, 2024

HappyRouting: Learning Emotion-Aware Route Trajectories for Scalable In-The-Wild Navigation

David Bethge,Daniel Bulanda,Adam Kozlowski,Thomas Kosch,Albrecht Schmidt,Tobias Grosse-Puppendahl

from arxiv, 17 pages

Routes represent an integral part of triggering emotions in drivers. Navigation systems allow users to choose a navigation strategy, such as the fastest or shortest route. However, they do not consider the driver's emotional well-being. We present HappyRouting, a novel navigation-based empathic car interface guiding drivers through real-world traffic while evoking positive emotions. We propose design considerations, derive a technical architecture, and implement a routing optimization framework. Our contribution is a machine learning-based generated emotion map layer, predicting emotions along routes based on static and dynamic contextual data. We evaluated HappyRouting in a real-world driving study (N=13), finding that happy routes increase subjectively perceived valence by 11% (p=.007). Although happy routes take 1.25 times longer on average, participants perceived the happy route as shorter, presenting an emotion-enhanced alternative to today's fastest routing mechanisms. We discuss how emotion-based routing can be integrated into navigation apps, promoting emotional well-being for mobility use.
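To make the idea of routing over an emotion map layer more concrete, here is a minimal sketch that runs Dijkstra's algorithm on a toy road graph whose edges carry both a travel time and a predicted valence. The cost formula `minutes * (1 + emotion_weight * (1 - valence))`, the `best_route` helper, and the graph itself are hypothetical choices for illustration; the abstract does not describe the actual optimization used by HappyRouting.

```python
import heapq


def best_route(graph, source, target, emotion_weight):
    """Dijkstra over edges given as (neighbor, minutes, valence in [0, 1]).

    Edge cost (an assumed blend): minutes * (1 + emotion_weight * (1 - valence)).
    With emotion_weight = 0 this reduces to fastest-route search.
    """
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return path, cost
        if node in settled:
            continue
        settled.add(node)
        for neighbor, minutes, valence in graph.get(node, []):
            if neighbor not in settled:
                edge_cost = minutes * (1.0 + emotion_weight * (1.0 - valence))
                heapq.heappush(queue, (cost + edge_cost, neighbor, path + [neighbor]))
    return None, float("inf")


# Toy graph: the route via B is faster, the route via C has higher valence.
graph = {
    "A": [("B", 4.0, 0.1), ("C", 5.0, 0.9)],
    "B": [("D", 4.0, 0.1)],
    "C": [("D", 5.0, 0.9)],
}

fastest_path, _ = best_route(graph, "A", "D", emotion_weight=0.0)
happy_path, _ = best_route(graph, "A", "D", emotion_weight=1.0)
```

In this toy graph, raising `emotion_weight` moves the chosen route from the faster low-valence road to the slower high-valence one, which is the kind of trade-off the study measures.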

Learning · Critic · Estimation/Estimator · Analysis · Model evaluation

January 28, 2024

Data-Driven Strategies for Coping with Incomplete DVL Measurements

Nadav Cohen,Itzik Klein

Autonomous underwater vehicles are specialized platforms engineered for deep underwater operations. Critical to their functionality is autonomous navigation, which typically relies on an inertial navigation system and a Doppler velocity log (DVL). In real-world scenarios, incomplete Doppler velocity log measurements occur, resulting in positioning errors and mission aborts. To cope with such situations, both model-based and learning-based approaches have been derived. This paper presents a comparative analysis of two cutting-edge deep learning methodologies, namely LiBeamsNet and MissBeamNet, alongside a model-based average estimator. These approaches are evaluated for their efficacy in regressing missing Doppler velocity log beams when two beams are unavailable. In our study, we used data recorded by a DVL mounted on an autonomous underwater vehicle operated in the Mediterranean Sea. We found that both deep learning architectures outperformed model-based approaches by over 16% in velocity prediction accuracy.

contrastive · Graph · Contrastive learning · INFORMS · Learned

September 24, 2021

GeomGCL: Geometric Graph Contrastive Learning for Molecular Property Prediction

Shuangli Li,Jingbo Zhou,Tong Xu,Dejing Dou,Hui Xiong

Recently, many efforts have been devoted to applying graph neural networks (GNNs) to molecular property prediction, which is a fundamental task for computational drug and material discovery. One of the major obstacles hindering the successful prediction of molecular properties by GNNs is the scarcity of labeled data. Though graph contrastive learning (GCL) methods have achieved extraordinary performance with insufficient labeled data, most have focused on designing data augmentation schemes for general graphs. However, the fundamental property of a molecule could be altered by an augmentation method (like random perturbation) applied to molecular graphs. Meanwhile, the critical geometric information of molecules remains rarely explored under the current GNN and GCL architectures. To this end, we propose a novel graph contrastive learning method utilizing the geometry of the molecule across 2D and 3D views, which is named GeomGCL. Specifically, we first devise a dual-view geometric message passing network (GeomMPNN) to adaptively leverage the rich information of both 2D and 3D graphs of a molecule. The incorporation of geometric properties at different levels can greatly facilitate molecular representation learning. Then a novel geometric graph contrastive scheme is designed to make both geometric views collaboratively supervise each other to improve the generalization ability of GeomMPNN. We evaluate GeomGCL on various downstream property prediction tasks via a fine-tuning process. Experimental results on seven real-life molecular datasets demonstrate the effectiveness of our proposed GeomGCL against state-of-the-art baselines.
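The idea of two views of the same molecule supervising each other can be illustrated with a generic InfoNCE-style contrastive loss. In the sketch below, the 2D-view embedding of a molecule is pulled toward its own 3D-view embedding and pushed away from the 3D embeddings of other molecules. This is a common contrastive objective swapped in for illustration, not necessarily the geometric contrastive scheme of GeomGCL; the function names, the temperature, and the toy embeddings are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_2d, view_3d, temperature=0.5):
    """Mean InfoNCE loss over a batch of paired view embeddings.

    For molecule i, view_3d[i] is the positive and every other row of
    view_3d is a negative.
    """
    losses = []
    for i, anchor in enumerate(view_2d):
        logits = [cosine(anchor, other) / temperature for other in view_3d]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy batch of two molecules with 2-dimensional embeddings (hypothetical).
view_2d = [[1.0, 0.0], [0.0, 1.0]]
aligned_3d = [[1.0, 0.0], [0.0, 1.0]]     # views agree per molecule
mismatched_3d = [[0.0, 1.0], [1.0, 0.0]]  # views swapped between molecules

loss_aligned = info_nce(view_2d, aligned_3d)
loss_mismatched = info_nce(view_2d, mismatched_3d)
```

The loss is lower when each molecule's two views agree, which is the signal that lets one view supervise the other without labels.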

Episode · AI · CASE · System architecture · Engineering

August 30, 2021

Multi-Agent Simulation for AI Behaviour Discovery in Operations Research

Michael Papasimeon,Lyndon Benke

from arxiv, 14 pages, 7 figures. To be published in proceedings of the 22nd International Workshop on Multi-Agent-Based Simulation (MABS 2021) at AAMAS 2021. //mabsworkshop.github.io/accepted/

We describe ACE0, a lightweight platform for evaluating the suitability and viability of AI methods for behaviour discovery in multi-agent simulations. Specifically, ACE0 was designed to explore AI methods for multi-agent simulations used in operations research studies related to new technologies such as autonomous aircraft. Simulation environments used in production are often high-fidelity and complex, require significant domain knowledge, and as a result have high R&D costs. Minimal and lightweight simulation environments can help researchers and engineers evaluate the viability of new AI technologies for behaviour discovery in a more agile and potentially cost-effective manner. In this paper we describe the motivation for the development of ACE0. We provide a technical overview of the system architecture, describe a case study of behaviour discovery in the aerospace domain, and provide a qualitative evaluation of the system. The evaluation includes a brief description of collaborative research projects with academic partners, exploring different AI behaviour discovery methods.

Graph · Knowledge graph · Language modeling · entity · BERT

September 11, 2019

KG-BERT: BERT for Knowledge Graph Completion

Liang Yao,Chengsheng Mao,Yuan Luo

Knowledge graphs are important resources for many artificial intelligence tasks but often suffer from incompleteness. In this work, we propose to use pre-trained language models for knowledge graph completion. We treat triples in knowledge graphs as textual sequences and propose a novel framework named Knowledge Graph Bidirectional Encoder Representations from Transformer (KG-BERT) to model these triples. Our method takes the entity and relation descriptions of a triple as input and computes the scoring function of the triple with the KG-BERT language model. Experimental results on multiple benchmark knowledge graphs show that our method can achieve state-of-the-art performance in triple classification, link prediction, and relation prediction tasks.
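Treating a triple as a textual sequence can be sketched as follows: the head, relation, and tail descriptions are concatenated into one BERT-style token sequence with `[CLS]` and `[SEP]` markers. The `triple_to_sequence` helper and the whitespace tokenization are simplifying assumptions (a real pipeline would use a subword tokenizer), and the example triple is hypothetical.

```python
def triple_to_sequence(head, relation, tail):
    """Pack a (head, relation, tail) triple into one BERT-style token list.

    Whitespace splitting stands in for a real subword tokenizer; this is
    an illustrative assumption, not the tokenizer used by KG-BERT.
    """
    return (
        ["[CLS]"]
        + head.split()
        + ["[SEP]"]
        + relation.split()
        + ["[SEP]"]
        + tail.split()
        + ["[SEP]"]
    )


# Hypothetical triple with short textual descriptions.
tokens = triple_to_sequence("Steve Jobs", "founded", "Apple Inc.")
```

A language model would then encode this sequence, and a classification layer on top of the `[CLS]` representation could score how plausible the triple is.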