This paper addresses motion forecasting in multi-agent environments, pivotal for ensuring safety of autonomous vehicles. Traditional as well as recent data-driven marginal trajectory prediction methods struggle to properly learn non-linear agent-to-agent interactions. We present SSL-Interactions that proposes pretext tasks to enhance interaction modeling for trajectory prediction. We introduce four interaction-aware pretext tasks to encapsulate various aspects of agent interactions: range gap prediction, closest distance prediction, direction of movement prediction, and type of interaction prediction. We further propose an approach to curate interaction-heavy scenarios from datasets. This curated data has two advantages: it provides a stronger learning signal to the interaction model, and facilitates generation of pseudo-labels for interaction-centric pretext tasks. We also propose three new metrics specifically designed to evaluate predictions in interactive scenes. Our empirical evaluations indicate SSL-Interactions outperforms state-of-the-art motion forecasting methods quantitatively with up to 8% improvement, and qualitatively, for interaction-heavy scenarios.
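One of the pretext tasks named above, closest distance prediction, needs a pseudo-label for each pair of agents. The sketch below shows one plausible way to derive such a label from two time-aligned 2D trajectories. The function name, the coordinate values, and the choice of Euclidean distance per timestep are assumptions made for illustration, not the authors' definition.

```python
import math


def closest_distance(traj_a, traj_b):
    """Return (min_distance, timestep) for two time-aligned 2D trajectories.

    Hypothetical helper: the pseudo-label is the smallest per-timestep
    Euclidean distance between the two agents.
    """
    if len(traj_a) != len(traj_b) or not traj_a:
        raise ValueError("trajectories must be non-empty and time-aligned")
    dists = [math.dist(p, q) for p, q in zip(traj_a, traj_b)]
    # Index of the first timestep at which the minimum distance occurs.
    step = min(range(len(dists)), key=dists.__getitem__)
    return dists[step], step


# Two agents crossing near an intersection (made-up coordinates, in metres).
agent_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
agent_b = [(2.0, 3.0), (2.0, 2.0), (2.0, 1.0), (2.0, -1.0)]
dist, step = closest_distance(agent_a, agent_b)
```

Here the agents come closest at the third timestep (index 2), one metre apart. A label of this kind could serve as a regression target for an auxiliary head.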

Related content

February 27, 2024

VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction

Jiaqi Lin,Zhihao Li,Xiao Tang,Jianzhuang Liu,Shiyong Liu,Jiayue Liu,Yangdi Lu,Xiaofei Wu,Songcen Xu,Youliang Yan,Wenming Yang

from arxiv, Accepted to CVPR 2024. Project website: //vastgaussian.github.io

Existing NeRF-based methods for large scene reconstruction often have limitations in visual quality and rendering speed. While the recent 3D Gaussian Splatting works well on small-scale and object-centric scenes, scaling it up to large scenes poses challenges due to limited video memory, long optimization time, and noticeable appearance variations. To address these challenges, we present VastGaussian, the first method for high-quality reconstruction and real-time rendering on large scenes based on 3D Gaussian Splatting. We propose a progressive partitioning strategy to divide a large scene into multiple cells, where the training cameras and point cloud are properly distributed with an airspace-aware visibility criterion. These cells are merged into a complete scene after parallel optimization. We also introduce decoupled appearance modeling into the optimization process to reduce appearance variations in the rendered images. Our approach outperforms existing NeRF-based methods and achieves state-of-the-art results on multiple large scene datasets, enabling fast optimization and high-fidelity real-time rendering.

February 27, 2024

ICP-Flow: LiDAR Scene Flow Estimation with ICP

Yancong Lin,Holger Caesar

from arxiv, The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024

Scene flow characterizes the 3D motion between two LiDAR scans captured by an autonomous vehicle at nearby timesteps. Prevalent methods consider scene flow as point-wise unconstrained flow vectors that can be learned by either large-scale training beforehand or time-consuming optimization at inference. However, these methods do not take into account that objects in autonomous driving often move rigidly. We incorporate this rigid-motion assumption into our design, where the goal is to associate objects over scans and then estimate the locally rigid transformations. We propose ICP-Flow, a learning-free flow estimator. The core of our design is the conventional Iterative Closest Point (ICP) algorithm, which aligns the objects over time and outputs the corresponding rigid transformations. Crucially, to aid ICP, we propose a histogram-based initialization that discovers the most likely translation, thus providing a good starting point for ICP. The complete scene flow is then recovered from the rigid transformations. We outperform state-of-the-art baselines, including supervised models, on the Waymo dataset and perform competitively on Argoverse-v2 and nuScenes. Further, we train a feedforward neural network, supervised by the pseudo labels from our model, and achieve top performance among all models capable of real-time inference. We validate the advantage of our model on scene flow estimation with longer temporal gaps, up to 0.5 seconds where other models fail to deliver meaningful results.
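The histogram-based initialization mentioned above can be pictured as a voting scheme: every source-to-target point offset casts a vote, and the most populated bin gives a starting translation for ICP. The following is a minimal 2D sketch under that reading. The function name, the bin size, and the sample points are hypothetical, and the paper's actual procedure (which operates on 3D LiDAR points) may differ.

```python
from collections import Counter


def histogram_translation(src, dst, bin_size=0.5):
    """Return the centre of the most-voted bin of src->dst point offsets.

    Illustrative assumption: all pairwise offsets are binned on a regular
    grid, and the fullest bin is taken as the most likely translation.
    """
    votes = Counter()
    for sx, sy in src:
        for dx, dy in dst:
            key = (round((dx - sx) / bin_size), round((dy - sy) / bin_size))
            votes[key] += 1
    (bx, by), _ = votes.most_common(1)[0]
    return bx * bin_size, by * bin_size


# A rigid object observed in two scans, shifted by a known translation.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
shift = (2.0, -1.0)
dst = [(x + shift[0], y + shift[1]) for x, y in src]
estimate = histogram_translation(src, dst)
```

Each correct correspondence votes for the true shift, while mismatched pairs scatter across other bins, so the peak recovers `(2.0, -1.0)` in this toy case. ICP would then refine the alignment from that starting point.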

February 26, 2024

SwarmPRM: Probabilistic Roadmap Motion Planning for Swarm Robotic Systems

Yunze Hu,Xuru Yang,Kangjie Zhou,Qinghang Liu,Kang Ding,Han Gao,Pingping Zhu,Chang Liu

Swarm robotic systems consisting of large-scale cooperative agents hold promise for performing autonomous tasks in diverse fields. However, existing planning strategies for swarm robotic systems often encounter a trade-off between scalability and solution quality. We introduce here SwarmPRM, a hierarchical, highly scalable, computationally efficient, and risk-aware sampling-based motion planning approach for swarm robotic systems, which is asymptotically optimal under mild assumptions. We employ probability density functions (PDFs) to represent the swarm's macroscopic state and utilize optimal mass transport (OMT) theory to measure the swarm's cost to go. A risk-aware Gaussian roadmap is constructed wherein each node encapsulates a distinct PDF and conditional-value-at-risk (CVaR) is employed to assess the collision risk, facilitating the generation of macroscopic PDFs in Wasserstein-GMM space. Extensive simulations demonstrate that the proposed approach outperforms state-of-the-art methods in terms of computational efficiency and the average travelling distance.
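To give a feel for how optimal mass transport can score the cost between two Gaussian swarm states, the sketch below evaluates the standard closed-form 2-Wasserstein distance for Gaussians with diagonal covariances. Restricting to diagonal covariances and the function name are assumptions made here for brevity. Whether SwarmPRM uses this exact expression as its cost-to-go is not stated in the abstract.

```python
import math


def w2_diag_gaussians(mean_a, std_a, mean_b, std_b):
    """2-Wasserstein distance between two diagonal-covariance Gaussians.

    For diagonal covariances the closed form reduces to
    sqrt(||mean_a - mean_b||^2 + ||std_a - std_b||^2).
    """
    mean_term = sum((ma - mb) ** 2 for ma, mb in zip(mean_a, mean_b))
    std_term = sum((sa - sb) ** 2 for sa, sb in zip(std_a, std_b))
    return math.sqrt(mean_term + std_term)


# Same spread, means 5 units apart: the distance is the mean displacement.
d = w2_diag_gaussians((0.0, 0.0), (1.0, 1.0), (3.0, 4.0), (1.0, 1.0))
```

With equal standard deviations the distance equals the displacement of the means, which is 5.0 in this example. Differences in spread add to the cost.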

February 26, 2024

LaneSegNet: Map Learning with Lane Segment Perception for Autonomous Driving

Tianyu Li,Peijin Jia,Bangjun Wang,Li Chen,Kun Jiang,Junchi Yan,Hongyang Li

from arxiv, Accepted in ICLR 2024

A map, as crucial information for downstream applications of an autonomous driving system, is usually represented in lanelines or centerlines. However, existing literature on map learning primarily focuses on either detecting geometry-based lanelines or perceiving topology relationships of centerlines. Both of these methods ignore the intrinsic relationship between lanelines and centerlines, namely that lanelines bind centerlines. While simply predicting both types of lane in one model is mutually exclusive in its learning objectives, we advocate the lane segment as a new representation that seamlessly incorporates both geometry and topology information. Thus, we introduce LaneSegNet, the first end-to-end mapping network generating lane segments to obtain a complete representation of the road structure. Our algorithm features two key modifications. One is a lane attention module to capture pivotal region details within the long-range feature space. The other is an identical initialization strategy for reference points, which enhances the learning of positional priors for lane attention. On the OpenLane-V2 dataset, LaneSegNet outperforms previous counterparts by a substantial gain across three tasks, i.e., map element detection (+4.8 mAP), centerline perception (+6.9 DET$_l$), and the newly defined one, lane segment perception (+5.6 mAP). Furthermore, it obtains a real-time inference speed of 14.7 FPS. Code is accessible at //github.com/OpenDriveLab/LaneSegNet.

February 26, 2024

Star-Searcher: A Complete and Efficient Aerial System for Autonomous Target Search in Complex Unknown Environments

Yiming Luo,Zixuan Zhuang,Neng Pan,Chen Feng,Shaojie Shen,Fei Gao,Hui Cheng,Boyu Zhou

from arxiv, Submitted to IEEE RA-L. Code: //github.com/SYSU-STAR/STAR-Searcher. Video: //www.youtube.com/watch?v=08ll_oo_DtU

This paper tackles the challenge of autonomous target search using unmanned aerial vehicles (UAVs) in complex unknown environments. To fill the gap in systematic approaches for this task, we introduce Star-Searcher, an aerial system featuring specialized sensor suites, mapping, and planning modules to optimize searching. Path planning challenges due to increased inspection requirements are addressed through a hierarchical planner with a visibility-based viewpoint clustering method. This simplifies planning by breaking it into global and local sub-problems, ensuring efficient global and local path coverage in real-time. Furthermore, our global path planning employs a history-aware mechanism to reduce motion inconsistency from frequent map changes, significantly enhancing search efficiency. We conduct comparisons with state-of-the-art methods in both simulation and the real world, demonstrating shorter flight paths, reduced time, and higher target search completeness. Our approach will be open-sourced for community benefit at //github.com/SYSU-STAR/STAR-Searcher.

February 24, 2024

Advancing BDD Software Testing: Dynamic Scenario Re-Usability And Step Auto-Complete For Cucumber Framework

A. H. Mughal

from arxiv, 15 pages, 1 figure, multiple code segments

This paper presents and implements the re-usability of scenarios within scenarios for behavior-driven development (BDD) Gherkin test scripts in the Cucumber Java framework. Though the focus of the presented work is on scenario re-usability through an implementation within the Cucumber BDD Java framework, the paper also touches on the limitations of Cucumber's single-threaded scenario execution model. This implementation increases the modularity and efficiency of the test suite. The paper also discusses VSCode step definition auto-completion integration, which simplifies the test script writing process. This functionality is handy for Quality Assurance (QA) test writers, allowing instant access to relevant step definitions. In addition, the use of these methods in Jenkins, a popular continuous integration and delivery platform, as a Maven Java project is discussed. This integration with Jenkins facilitates more efficient test automation for continuous deployment scenarios. Empirical research and practical applications reveal significant improvements in the speed and efficiency of test writing, which is especially valuable for large and complex software projects. Integrating these methods into traditional sequential BDD practices paves the way towards more effective, efficient, and sustainable test automation strategies.

February 24, 2024

RaTrack: Moving Object Detection and Tracking with 4D Radar Point Cloud

Zhijun Pan,Fangqiang Ding,Hantao Zhong,Chris Xiaoxuan Lu

from arxiv, Accepted to ICRA 2024. 8 pages, 4 figures. Co-first authorship for Zhijun Pan, Fangqiang Ding and Hantao Zhong, listed randomly

Mobile autonomy relies on the precise perception of dynamic environments. Robustly tracking moving objects in 3D world thus plays a pivotal role for applications like trajectory prediction, obstacle avoidance, and path planning. While most current methods utilize LiDARs or cameras for Multiple Object Tracking (MOT), the capabilities of 4D imaging radars remain largely unexplored. Recognizing the challenges posed by radar noise and point sparsity in 4D radar data, we introduce RaTrack, an innovative solution tailored for radar-based tracking. Bypassing the typical reliance on specific object types and 3D bounding boxes, our method focuses on motion segmentation and clustering, enriched by a motion estimation module. Evaluated on the View-of-Delft dataset, RaTrack showcases superior tracking precision of moving objects, largely surpassing the performance of the state of the art. We release our code and model at //github.com/LJacksonPan/RaTrack.

February 23, 2024

EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection

Zhe Wang,Siqi Fan,Xiaoliang Huo,Tongda Xu,Yan Wang,Jingjing Liu,Yilun Chen,Ya-Qin Zhang

from arxiv, 7 pages, 8 figures. Accepted by ICRA 2024. arXiv admin note: text overlap with arXiv:2303.10975

In autonomous driving, cooperative perception makes use of multi-view cameras from both vehicles and infrastructure, providing a global vantage point with rich semantic context of road conditions beyond a single vehicle viewpoint. Currently, two major challenges persist in vehicle-infrastructure cooperative 3D (VIC3D) object detection: 1) inherent pose errors when fusing multi-view images, caused by time asynchrony across cameras; 2) information loss in the transmission process resulting from limited communication bandwidth. To address these issues, we propose a novel camera-based 3D detection framework for the VIC3D task, Enhanced Multi-scale Image Feature Fusion (EMIFF). To fully exploit holistic perspectives from both vehicles and infrastructure, we propose Multi-scale Cross Attention (MCA) and Camera-aware Channel Masking (CCM) modules to enhance infrastructure and vehicle features at scale, spatial, and channel levels to correct the pose error introduced by camera asynchrony. We also introduce a Feature Compression (FC) module with channel and spatial compression blocks for transmission efficiency. Experiments show that EMIFF achieves SOTA on DAIR-V2X-C datasets, significantly outperforming previous early-fusion and late-fusion methods with comparable transmission costs.

December 17, 2020

Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting

Haoyi Zhou,Shanghang Zhang,Jieqi Peng,Shuai Zhang,Jianxin Li,Hui Xiong,Wancai Zhang

from arxiv, 7 pages (main), 5 pages (appendix), to appear in AAAI 2021

Many real-world applications require the prediction of long sequence time-series, such as electricity consumption planning. Long sequence time-series forecasting (LSTF) demands a high prediction capacity of the model, which is the ability to capture precise long-range dependency coupling between output and input efficiently. Recent studies have shown the potential of Transformer to increase the prediction capacity. However, there are several severe issues with Transformer that prevent it from being directly applicable to LSTF, such as quadratic time complexity, high memory usage, and the inherent limitation of the encoder-decoder architecture. To address these issues, we design an efficient transformer-based model for LSTF, named Informer, with three distinctive characteristics: (i) a $ProbSparse$ self-attention mechanism, which achieves $O(L \log L)$ in time complexity and memory usage, and has comparable performance on sequences' dependency alignment; (ii) self-attention distilling, which highlights dominating attention by halving cascading layer input and efficiently handles extremely long input sequences; (iii) a generative style decoder, which, while conceptually simple, predicts long time-series sequences in one forward operation rather than step by step, drastically improving the inference speed of long-sequence predictions. Extensive experiments on four large-scale datasets demonstrate that Informer significantly outperforms existing methods and provides a new solution to the LSTF problem.
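The idea behind $ProbSparse$ self-attention can be sketched as a query-selection step: each query is scored by how far its largest scaled dot product with the keys exceeds the average, and only the top-scoring queries receive full attention. The code below illustrates that scoring with the max-minus-mean measurement described in the Informer paper. It is a simplified sketch: the function name and toy vectors are hypothetical, and the key sampling that the full method relies on for its $O(L \log L)$ cost is omitted.

```python
import math


def top_u_queries(queries, keys, u):
    """Return sorted indices of the u queries with the highest sparsity score.

    Score per query: max_j(q.k_j / sqrt(d)) - mean_j(q.k_j / sqrt(d)).
    Simplification (assumption): every key is used, with no key sampling.
    """
    d = len(keys[0])
    scores = []
    for q in queries:
        dots = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        scores.append(max(dots) - sum(dots) / len(dots))
    ranked = sorted(range(len(queries)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:u])


# Toy data: queries 1 and 3 align strongly with a single key each.
keys = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
queries = [(0.0, 0.0), (4.0, 0.0), (0.1, 0.1), (0.0, 3.0)]
active = top_u_queries(queries, keys, u=2)
```

Queries with a near-uniform attention distribution (indices 0 and 2 here) score close to zero and are skipped, so `active` holds indices 1 and 3. In the full model, the skipped queries fall back to a cheap summary of the values rather than a full attention pass.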

May 24, 2018

DSGAN: Generative Adversarial Training for Distant Supervision Relation Extraction

Pengda Qin,Weiran Xu,William Yang Wang

Distant supervision can effectively label data for relation extraction, but suffers from the noisy labeling problem. Recent works mainly apply soft bag-level noise reduction strategies to find the relatively better samples in a sentence bag, which is suboptimal compared with making a hard decision on false positive samples at the sentence level. In this paper, we introduce an adversarial learning framework, which we name DSGAN, to learn a sentence-level true-positive generator. Inspired by Generative Adversarial Networks, we regard the positive samples generated by the generator as the negative samples to train the discriminator. The optimal generator is obtained when the discrimination ability of the discriminator shows the greatest decline. We adopt the generator to filter the distant supervision training dataset and redistribute the false positive instances into the negative set, thereby providing a cleaned dataset for relation classification. The experimental results show that the proposed strategy significantly improves the performance of distant supervision relation extraction compared to state-of-the-art systems.