云南虫谷在线观看免费观看电视剧,国产乱伦对白刺激视频,91A国产黄片精品

Predicting ego vehicle trajectories remains a critical challenge, especially in urban and dense areas due to the unpredictable behaviours of other vehicles and pedestrians. Multimodal trajectory prediction enhances decision-making by considering multiple possible future trajectories based on diverse sources of environmental data. In this approach, we leverage ResNet-50 to extract image features from high-definition map data and use IMU sensor data to calculate speed, acceleration, and yaw rate. A temporal probabilistic network is employed to compute potential trajectories, selecting the most accurate and highly probable trajectory paths. This method integrates HD map data to improve the robustness and reliability of trajectory predictions for autonomous vehicles.

相關內容

多峰值

關注 2

機器人 · 多樣性 · Learning · Notability · 基 ·

2024 年 8 月 21 日

ACE: A Cross-Platform Visual-Exoskeletons System for Low-Cost Dexterous Teleoperation

Shiqi Yang,Minghuan Liu,Yuzhe Qin,Runyu Ding,Jialong Li,Xuxin Cheng,Ruihan Yang,Sha Yi,Xiaolong Wang

from arxiv, Webpage: //ace-teleop.github.io/

Learning from demonstrations has shown to be an effective approach to robotic manipulation, especially with the recently collected large-scale robot data with teleoperation systems. Building an efficient teleoperation system across diverse robot platforms has become more crucial than ever. However, there is a notable lack of cost-effective and user-friendly teleoperation systems for different end-effectors, e.g., anthropomorphic robot hands and grippers, that can operate across multiple platforms. To address this issue, we develop ACE, a cross-platform visual-exoskeleton system for low-cost dexterous teleoperation. Our system utilizes a hand-facing camera to capture 3D hand poses and an exoskeleton mounted on a portable base, enabling accurate real-time capture of both finger and wrist poses. Compared to previous systems, which often require hardware customization according to different robots, our single system can generalize to humanoid hands, arm-hands, arm-gripper, and quadruped-gripper systems with high-precision teleoperation. This enables imitation learning for complex manipulation tasks on diverse platforms.

數據集 · MoDELS · 估計/估計量 · Automator · 可約的 ·

2024 年 8 月 20 日

V-RoAst: A New Dataset for Visual Road Assessment

Natchapon Jongwiriyanurak,Zichao Zeng,June Moh Goo,Xinglei Wang,Ilya Ilyankou,Kerkritt Srirrongvikrai,Meihui Wang,James Haworth

Road traffic crashes cause millions of deaths annually and have a significant economic impact, particularly in low- and middle-income countries (LMICs). This paper presents an approach using Vision Language Models (VLMs) for road safety assessment, overcoming the limitations of traditional Convolutional Neural Networks (CNNs). We introduce a new task ,V-RoAst (Visual question answering for Road Assessment), with a real-world dataset. Our approach optimizes prompt engineering and evaluates advanced VLMs, including Gemini-1.5-flash and GPT-4o-mini. The models effectively examine attributes for road assessment. Using crowdsourced imagery from Mapillary, our scalable solution influentially estimates road safety levels. In addition, this approach is designed for local stakeholders who lack resources, as it does not require training data. It offers a cost-effective and automated methods for global road safety assessments, potentially saving lives and reducing economic burdens.

MoDELS · 圖像修復 · Processing（編程語言） · 數據集 · SOTA ·

2024 年 8 月 17 日

LOID: Lane Occlusion Inpainting and Detection for Enhanced Autonomous Driving Systems

Aayush Agrawal,Ashmitha Jaysi Sivakumar,Ibrahim Kaif,Chayan Banerjee

from arxiv, 8 pages, 6 figures and 4 tables

Accurate lane detection is essential for effective path planning and lane following in autonomous driving, especially in scenarios with significant occlusion from vehicles and pedestrians. Existing models often struggle under such conditions, leading to unreliable navigation and safety risks. We propose two innovative approaches to enhance lane detection in these challenging environments, each showing notable improvements over current methods. The first approach aug-Segment improves conventional lane detection models by augmenting the training dataset of CULanes with simulated occlusions and training a segmentation model. This method achieves a 12% improvement over a number of SOTA models on the CULanes dataset, demonstrating that enriched training data can better handle occlusions, however, since this model lacked robustness to certain settings, our main contribution is the second approach, LOID Lane Occlusion Inpainting and Detection. LOID introduces an advanced lane detection network that uses an image processing pipeline to identify and mask occlusions. It then employs inpainting models to reconstruct the road environment in the occluded areas. The enhanced image is processed by a lane detection algorithm, resulting in a 20% & 24% improvement over several SOTA models on the BDDK100 and CULanes datasets respectively, highlighting the effectiveness of this novel technique.

輸入空間 · 穩健性 · Networking · Neural Networks · 可辨認的 ·

2024 年 8 月 16 日

LEVIS: Large Exact Verifiable Input Spaces for Neural Networks

Mohamad Fares El Hajj Chehade,Brian Wesley Bell,Russell Bent,Hao Zhu,Wenting Li

The robustness of neural networks is paramount in safety-critical applications. While most current robustness verification methods assess the worst-case output under the assumption that the input space is known, identifying a verifiable input space $\mathcal{C}$, where no adversarial examples exist, is crucial for effective model selection, robustness evaluation, and the development of reliable control strategies. To address this challenge, we introduce a novel framework, $\texttt{LEVIS}$, comprising $\texttt{LEVIS}$-$\alpha$ and $\texttt{LEVIS}$-$\beta$. $\texttt{LEVIS}$-$\alpha$ locates the largest possible verifiable ball within the central region of $\mathcal{C}$ that intersects at least two boundaries. In contrast, $\texttt{LEVIS}$-$\beta$ integrates multiple verifiable balls to encapsulate the entirety of the verifiable space comprehensively. Our contributions are threefold: (1) We propose $\texttt{LEVIS}$ equipped with three pioneering techniques that identify the maximum verifiable ball and the nearest adversarial point along collinear or orthogonal directions. (2) We offer a theoretical analysis elucidating the properties of the verifiable balls acquired through $\texttt{LEVIS}$-$\alpha$ and $\texttt{LEVIS}$-$\beta$. (3) We validate our methodology across diverse applications, including electrical power flow regression and image classification, showcasing performance enhancements and visualizations of the searching characteristics.

Learning · 剪枝 · Branch · Performer · 穩健性 ·

2024 年 8 月 15 日

CorrAdaptor: Adaptive Local Context Learning for Correspondence Pruning

Wei Zhu,Yicheng Liu,Yuping He,Tangfei Liao,Kang Zheng,Xiaoqiu Xu,Tao Wang,Tong Lu

from arxiv, 8 pages, 4 figures, accepted by ECAI

In the fields of computer vision and robotics, accurate pixel-level correspondences are essential for enabling advanced tasks such as structure-from-motion and simultaneous localization and mapping. Recent correspondence pruning methods usually focus on learning local consistency through k-nearest neighbors, which makes it difficult to capture robust context for each correspondence. We propose CorrAdaptor, a novel architecture that introduces a dual-branch structure capable of adaptively adjusting local contexts through both explicit and implicit local graph learning. Specifically, the explicit branch uses KNN-based graphs tailored for initial neighborhood identification, while the implicit branch leverages a learnable matrix to softly assign neighbors and adaptively expand the local context scope, significantly enhancing the model's robustness and adaptability to complex image variations. Moreover, we design a motion injection module to integrate motion consistency into the network to suppress the impact of outliers and refine local context learning, resulting in substantial performance improvements. The experimental results on extensive correspondence-based tasks indicate that our CorrAdaptor achieves state-of-the-art performance both qualitatively and quantitatively. The code and pre-trained models are available at //github.com/TaoWangzj/CorrAdaptor.

INTERACT · 協同過濾 · 圖 · SimPLe · 特征提取 ·

2024 年 8 月 11 日

GraphTransfer: A Generic Feature Fusion Framework for Collaborative Filtering

Jiafeng Xia,Dongsheng Li,Hansu Gu,Tun Lu,Ning Gu

Graph Neural Networks (GNNs) have demonstrated effectiveness in collaborative filtering tasks due to their ability to extract powerful structural features. However, combining the graph features extracted from user-item interactions and auxiliary features extracted from user genres and item properties remains a challenge. Currently available fusion methods face two major issues: 1) simple methods such as concatenation and summation are generic, but not accurate in capturing feature relationships; 2) task-specific methods like attention mechanisms and meta paths may not be suitable for general feature fusion. To address these challenges, we present GraphTransfer, a simple but universal feature fusion framework for GNN-based collaborative filtering. Our method accurately fuses different types of features by first extracting graph features from the user-item interaction graph and auxiliary features from users and items using GCN. The proposed cross fusion module then effectively bridges the semantic gaps between the interaction scores of different features. Theoretical analysis and experiments on public datasets show that GraphTransfer outperforms other feature fusion methods in CF tasks. Additionally, we demonstrate the universality of our framework via empirical studies in three other scenarios, showing that GraphTransfer leads to significant improvements in the performance of CF algorithms.

多峰值 · Networking · Automator · MoDELS · INFORMS ·

2024 年 8 月 7 日

HiQuE: Hierarchical Question Embedding Network for Multimodal Depression Detection

Juho Jung,Chaewon Kang,Jeewoo Yoon,Seungbae Kim,Jinyoung Han

from arxiv, 11 pages, 6 figures, Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (CIKM '24)

The utilization of automated depression detection significantly enhances early intervention for individuals experiencing depression. Despite numerous proposals on automated depression detection using recorded clinical interview videos, limited attention has been paid to considering the hierarchical structure of the interview questions. In clinical interviews for diagnosing depression, clinicians use a structured questionnaire that includes routine baseline questions and follow-up questions to assess the interviewee's condition. This paper introduces HiQuE (Hierarchical Question Embedding network), a novel depression detection framework that leverages the hierarchical relationship between primary and follow-up questions in clinical interviews. HiQuE can effectively capture the importance of each question in diagnosing depression by learning mutual information across multiple modalities. We conduct extensive experiments on the widely-used clinical interview data, DAIC-WOZ, where our model outperforms other state-of-the-art multimodal depression detection models and emotion recognition models, showcasing its clinical utility in depression detection.

泛化理論 · Performer · 相互獨立的 · 可辨認的 · 目標領域 ·

2024 年 8 月 7 日

InPer: Whole-Process Domain Generalization via Causal Intervention and Perturbation

Luyao Tang,Yuxuan Yuan,Chaoqi Chen,Xinghao Ding,Yue Huang

from arxiv, Accepted by BMVC2024

Despite the considerable advancements achieved by deep neural networks, their performance tends to degenerate when the test environment diverges from the training ones. Domain generalization (DG) solves this issue by learning representations independent of domain-related information, thus facilitating extrapolation to unseen environments. Existing approaches typically focus on formulating tailored training objectives to extract shared features from the source data. However, the disjointed training and testing procedures may compromise robustness, particularly in the face of unforeseen variations during deployment. In this paper, we propose a novel and holistic framework based on causality, named InPer, designed to enhance model generalization by incorporating causal intervention during training and causal perturbation during testing. Specifically, during the training phase, we employ entropy-based causal intervention (EnIn) to refine the selection of causal variables. To identify samples with anti-interference causal variables from the target domain, we propose a novel metric, homeostatic score, through causal perturbation (HoPer) to construct a prototype classifier in test time. Experimental results across multiple cross-domain tasks confirm the efficacy of InPer.

Automator · Performer · 設計 · 優化器 · 變換 ·

2024 年 8 月 6 日

INSIGHT: Universal Neural Simulator for Analog Circuits Harnessing Autoregressive Transformers

Souradip Poddar,Youngmin Oh,Yao Lai,Hanqing Zhu,Bosun Hwang,David Z. Pan

Analog front-end design heavily relies on specialized human expertise and costly trial-and-error simulations, which motivated many prior works on analog design automation. However, efficient and effective exploration of the vast and complex design space remains constrained by the time-consuming nature of SPICE simulations, making effective design automation a challenging endeavor. In this paper, we introduce INSIGHT, a GPU-powered, technology-agnostic, effective universal neural simulator in the analog front-end design automation loop. INSIGHT accurately predicts the performance metrics of analog circuits across various technologies with just a few microseconds of inference time. Notably, its autoregressive capabilities enable INSIGHT to accurately predict simulation-costly critical transient specifications leveraging less expensive performance metric information. The low cost and high fidelity feature make INSIGHT a good substitute for standard simulators in analog front-end optimization frameworks. INSIGHT is compatible with any optimization framework, facilitating enhanced design space exploration for sample efficiency through sophisticated offline learning and adaptation techniques. Our experiments demonstrate that INSIGHT-M, a model-based batch reinforcement learning sizing framework with INSIGHT as the accurate surrogate, only requires < 20 real-time simulations with 100-1000x lower simulation costs and significant speedup over existing sizing methods.

TEAM · MoDELS · 語言模型化 · Microsoft Surface · 多樣性 ·

2024 年 8 月 6 日

STAR: SocioTechnical Approach to Red Teaming Language Models

Laura Weidinger,John Mellor,Bernat Guillen Pegueroles,Nahema Marchal,Ravin Kumar,Kristian Lum,Canfer Akbulut,Mark Diaz,Stevie Bergman,Mikel Rodriguez,Verena Rieser,William Isaac

from arxiv, 8 pages, 5 figures, 5 pages appendix. * denotes equal contribution

This research introduces STAR, a sociotechnical framework that improves on current best practices for red teaming safety of large language models. STAR makes two key contributions: it enhances steerability by generating parameterised instructions for human red teamers, leading to improved coverage of the risk surface. Parameterised instructions also provide more detailed insights into model failures at no increased cost. Second, STAR improves signal quality by matching demographics to assess harms for specific groups, resulting in more sensitive annotations. STAR further employs a novel step of arbitration to leverage diverse viewpoints and improve label reliability, treating disagreement not as noise but as a valuable contribution to signal quality.