顾美玲国产一区二区三区_人人婷婷色综合五月第四人色阁_欧美精九九99久久在免费线_伊人婷婷色香五月综合缴激情_毛成片1卡2卡3卡4卡_欧美日韩福利视频一区二区三区_中文免费不卡一区二区三区

Understanding terrain topology at long-range is crucial for the success of off-road robotic missions, especially when navigating at high-speeds. LiDAR sensors, which are currently heavily relied upon for geometric mapping, provide sparse measurements when mapping at greater distances. To address this challenge, we present a novel learning-based approach capable of predicting terrain elevation maps at long-range using only onboard egocentric images in real-time. Our proposed method is comprised of three main elements. First, a transformer-based encoder is introduced that learns cross-view associations between the egocentric views and prior bird-eye-view elevation map predictions. Second, an orientation-aware positional encoding is proposed to incorporate the 3D vehicle pose information over complex unstructured terrain with multi-view visual image features. Lastly, a history-augmented learn-able map embedding is proposed to achieve better temporal consistency between elevation map predictions to facilitate the downstream navigational tasks. We experimentally validate the applicability of our proposed approach for autonomous offroad robotic navigation in complex and unstructured terrain using real-world offroad driving data. Furthermore, the method is qualitatively and quantitatively compared against the current state-of-the-art methods. Extensive field experiments demonstrate that our method surpasses baseline models in accurately predicting terrain elevation while effectively capturing the overall terrain topology at long-ranges. Finally, ablation studies are conducted to highlight and understand the effect of key components of the proposed approach and validate their suitability to improve offroad robotic navigation capabilities.

相關內容

Elevate

關注 2

一款腦力訓練應用。

Networking · Analysis · Networks · 結點 · Performer ·

2024 年 6 月 4 日

A Practical Approach for Exploring Granger Connectivity in High-Dimensional Networks of Time Series

Sipan Aslan,Hernando Ombao

This manuscript presents a novel method for discovering effective connectivity between specified pairs of nodes in a high-dimensional network of time series. To accurately perform Granger causality analysis from the first node to the second node, it is essential to eliminate the influence of all other nodes within the network. The approach proposed is to create a low-dimensional representation of all other nodes in the network using frequency-domain-based dynamic principal component analysis (spectral DPCA). The resulting scores are subsequently removed from the first and second nodes of interest, thus eliminating the confounding effect of other nodes within the high-dimensional network. To conduct hypothesis testing on Granger causality, we propose a permutation-based causality test. This test enhances the accuracy of our findings when the error structures are non-Gaussian. The approach has been validated in extensive simulation studies, which demonstrate the efficacy of the methodology as a tool for causality analysis in complex time series networks. The proposed methodology has also been demonstrated to be both expedient and viable on real datasets, with particular success observed on multichannel EEG networks.

MoDELS · 穩健性 · Networking · Neural Networks · 可辨認的 ·

2024 年 6 月 3 日

From Feature Visualization to Visual Circuits: Effect of Adversarial Model Manipulation

Geraldin Nanfack,Michael Eickenberg,Eugene Belilovsky

from arxiv, Under review

Understanding the inner working functionality of large-scale deep neural networks is challenging yet crucial in several high-stakes applications. Mechanistic inter- pretability is an emergent field that tackles this challenge, often by identifying human-understandable subgraphs in deep neural networks known as circuits. In vision-pretrained models, these subgraphs are usually interpreted by visualizing their node features through a popular technique called feature visualization. Recent works have analyzed the stability of different feature visualization types under the adversarial model manipulation framework. This paper starts by addressing limitations in existing works by proposing a novel attack called ProxPulse that simultaneously manipulates the two types of feature visualizations. Surprisingly, when analyzing these attacks under the umbrella of visual circuits, we find that visual circuits show some robustness to ProxPulse. We, therefore, introduce a new attack based on ProxPulse that unveils the manipulability of visual circuits, shedding light on their lack of robustness. The effectiveness of these attacks is validated using pre-trained AlexNet and ResNet-50 models on ImageNet.

Performer · ForCES · 查準率/準確率 · anchor · 講稿 ·

2024 年 6 月 3 日

On the Existence of Static Equilibria of a Cable-Suspended Load with Non-stopping Flying Carriers

Chiara Gabellieri,Antonio Franchi

This work answers positively the question whether non-stop flights are possible for maintaining constant the pose of cable-suspended objects. Such a counterintuitive answer paves the way for a paradigm shift where energetically efficient fixed-wing flying carriers can replace the inefficient multirotor carriers that have been used so far in precise cooperative cable-suspended aerial manipulation. First, we show that one or two flying carriers alone cannot perform non-stop flights while maintaining a constant pose of the suspended object. Instead, we prove that three flying carriers can achieve this task provided that the orientation of the load at the equilibrium is such that the components of the cable forces that balance the external force (typically gravity) do not belong to the plane of the cable anchoring points on the load. Numerical tests are presented in support of the analytical results.

語音合成 · 自編碼器 · 變分自編碼 · Speech Com · Performer ·

2024 年 6 月 3 日

Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder

Jan Melechovsky,Ambuj Mehrish,Berrak Sisman,Dorien Herremans

from arxiv, preprint submitted to a conference, under review

Accent plays a significant role in speech communication, influencing one's capability to understand as well as conveying a person's identity. This paper introduces a novel and efficient framework for accented Text-to-Speech (TTS) synthesis based on a Conditional Variational Autoencoder. It has the ability to synthesize a selected speaker's voice, which is converted to any desired target accent. Our thorough experiments validate the effectiveness of the proposed framework using both objective and subjective evaluations. The results also show remarkable performance in terms of the ability to manipulate accents in the synthesized speech and provide a promising avenue for future accented TTS research.

泛化理論 · Performer · MoDELS · 目標領域 · Performance ·

2024 年 5 月 31 日

HD Maps are Lane Detection Generalizers: A Novel Generative Framework for Single-Source Domain Generalization

Daeun Lee,Minhyeok Heo,Jiwon Kim

from arxiv, Accepted by CVPR Data-Driven Autonomous Driving Simulation Workshop, 2024

Lane detection is a vital task for vehicles to navigate and localize their position on the road. To ensure reliable driving, lane detection models must have robust generalization performance in various road environments. However, despite the advanced performance in the trained domain, their generalization performance still falls short of expectations due to the domain discrepancy. To bridge this gap, we propose a novel generative framework using HD Maps for Single-Source Domain Generalization (SSDG) in lane detection. We first generate numerous front-view images from lane markings of HD Maps. Next, we strategically select a core subset among the generated images using (i) lane structure and (ii) road surrounding criteria to maximize their diversity. In the end, utilizing this core set, we train lane detection models to boost their generalization performance. We validate that our generative framework from HD Maps outperforms the Domain Adaptation model MLDA with +3.01%p accuracy improvement, even though we do not access the target domain images.

Agent · Learning · Better · AIM · CASES ·

2024 年 5 月 31 日

Paying to Do Better: Games with Payments between Learning Agents

Yoav Kolumbus,Joe Halpern,éva Tardos

In repeated games, such as auctions, players typically use learning algorithms to choose their actions. The use of such autonomous learning agents has become widespread on online platforms. In this paper, we explore the impact of players incorporating monetary transfers into their agents' algorithms, aiming to incentivize behavior in their favor. Our focus is on understanding when players have incentives to make use of monetary transfers, how these payments affect learning dynamics, and what the implications are for welfare and its distribution among the players. We propose a simple game-theoretic model to capture such scenarios. Our results on general games show that in a broad class of games, players benefit from letting their learning agents make payments to other learners during the game dynamics, and that in many cases, this kind of behavior improves welfare for all players. Our results on first- and second-price auctions show that in equilibria of the ``payment policy game,'' the agents' dynamics can reach strong collusive outcomes with low revenue for the auctioneer. These results highlight a challenge for mechanism design in systems where automated learning agents can benefit from interacting with their peers outside the boundaries of the mechanism.

語言模型化 · MoDELS · 級聯 · Better · 分解 ·

2024 年 5 月 30 日

SeamlessExpressiveLM: Speech Language Model for Expressive Speech-to-Speech Translation with Chain-of-Thought

Hongyu Gong,Bandhav Veluri

Expressive speech-to-speech translation (S2ST) is a key research topic in seamless communication, which focuses on the preservation of semantics and speaker vocal style in translated speech. Early works synthesized speaker style aligned speech in order to directly learn the mapping from speech to target speech spectrogram. Without reliance on style aligned data, recent studies leverage the advances of language modeling (LM) and build cascaded LMs on semantic and acoustic tokens. This work proposes SeamlessExpressiveLM, a single speech language model for expressive S2ST. We decompose the complex source-to-target speech mapping into intermediate generation steps with chain-of-thought prompting. The model is first guided to translate target semantic content and then transfer the speaker style to multi-stream acoustic units. Evaluated on Spanish-to-English and Hungarian-to-English translations, SeamlessExpressiveLM outperforms cascaded LMs in both semantic quality and style transfer, meanwhile achieving better parameter efficiency.

數據集 · GROUP · Elevate · 評論員 · 生物特征識別 ·

2022 年 11 月 3 日

Expanding Accurate Person Recognition to New Altitudes and Ranges: The BRIAR Dataset

David Cornett III,Joel Brogan,Nell Barber,Deniz Aykac,Seth Baird,Nick Burchfield,Carl Dukes,Andrew Duncan,Regina Ferrell,Jim Goddard,Gavin Jager,Matt Larson,Bart Murphy,Christi Johnson,Ian Shelley,Nisha Srinivas,Brandon Stockwell,Leanne Thompson,Matt Yohe,Robert Zhang,Scott Dolvin,Hector J. Santos-Villalobos,David S. Bolme

Face recognition technology has advanced significantly in recent years due largely to the availability of large and increasingly complex training datasets for use in deep learning models. These datasets, however, typically comprise images scraped from news sites or social media platforms and, therefore, have limited utility in more advanced security, forensics, and military applications. These applications require lower resolution, longer ranges, and elevated viewpoints. To meet these critical needs, we collected and curated the first and second subsets of a large multi-modal biometric dataset designed for use in the research and development (R&D) of biometric recognition technologies under extremely challenging conditions. Thus far, the dataset includes more than 350,000 still images and over 1,300 hours of video footage of approximately 1,000 subjects. To collect this data, we used Nikon DSLR cameras, a variety of commercial surveillance cameras, specialized long-rage R&D cameras, and Group 1 and Group 2 UAV platforms. The goal is to support the development of algorithms capable of accurately recognizing people at ranges up to 1,000 m and from high angles of elevation. These advances will include improvements to the state of the art in face recognition and will support new research in the area of whole-body recognition using methods based on gait and anthropometry. This paper describes methods used to collect and curate the dataset, and the dataset's characteristics at the current stage.

Machine Learning · 學成 · 可辨認的 · MoDELS · 可理解性 ·

2020 年 10 月 20 日

Counterfactual Explanations for Machine Learning: A Review

Sahil Verma,John Dickerson,Keegan Hines

from arxiv, 10 pages

Machine learning plays a role in many deployed decision systems, often in ways that are difficult or impossible to understand by human stakeholders. Explaining, in a human-understandable way, the relationship between the input and output of machine learning models is essential to the development of trustworthy machine-learning-based systems. A burgeoning body of research seeks to define the goals and methods of explainability in machine learning. In this paper, we seek to review and categorize research on counterfactual explanations, a specific class of explanation that provides a link between what could have happened had input to a model been changed in a particular way. Modern approaches to counterfactual explainability in machine learning draw connections to the established legal doctrine in many countries, making them appealing to fielded systems in high-impact areas such as finance and healthcare. Thus, we design a rubric with desirable properties of counterfactual explanation algorithms and comprehensively evaluate all currently-proposed algorithms against that rubric. Our rubric provides easy comparison and comprehension of the advantages and disadvantages of different approaches and serves as an introduction to major research themes in this field. We also identify gaps and discuss promising research directions in the space of counterfactual explainability.

采樣法 · 方差 · 圖形處理器 · INFORMS · 泛化理論 ·

2020 年 6 月 24 日

Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks

Weilin Cong,Rana Forsati,Mahmut Kandemir,Mehrdad Mahdavi

Sampling methods (e.g., node-wise, layer-wise, or subgraph) has become an indispensable strategy to speed up training large-scale Graph Neural Networks (GNNs). However, existing sampling methods are mostly based on the graph structural information and ignore the dynamicity of optimization, which leads to high variance in estimating the stochastic gradients. The high variance issue can be very pronounced in extremely large graphs, where it results in slow convergence and poor generalization. In this paper, we theoretically analyze the variance of sampling methods and show that, due to the composite structure of empirical risk, the variance of any sampling method can be decomposed into \textit{embedding approximation variance} in the forward stage and \textit{stochastic gradient variance} in the backward stage that necessities mitigating both types of variance to obtain faster convergence rate. We propose a decoupled variance reduction strategy that employs (approximate) gradient information to adaptively sample nodes with minimal variance, and explicitly reduces the variance introduced by embedding approximation. We show theoretically and empirically that the proposed method, even with smaller mini-batch sizes, enjoys a faster convergence rate and entails a better generalization compared to the existing methods.