一级a视频免费一区二区_国产亚洲一区二区三区在线_色偷偷噜噜噜亚洲男人的天堂_欧美一区二区成人久久片爱_欧美日韩高清在线不卡区_久在线精品一区二区三区_免费观看又黄又爽的网站

The safety of automated vehicles (AVs) relies on the representation of their environment. Consequently, state-of-the-art AVs employ potent sensor systems to achieve the best possible environment representation at all times. Although these high-performing systems achieve impressive results, they induce significant requirements for the processing capabilities of an AV's computational hardware components and their energy consumption. To enable a dynamic adaptation of such perception systems based on the situational perception requirements, we introduce a model-agnostic method for the scalable employment of single-frame object detection models using frame-dropping in tracking-by-detection systems. We evaluate our approach on the KITTI 3D Tracking Benchmark, showing that significant energy savings can be achieved at acceptable performance degradation, reaching up to 28% reduction of energy consumption at a performance decline of 6.6% in HOTA score.

相關內容

Performer

關注 10

語音識別 · 自動語音識別 · 無監督學習 · 無監督 · Learning ·

2023 年 9 月 25 日

Leveraging Data Collection and Unsupervised Learning for Code-switched Tunisian Arabic Automatic Speech Recognition

Ahmed Amine Ben Abdallah,Ata Kabboudi,Amir Kanoun,Salah Zaiem

from arxiv, 6 pages, submitted to ICASSP 2024

Crafting an effective Automatic Speech Recognition (ASR) solution for dialects demands innovative approaches that not only address the data scarcity issue but also navigate the intricacies of linguistic diversity. In this paper, we address the aforementioned ASR challenge, focusing on the Tunisian dialect. First, textual and audio data is collected and in some cases annotated. Second, we explore self-supervision, semi-supervision and few-shot code-switching approaches to push the state-of-the-art on different Tunisian test sets; covering different acoustic, linguistic and prosodic conditions. Finally, and given the absence of conventional spelling, we produce a human evaluation of our transcripts to avoid the noise coming from spelling inadequacies in our testing references. Our models, allowing to transcribe audio samples in a linguistic mix involving Tunisian Arabic, English and French, and all the data used during training and testing are released for public use and further improvements.

假正例率 · 假陽性 · 可理解性 · CRAFT · Analysis ·

2023 年 9 月 22 日

On Data Fabrication in Collaborative Vehicular Perception: Attacks and Countermeasures

Qingzhao Zhang,Shuowei Jin,Jiachen Sun,Xumiao Zhang,Ruiyang Zhu,Qi Alfred Chen,Z. Morley Mao

from arxiv, 20 pages, 24 figures, accepted by Usenix Security 2024

Collaborative perception, which greatly enhances the sensing capability of connected and autonomous vehicles (CAVs) by incorporating data from external resources, also brings forth potential security risks. CAVs' driving decisions rely on remote untrusted data, making them susceptible to attacks carried out by malicious participants in the collaborative perception system. However, security analysis and countermeasures for such threats are absent. To understand the impact of the vulnerability, we break the ground by proposing various real-time data fabrication attacks in which the attacker delivers crafted malicious data to victims in order to perturb their perception results, leading to hard brakes or increased collision risks. Our attacks demonstrate a high success rate of over 86\% on high-fidelity simulated scenarios and are realizable in real-world experiments. To mitigate the vulnerability, we present a systematic anomaly detection approach that enables benign vehicles to jointly reveal malicious fabrication. It detects 91.5% of attacks with a false positive rate of 3% in simulated scenarios and significantly mitigates attack impacts in real-world scenarios.

圖 · INFORMS · 機器人 · Pivotal（公司） · MoDELS ·

2023 年 9 月 21 日

SG-Bot: Object Rearrangement via Coarse-to-Fine Robotic Imagination on Scene Graphs

Guangyao Zhai,Xiaoni Cai,Dianye Huang,Yan Di,Fabian Manhardt,Federico Tombari,Nassir Navab,Benjamin Busam

from arxiv, 8 pages, 6 figures. A video is uploaded here: //youtu.be/cA8wdfofAG4

Object rearrangement is pivotal in robotic-environment interactions, representing a significant capability in embodied AI. In this paper, we present SG-Bot, a novel rearrangement framework that utilizes a coarse-to-fine scheme with a scene graph as the scene representation. Unlike previous methods that rely on either known goal priors or zero-shot large models, SG-Bot exemplifies lightweight, real-time, and user-controllable characteristics, seamlessly blending the consideration of commonsense knowledge with automatic generation capabilities. SG-Bot employs a three-fold procedure--observation, imagination, and execution--to adeptly address the task. Initially, objects are discerned and extracted from a cluttered scene during the observation. These objects are first coarsely organized and depicted within a scene graph, guided by either commonsense or user-defined criteria. Then, this scene graph subsequently informs a generative model, which forms a fine-grained goal scene considering the shape information from the initial scene and object semantics. Finally, for execution, the initial and envisioned goal scenes are matched to formulate robotic action policies. Experimental results demonstrate that SG-Bot outperforms competitors by a large margin.

統計量 · Processing（編程語言） · 無監督 · LIDAR · Performer ·

2023 年 9 月 21 日

Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features

Travis Zhang,Katie Luo,Cheng Perng Phoo,Yurong You,Wei-Lun Chao,Bharath Hariharan,Mark Campbell,Kilian Q. Weinberger

The rapid development of 3D object detection systems for self-driving cars has significantly improved accuracy. However, these systems struggle to generalize across diverse driving environments, which can lead to safety-critical failures in detecting traffic participants. To address this, we propose a method that utilizes unlabeled repeated traversals of multiple locations to adapt object detectors to new driving environments. By incorporating statistics computed from repeated LiDAR scans, we guide the adaptation process effectively. Our approach enhances LiDAR-based detection models using spatial quantized historical features and introduces a lightweight regression head to leverage the statistics for feature regularization. Additionally, we leverage the statistics for a novel self-training process to stabilize the training. The framework is detector model-agnostic and experiments on real-world datasets demonstrate significant improvements, achieving up to a 20-point performance gain, especially in detecting pedestrians and distant objects. Code is available at //github.com/zhangtravis/Hist-DA.

CMS · 成比例 · 語音合成 · 掩碼 · 連結 ·

2023 年 9 月 21 日

The Impact of Silence on Speech Anti-Spoofing

Yuxiang Zhang,Zhuo Li,Jingze Lu,Hua Hua,Wenchao Wang,Pengyuan Zhang

from arxiv, 16 pages, 9 figures, 13 tables

The current speech anti-spoofing countermeasures (CMs) show excellent performance on specific datasets. However, removing the silence of test speech through Voice Activity Detection (VAD) can severely degrade performance. In this paper, the impact of silence on speech anti-spoofing is analyzed. First, the reasons for the impact are explored, including the proportion of silence duration and the content of silence. The proportion of silence duration in spoof speech generated by text-to-speech (TTS) algorithms is lower than that in bonafide speech. And the content of silence generated by different waveform generators varies compared to bonafide speech. Then the impact of silence on model prediction is explored. Even after retraining, the spoof speech generated by neural network based end-to-end TTS algorithms suffers a significant rise in error rates when the silence is removed. To demonstrate the reasons for the impact of silence on CMs, the attention distribution of a CM is visualized through class activation mapping (CAM). Furthermore, the implementation and analysis of the experiments masking silence or non-silence demonstrates the significance of the proportion of silence duration for detecting TTS and the importance of silence content for detecting voice conversion (VC). Based on the experimental results, improving the robustness of CMs against unknown spoofing attacks by masking silence is also proposed. Finally, the attacks on anti-spoofing CMs through concatenating silence, and the mitigation of VAD and silence attack through low-pass filtering are introduced.

SLAM · Performer · Integration · 置信度 · 稀疏 ·

2023 年 9 月 20 日

Depth Completion with Multiple Balanced Bases and Confidence for Dense Monocular SLAM

Weijian Xie,Guanyi Chu,Quanhao Qian,Yihao Yu,Hai Li,Danpeng Chen,Shangjin Zhai,Nan Wang,Hujun Bao,Guofeng Zhang

Dense SLAM based on monocular cameras does indeed have immense application value in the field of AR/VR, especially when it is performed on a mobile device. In this paper, we propose a novel method that integrates a light-weight depth completion network into a sparse SLAM system using a multi-basis depth representation, so that dense mapping can be performed online even on a mobile phone. Specifically, we present a specifically optimized multi-basis depth completion network, called BBC-Net, tailored to the characteristics of traditional sparse SLAM systems. BBC-Net can predict multiple balanced bases and a confidence map from a monocular image with sparse points generated by off-the-shelf keypoint-based SLAM systems. The final depth is a linear combination of predicted depth bases that can be optimized by tuning the corresponding weights. To seamlessly incorporate the weights into traditional SLAM optimization and ensure efficiency and robustness, we design a set of depth weight factors, which makes our network a versatile plug-in module, facilitating easy integration into various existing sparse SLAM systems and significantly enhancing global depth consistency through bundle adjustment. To verify the portability of our method, we integrate BBC-Net into two representative SLAM systems. The experimental results on various datasets show that the proposed method achieves better performance in monocular dense mapping than the state-of-the-art methods. We provide an online demo running on a mobile phone, which verifies the efficiency and mapping quality of the proposed method in real-world scenarios.

復合數據 · 可穿戴設備 · 求逆 · INFORMS · Integration ·

2021 年 6 月 2 日

IoT Solutions with Multi-Sensor Fusion and Signal-Image Encoding for Secure Data Transfer and Decision Making

Piyush K. Sharma,Mark Dennison,Adrienne Raglin

from arxiv, Advances in Mass Data Analysis of Images and Signals in Artificial Intelligence and Pattern Recognition 15th International Conference, MDA 2020 Amsterdam, The Netherlands, July 20-21, 2020. //www.ibai-publishing.org/html/proceedings_2020/pdf/proceedings_book_MDA-AI&PR_2020.pdf

Deployment of Internet of Things (IoT) devices and Data Fusion techniques have gained popularity in public and government domains. This usually requires capturing and consolidating data from multiple sources. As datasets do not necessarily originate from identical sensors, fused data typically results in a complex data problem. Because military is investigating how heterogeneous IoT devices can aid processes and tasks, we investigate a multi-sensor approach. Moreover, we propose a signal to image encoding approach to transform information (signal) to integrate (fuse) data from IoT wearable devices to an image which is invertible and easier to visualize supporting decision making. Furthermore, we investigate the challenge of enabling an intelligent identification and detection operation and demonstrate the feasibility of the proposed Deep Learning and Anomaly Detection models that can support future application that utilizes hand gesture data from wearable devices.

Vision · 模型評估 · 可約的 · 計算機視覺 · DNN ·

2020 年 3 月 24 日

A Survey of Methods for Low-Power Deep Learning and Computer Vision

Abhinav Goel,Caleb Tung,Yung-Hsiang Lu,George K. Thiruvathukal

from arxiv, Accepted for publication at 2020 IEEE 6th World Forum on Internet of Things (WF-IoT), New Orleans, LA, USA 2020

Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy, computation and memory intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically in regards to inference, and discusses the methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.

Networking · Neural Networks · MoDELS · Performer · 模型性能 ·

2019 年 9 月 8 日

A Survey of Model Compression and Acceleration for Deep Neural Networks

Yu Cheng,Duo Wang,Pan Zhou,Tao Zhang

from arxiv, Published in IEEE Signal Processing Magazine, arXiv version including some recent works

Deep convolutional neural networks (CNNs) have recently achieved great success in many visual recognition tasks. However, existing deep neural network models are computationally expensive and memory intensive, hindering their deployment in devices with low memory resources or in applications with strict latency requirements. Therefore, a natural thought is to perform model compression and acceleration in deep networks without significantly decreasing the model performance. During the past few years, tremendous progress has been made in this area. In this paper, we survey the recent advanced techniques for compacting and accelerating CNNs model developed. These techniques are roughly categorized into four schemes: parameter pruning and sharing, low-rank factorization, transferred/compact convolutional filters, and knowledge distillation. Methods of parameter pruning and sharing will be described at the beginning, after that the other techniques will be introduced. For each scheme, we provide insightful analysis regarding the performance, related applications, advantages, and drawbacks etc. Then we will go through a few very recent additional successful methods, for example, dynamic capacity networks and stochastic depths networks. After that, we survey the evaluation matrix, the main datasets used for evaluating the model performance and recent benchmarking efforts. Finally, we conclude this paper, discuss remaining challenges and possible directions on this topic.

知識表示 · Things · 推薦系統 · MoDELS · 邊 ·

2018 年 5 月 10 日

A Unified Knowledge Representation and Context-aware Recommender System in Internet of Things

Yinhao Li,Awa Alqahtani,Ellis Solaiman,Charith Perera,Prem Prakash Jayaraman,Boualem Benatallah,Rajiv Ranjan

Within the rapidly developing Internet of Things (IoT), numerous and diverse physical devices, Edge devices, Cloud infrastructure, and their quality of service requirements (QoS), need to be represented within a unified specification in order to enable rapid IoT application development, monitoring, and dynamic reconfiguration. But heterogeneities among different configuration knowledge representation models pose limitations for acquisition, discovery and curation of configuration knowledge for coordinated IoT applications. This paper proposes a unified data model to represent IoT resource configuration knowledge artifacts. It also proposes IoT-CANE (Context-Aware recommendatioN systEm) to facilitate incremental knowledge acquisition and declarative context driven knowledge recommendation.