We consider the problem of cross-sensor domain adaptation in the context of LiDAR-based 3D object detection and propose Stationary Object Aggregation Pseudo-labelling (SOAP) to generate high-quality pseudo-labels for stationary objects. In contrast to the current state-of-the-art in-domain practice of aggregating just a few input scans, SOAP aggregates entire sequences of point clouds at the input level to reduce the sensor domain gap. Then, by means of what we call quasi-stationary training and spatial consistency post-processing, the SOAP model generates accurate pseudo-labels for stationary objects, closing at least 30.3% of the domain gap compared to few-frame detectors. Our results also show that state-of-the-art domain adaptation approaches can achieve even greater performance in combination with SOAP, in both the unsupervised and semi-supervised settings.
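The input-level aggregation described in the abstract can be pictured with a minimal sketch: every scan of a sequence is moved into one shared world frame and the points are concatenated, so a stationary object accumulates points from all viewpoints. The names `to_world` and `aggregate_scans`, and the planar `(tx, ty, yaw)` pose format, are assumptions made for illustration and are not taken from the SOAP paper.

```python
import math

def to_world(point, pose):
    """Move one (x, y, z) point from the sensor frame into the world frame.

    `pose` is a hypothetical (tx, ty, yaw) ego pose: planar translation plus heading.
    """
    x, y, z = point
    tx, ty, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * x - s * y + tx, s * x + c * y + ty, z)

def aggregate_scans(scans, poses):
    """Concatenate all scans of a sequence after mapping them to the world frame."""
    if len(scans) != len(poses):
        raise ValueError("need exactly one pose per scan")
    merged = []
    for scan, pose in zip(scans, poses):
        merged.extend(to_world(p, pose) for p in scan)
    return merged

# A stationary object at world position (10, 0, 0), observed from two ego positions.
scans = [[(10.0, 0.0, 0.0)], [(5.0, 0.0, 0.0)]]
poses = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
cloud = aggregate_scans(scans, poses)
```

Both observations land on the same world coordinates, which is why aggregation densifies stationary objects while moving objects would smear across the merged cloud.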

Related content

Pseudo-labelling

Layer · Channel · Optimizer · INFORMS · Performance

February 22, 2024

Semantic Communication-assisted Physical Layer Security over Fading Wiretap Channels

Xidong Mu, Yuanwei Liu

from arXiv, 6 pages, 3 figures, accepted by ICC 2024

A novel semantic communication (SC)-assisted secrecy transmission framework is proposed. In particular, the legitimate transmitter (Tx) sends the superimposed semantic and bit stream to the legitimate receiver (Rx), where the information may be eavesdropped by the malicious node (EVE). As the EVE has only the conventional bit-oriented communication structure, the semantic signal acts as a type of beneficial information-bearing artificial noise (AN), which not only remains strictly confidential from the EVE but also interferes with the EVE. The ergodic (equivalent) secrecy rate over fading wiretap channels is maximized by jointly optimizing the transmit power, semantic-bit power splitting ratio, and the successive interference cancellation decoding order at the Tx, subject to both the instantaneous peak and long-term average power constraints. To address this non-convex problem, both optimal and suboptimal algorithms are developed by employing the Lagrangian dual method and the successive convex approximation method, respectively. Numerical results show that the proposed SC-assisted secrecy transmission scheme can significantly enhance the physical layer security compared to the baselines using the conventional bit-oriented communication and no-information-bearing AN. They also show that the proposed suboptimal algorithm can achieve near-optimal performance.
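For orientation, the conventional bit-only ergodic secrecy rate over a fading wiretap channel can be estimated by Monte Carlo, as in the sketch below. It averages the positive part of the gap between the legitimate link rate and the eavesdropper link rate. This is the textbook quantity only, not the paper's "equivalent" secrecy rate with semantic signals, and it ignores the power constraints and the optimization. The function name, the Rayleigh fading model (power gains drawn as Exponential(1)) and the SNR arguments are assumptions for illustration.

```python
import math
import random

def ergodic_secrecy_rate(snr_rx, snr_eve, samples=20000, seed=0):
    """Estimate E[(log2(1 + g_rx * snr_rx) - log2(1 + g_eve * snr_eve))^+].

    g_rx and g_eve are independent fading power gains, assumed Exponential(1)
    (Rayleigh fading). SNRs are linear, not in dB.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        gain_rx = rng.expovariate(1.0)
        gain_eve = rng.expovariate(1.0)
        rate_rx = math.log2(1.0 + gain_rx * snr_rx)
        rate_eve = math.log2(1.0 + gain_eve * snr_eve)
        # Secrecy is only achieved in fading states where the Rx link is better.
        total += max(rate_rx - rate_eve, 0.0)
    return total / samples

strong_link = ergodic_secrecy_rate(snr_rx=100.0, snr_eve=1.0)
weak_link = ergodic_secrecy_rate(snr_rx=1.0, snr_eve=1.0)
```

With the same seed both calls see identical fading draws, so the comparison isolates the effect of the legitimate link's SNR.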

Question answering · MoDELS · Better · Performance · Diversity

February 22, 2024

Silver Retriever: Advancing Neural Passage Retrieval for Polish Question Answering

Piotr Rybak, Maciej Ogrodniczuk

Modern open-domain question answering systems often rely on accurate and efficient retrieval components to find passages containing the facts necessary to answer the question. Recently, neural retrievers have gained popularity over lexical alternatives due to their superior performance. However, most of the work concerns popular languages such as English or Chinese. For others, such as Polish, few models are available. In this work, we present Silver Retriever, a neural retriever for Polish trained on a diverse collection of manually or weakly labeled datasets. Silver Retriever achieves much better results than other Polish models and is competitive with larger multilingual models. Together with the model, we open-source five new passage retrieval datasets.
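The retrieval step that a neural retriever performs at query time can be sketched as ranking passages by the inner product between a query embedding and precomputed passage embeddings. The toy vectors below stand in for encoder outputs, and the function name `retrieve` is hypothetical; nothing here reflects Silver Retriever's actual encoder or training.

```python
def retrieve(query_vec, passage_vecs, k=2):
    """Return the indices of the k passages with the highest dot-product score."""
    scored = [
        (sum(q * p for q, p in zip(query_vec, vec)), idx)
        for idx, vec in enumerate(passage_vecs)
    ]
    # Sort by descending score; break ties by the lower passage index.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [idx for _, idx in scored[:k]]

# Toy embeddings standing in for the output of a trained encoder.
passages = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
top = retrieve([1.0, 0.2], passages, k=2)
```

In a real system the exhaustive scan would usually be replaced by an approximate nearest-neighbour index, but the scoring rule stays the same.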

Networking · Attention · Constraint · Re-ID · Generalization theory

February 22, 2024

Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-identification

Jiaer Xia, Lei Tan, Pingyang Dai, Mingbo Zhao, Yongjian Wu, Liujuan Cao

from arXiv, AAAI 2024

Occluded person re-identification (Re-ID) aims to address the potential occlusion problem when matching occluded or holistic pedestrians from different camera views. Many methods use the background as artificial occlusion and rely on attention networks to exclude noisy interference. However, the significant discrepancy between simple background occlusion and realistic occlusion can negatively impact the generalization of the network. To address this issue, we propose a novel transformer-based Attention Disturbance and Dual-Path Constraint Network (ADP) to enhance the generalization of attention networks. Firstly, to imitate real-world obstacles, we introduce an Attention Disturbance Mask (ADM) module that generates an offensive noise, which can distract attention like a realistic occluder, as a more complex form of occlusion. Secondly, to fully exploit these complex occluded images, we develop a Dual-Path Constraint Module (DPC) that can obtain preferable supervision information from holistic images through dual-path interaction. With our proposed method, the network can effectively circumvent a wide variety of occlusions using the basic ViT baseline. Comprehensive experimental evaluations conducted on person re-ID benchmarks demonstrate the superiority of ADP over state-of-the-art methods.

LORA · Agent · INTERACT · Performer · Task-oriented dialogue system

February 21, 2024

Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent

Xiaoyan Yu, Tongxu Luo, Yifan Wei, Fangyu Lei, Yiming Huang, Peng Hao, Liehuang Zhu

Large Language Models (LLMs) have revolutionized open-domain dialogue agents but encounter challenges in multi-character role-playing (MCRP) scenarios. To address the issue, we present Neeko, an innovative framework designed for efficient multi-character imitation. Unlike existing methods, Neeko employs a dynamic low-rank adapter (LoRA) strategy, enabling it to adapt seamlessly to diverse characters. Our framework breaks down the role-playing process into agent pre-training, multiple characters playing, and character incremental learning, effectively handling both seen and unseen roles. This dynamic approach, coupled with distinct LoRA blocks for each character, enhances Neeko's adaptability to unique attributes, personalities, and speaking patterns. As a result, Neeko demonstrates superior performance in MCRP over most existing methods, offering more engaging and versatile user interaction experiences. Code and data are available at //github.com/weiyifan1023/Neeko.
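The idea of keeping one frozen base weight and a separate low-rank block per character can be sketched as follows. A linear layer computes `W x` and, when a character is selected, adds the low-rank update `B (A x)`. The class `MultiCharacterLinear`, the dictionary lookup by character name and the fallback to the base layer for unknown characters are assumptions for illustration; the abstract does not specify how Neeko selects or combines its LoRA blocks.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

class MultiCharacterLinear:
    """Frozen base weight plus one low-rank (A, B) adapter pair per character."""

    def __init__(self, weight):
        self.weight = weight  # shape (d_out, d_in), shared by all characters
        self.adapters = {}    # character name -> (A, B)

    def add_character(self, name, a, b):
        """Register A with shape (r, d_in) and B with shape (d_out, r)."""
        self.adapters[name] = (a, b)

    def forward(self, x, character=None):
        y = matvec(self.weight, x)
        if character in self.adapters:
            a, b = self.adapters[character]
            delta = matvec(b, matvec(a, x))  # low-rank update B (A x)
            y = [yi + di for yi, di in zip(y, delta)]
        return y

layer = MultiCharacterLinear([[1.0, 0.0], [0.0, 1.0]])
layer.add_character("pirate", a=[[1.0, 1.0]], b=[[1.0], [0.0]])
base_out = layer.forward([1.0, 2.0])
pirate_out = layer.forward([1.0, 2.0], character="pirate")
```

Adding a new character only adds a new `(A, B)` pair, which is what makes incremental learning of roles cheap relative to fine-tuning the full weight.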

Networking · Pruning · INFORMS · Performer · Robustness

February 20, 2024

SWAP: Sparse Entropic Wasserstein Regression for Robust Network Pruning

Lei You, Hei Victor Cheng

from arXiv, published as a conference paper at ICLR 2024

This study addresses the challenge of inaccurate gradients in computing the empirical Fisher Information Matrix during neural network pruning. We introduce SWAP, a formulation of Entropic Wasserstein regression (EWR) for pruning, capitalizing on the geometric properties of the optimal transport problem. The "swap" of the commonly used linear regression for the EWR in optimization is analytically demonstrated to mitigate noise by incorporating neighborhood interpolation across data points at only marginal additional computational cost. The unique strength of SWAP is its intrinsic ability to balance noise reduction and covariance information preservation effectively. Extensive experiments performed on various networks and datasets show that SWAP performs comparably to state-of-the-art (SoTA) network pruning algorithms. Our proposed method outperforms the SoTA when the network size or the target sparsity is large, and the gain is even larger in the presence of noisy gradients, possibly from noisy data, analog memory, or adversarial attacks. Notably, our proposed method achieves a 6% improvement in accuracy and an 8% improvement in testing loss for MobileNetV1 with less than one-fourth of the network parameters remaining.
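The entropic optimal transport problem that underlies an entropic Wasserstein formulation is commonly solved with Sinkhorn iterations, sketched below in plain Python. This shows only the generic transport-plan computation between two discrete distributions; it is not the SWAP regression or pruning procedure, and the function name, the regularization value `eps` and the iteration count are assumptions for illustration.

```python
import math

def sinkhorn(a, b, cost, eps=0.1, iters=200):
    """Entropic-regularized transport plan between histograms a and b.

    a, b  : marginals (each sums to 1)
    cost  : cost[i][j] of moving mass from bin i of a to bin j of b
    eps   : entropic regularization strength (larger gives a smoother plan)
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

plan = sinkhorn([0.5, 0.5], [0.5, 0.5], [[0.0, 1.0], [1.0, 0.0]])
```

Because the plan spreads a little mass onto neighbouring bins instead of matching points one to one, it gives an intuition for the neighborhood interpolation the abstract credits with noise mitigation.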

Design · Identifiable · Reviewer · motivation · Decomposed

February 20, 2024

User Feedback-Informed Interface Design for Flow Management Data and Services (FMDS)

Sinan Abdulhak, Anthony Carvette, Kate Shen, Robert Goldman, Bill Tuck, Max Z. Li

from arXiv, 8 pages, 8 figures

The transition to a microservices-based Flow Management Data and Services (FMDS) architecture from the existing Traffic Flow Management System (TFMS) is a critical enabler of the vision for an Information-Centric National Airspace System (NAS). The need to design a user-centric interface for FMDS is a key technical gap, as this interface connects NAS data and services to the traffic management specialists within all stakeholder groups (e.g., FAA, airlines). We provide a research-driven approach towards designing such a graphical user interface (GUI) for FMDS. Major goals include unifying the more than 50 disparate traffic management services currently hosted on TFMS, as well as streamlining the process of evaluating, modeling, and monitoring Traffic Management Initiatives (TMIs). Motivated by this, we iteratively designed a GUI leveraging human factors engineering and user experience design principles, as well as user interviews. Through user testing and interviews, we identify workflow benefits of our GUI (e.g., reduction in task completion time), along with next steps for developing a live prototype.

MoDELS · Processing (programming language) · Vision · Continuity · HTTPS

February 20, 2023

Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey

Xiao Wang, Guangyao Chen, Guangwu Qian, Pengcheng Gao, Xiao-Yong Wei, Yaowei Wang, Yonghong Tian, Wen Gao

from arXiv, accepted by Machine Intelligence Research

With the urgent demand for generalized deep models, many pre-trained big models have been proposed, such as BERT, ViT, and GPT. Inspired by the success of these models in single domains (like computer vision and natural language processing), multi-modal pre-trained big models have also drawn more and more attention in recent years. In this work, we give a comprehensive survey of these models and hope this paper can provide new insights and help new researchers track the most cutting-edge work. Specifically, we first introduce the background of multi-modal pre-training by reviewing conventional deep learning and pre-training work in natural language processing, computer vision, and speech. Then, we introduce the task definition, key challenges, and advantages of multi-modal pre-training models (MM-PTMs), and discuss the MM-PTMs with a focus on data, objectives, network architectures, and knowledge-enhanced pre-training. After that, we introduce the downstream tasks used for the validation of large-scale MM-PTMs, including generative, classification, and regression tasks. We also give visualization and analysis of the model parameters and results on representative downstream tasks. Finally, we point out possible research directions for this topic that may benefit future work. In addition, we maintain a continuously updated paper list for large-scale pre-trained multi-modal big models: //github.com/wangxiao5791509/MultiModal_BigModels_Survey

contrastive · Extensibility · GPU · Learned · Networking

May 19, 2021

Self-supervised Heterogeneous Graph Neural Network with Co-contrastive Learning

Xiao Wang, Nian Liu, Hui Han, Chuan Shi

from arXiv, accepted by KDD 2021

Heterogeneous graph neural networks (HGNNs), as an emerging technique, have shown superior capacity for dealing with heterogeneous information networks (HINs). However, most HGNNs follow a semi-supervised learning manner, which notably limits their wide use in reality since labels are usually scarce in real applications. Recently, contrastive learning, a self-supervised method, has become one of the most exciting learning paradigms and shows great potential when there are no labels. In this paper, we study the problem of self-supervised HGNNs and propose a novel co-contrastive learning mechanism for HGNNs, named HeCo. Different from traditional contrastive learning, which only focuses on contrasting positive and negative samples, HeCo employs a cross-view contrastive mechanism. Specifically, two views of a HIN (network schema and meta-path views) are proposed to learn node embeddings, so as to capture both local and high-order structures simultaneously. Then cross-view contrastive learning, as well as a view mask mechanism, is proposed, which is able to extract the positive and negative embeddings from two views. This enables the two views to collaboratively supervise each other and finally learn high-level node embeddings. Moreover, two extensions of HeCo are designed to generate harder negative samples with high quality, which further boosts the performance of HeCo. Extensive experiments conducted on a variety of real-world networks show the superior performance of the proposed methods over the state of the art.
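A cross-view contrastive objective of the general kind described here can be sketched with an InfoNCE-style loss: each node's embedding in one view is pulled toward a positive in the other view and pushed away from the remaining nodes. The sketch makes a simplifying assumption that the only positive for node `i` is the same node in the other view, whereas HeCo defines its positives and its view mask differently. The function names and the temperature `tau` are illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cross_view_info_nce(view_a, view_b, tau=0.5):
    """Mean InfoNCE loss using view_a[i] as anchor and view_b[i] as its positive."""
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / tau for other in view_b]
        log_denominator = math.log(sum(math.exp(value) for value in logits))
        # -log( exp(logit_pos) / sum_j exp(logit_j) )
        total += log_denominator - logits[i]
    return total / len(view_a)

schema_view = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = cross_view_info_nce(schema_view, [[1.0, 0.0], [0.0, 1.0]])
misaligned_loss = cross_view_info_nce(schema_view, [[0.0, 1.0], [1.0, 0.0]])
```

The loss is lower when the two views agree on each node, which is the signal that lets the views supervise each other without labels.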

GPU · Graph · Neural Networks · Networking · entity

November 6, 2019

Multi-Paragraph Reasoning with Knowledge-enhanced Graph Neural Network

Deming Ye, Yankai Lin, Zhenghao Liu, Zhiyuan Liu, Maosong Sun

Multi-paragraph reasoning is indispensable for open-domain question answering (OpenQA), yet it receives less attention in current OpenQA systems. In this work, we propose a knowledge-enhanced graph neural network (KGNN), which performs reasoning over multiple paragraphs with entities. To explicitly capture the entities' relatedness, KGNN utilizes relational facts in a knowledge graph to build the entity graph. The experimental results show that KGNN outperforms baseline methods in both the distractor and full wiki settings on the HotpotQA dataset. Our further analysis illustrates that KGNN is effective and robust with more retrieved paragraphs.

BERT · Language representation · state-of-the-art · Comprehensibility · Question answering

October 11, 2018

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova

from arXiv, 13 pages

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT representations can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE benchmark to 80.4% (7.6% absolute improvement), MultiNLI accuracy to 86.7 (5.6% absolute improvement) and the SQuAD v1.1 question answering Test F1 to 93.2 (1.5% absolute improvement), outperforming human performance by 2.0%.