在线直接观看免费的黄片视频_在线国产视频9999_国产三级精品三级在专区性色_一级毛片免费高清视频在线_色综合AV社区男人的天堂首页_最好看免费观看高清视频大全_亚洲色园色做影视综合网

Annotating 3D LiDAR point clouds for perception tasks including 3D object detection and LiDAR semantic segmentation is notoriously time-and-energy-consuming. To alleviate the burden from labeling, it is promising to perform large-scale pre-training and fine-tune the pre-trained backbone on different downstream datasets as well as tasks. In this paper, we propose SPOT, namely Scalable Pre-training via Occupancy prediction for learning Transferable 3D representations, and demonstrate its effectiveness on various public datasets with different downstream tasks under the label-efficiency setting. Our contributions are threefold: (1) Occupancy prediction is shown to be promising for learning general representations, which is demonstrated by extensive experiments on plenty of datasets and tasks. (2) SPOT uses beam re-sampling technique for point cloud augmentation and applies class-balancing strategies to overcome the domain gap brought by various LiDAR sensors and annotation strategies in different datasets. (3) Scalable pre-training is observed, that is, the downstream performance across all the experiments gets better with more pre-training data. We believe that our findings can facilitate understanding of LiDAR point clouds and pave the way for future exploration in LiDAR pre-training. Codes and models will be released.

相關內容

LIDAR

關注 1

Performer · 語言模型化 · MoDELS · 相似度 · 下游任務 ·

2023 年 11 月 9 日

FDAPT: Federated Domain-adaptive Pre-training for Language Models

Lekang Jiang,Filip Svoboda,Nicholas D. Lane

from arxiv, Accepted at International Workshop on Federated Learning in the Age of Foundation Models in Conjunction with NeurIPS 2023

Foundation models (FMs) have shown prominent success in a wide range of tasks. Their applicability to specific domain-task pairings relies on the availability of, both, high-quality data and significant computational resources. These challenges are not new to the field and, indeed, Federated Learning (FL) has been shown to be a promising solution in similar setups. This paper tackles the specific case of Domain-Adaptive Pre-Training (DAPT), a key step in the application of FMs. We conduct the first comprehensive empirical study to evaluate the performance of Federated Domain-Adaptive Pre-Training (FDAPT). We demonstrate that FDAPT can maintain competitive downstream task performance to the centralized baseline in both IID and non-IID situations. Finally, we propose a novel algorithm, Frozen Federated Domain-Adaptive Pre-Training (FFDAPT). FFDAPT improves the computational efficiency by 12.1% on average and exhibits similar downstream task performance to vanilla FDAPT, with general performance fluctuations remaining less than 1%.

Analysis · 情感分析 · 變換 · Learning · 可辨認的 ·

2023 年 11 月 8 日

Deep Learning Brasil at ABSAPT 2022: Portuguese Transformer Ensemble Approaches

Juliana Resplande Santanna Gomes,Eduardo Augusto Santos Garcia,Adalberto Ferreira Barbosa Junior,Ruan Chaves Rodrigues,Diogo Fernandes Costa Silva,Dyonnatan Ferreira Maia,Nádia Félix Felipe da Silva,Arlindo Rodrigues Galv?o Filho,Anderson da Silva Soares

from arxiv, 11 pages, 3 figures, In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2022), Online. CEUR. org

Aspect-based Sentiment Analysis (ABSA) is a task whose objective is to classify the individual sentiment polarity of all entities, called aspects, in a sentence. The task is composed of two subtasks: Aspect Term Extraction (ATE), identify all aspect terms in a sentence; and Sentiment Orientation Extraction (SOE), given a sentence and its aspect terms, the task is to determine the sentiment polarity of each aspect term (positive, negative or neutral). This article presents we present our participation in Aspect-Based Sentiment Analysis in Portuguese (ABSAPT) 2022 at IberLEF 2022. We submitted the best performing systems, achieving new state-of-the-art results on both subtasks.

LIDAR · 特化 · 估計/估計量 · Extensibility · Performer ·

2023 年 11 月 8 日

BroadBEV: Collaborative LiDAR-camera Fusion for Broad-sighted Bird's Eye View Map Construction

Minsu Kim,Giseop Kim,Kyong Hwan Jin,Sunwook Choi

A recent sensor fusion in a Bird's Eye View (BEV) space has shown its utility in various tasks such as 3D detection, map segmentation, etc. However, the approach struggles with inaccurate camera BEV estimation, and a perception of distant areas due to the sparsity of LiDAR points. In this paper, we propose a broad BEV fusion (BroadBEV) that addresses the problems with a spatial synchronization approach of cross-modality. Our strategy aims to enhance camera BEV estimation for a broad-sighted perception while simultaneously improving the completion of LiDAR's sparsity in the entire BEV space. Toward that end, we devise Point-scattering that scatters LiDAR BEV distribution to camera depth distribution. The method boosts the learning of depth estimation of the camera branch and induces accurate location of dense camera features in BEV space. For an effective BEV fusion between the spatially synchronized features, we suggest ColFusion that applies self-attention weights of LiDAR and camera BEV features to each other. Our extensive experiments demonstrate that BroadBEV provides a broad-sighted BEV perception with remarkable performance gains.

點云 · LIDAR · 掩碼 · Extensibility · INFORMS ·

2023 年 11 月 8 日

PRED: Pre-training via Semantic Rendering on LiDAR Point Clouds

Hao Yang,Haiyang Wang,Di Dai,Liwei Wang

Pre-training is crucial in 3D-related fields such as autonomous driving where point cloud annotation is costly and challenging. Many recent studies on point cloud pre-training, however, have overlooked the issue of incompleteness, where only a fraction of the points are captured by LiDAR, leading to ambiguity during the training phase. On the other hand, images offer more comprehensive information and richer semantics that can bolster point cloud encoders in addressing the incompleteness issue inherent in point clouds. Yet, incorporating images into point cloud pre-training presents its own challenges due to occlusions, potentially causing misalignments between points and pixels. In this work, we propose PRED, a novel image-assisted pre-training framework for outdoor point clouds in an occlusion-aware manner. The main ingredient of our framework is a Birds-Eye-View (BEV) feature map conditioned semantic rendering, leveraging the semantics of images for supervision through neural rendering. We further enhance our model's performance by incorporating point-wise masking with a high mask ratio (95%). Extensive experiments demonstrate PRED's superiority over prior point cloud pre-training methods, providing significant improvements on various large-scale datasets for 3D perception tasks. Codes will be available at //github.com/PRED4pc/PRED.

實體對齊 · entity · Learning · 標注 · 可約的 ·

2023 年 11 月 8 日

MixTEA: Semi-supervised Entity Alignment with Mixture Teaching

Feng Xie,Xin Song,Xiang Zeng,Xuechen Zhao,Lei Tian,Bin Zhou,Yusong Tan

from arxiv, Findings of EMNLP 2023; 11 pages, 4 figures; code see //github.com/Xiefeng69/MixTEA

Semi-supervised entity alignment (EA) is a practical and challenging task because of the lack of adequate labeled mappings as training data. Most works address this problem by generating pseudo mappings for unlabeled entities. However, they either suffer from the erroneous (noisy) pseudo mappings or largely ignore the uncertainty of pseudo mappings. In this paper, we propose a novel semi-supervised EA method, termed as MixTEA, which guides the model learning with an end-to-end mixture teaching of manually labeled mappings and probabilistic pseudo mappings. We firstly train a student model using few labeled mappings as standard. More importantly, in pseudo mapping learning, we propose a bi-directional voting (BDV) strategy that fuses the alignment decisions in different directions to estimate the uncertainty via the joint matching confidence score. Meanwhile, we also design a matching diversity-based rectification (MDR) module to adjust the pseudo mapping learning, thus reducing the negative influence of noisy mappings. Extensive results on benchmark datasets as well as further analyses demonstrate the superiority and the effectiveness of our proposed method.

Medical Image Analysis · Analysis · 數據增強 · 超參數 · Extensibility ·

2023 年 11 月 7 日

MedAugment: Universal Automatic Data Augmentation Plug-in for Medical Image Analysis

Zhaoshan Liu,Qiujie Lv,Yifan Li,Ziduo Yang,Lei Shen

from arxiv, 26 pages, 10 figures

Data augmentation (DA) has been widely leveraged in the realm of computer vision to alleviate the data shortage, whereas the DA in medical image analysis (MIA) faces multiple challenges. The prevalent DA approaches in MIA encompass conventional DA, synthetic DA, and automatic DA. However, the utilization of these approaches poses various challenges such as experience-driven design and intensive computation cost. Here, we propose an efficient and effective automatic DA method termed MedAugment. We propose the pixel augmentation space and spatial augmentation space and exclude the operations that can break the details and features within medical images. Besides, we propose a novel sampling strategy by sampling a limited number of operations from the two spaces. Moreover, we present a hyperparameter mapping relationship to produce a rational augmentation level and make the MedAugment fully controllable using a single hyperparameter. These revisions address the differences between natural and medical images. Extensive experimental results on four classification and three segmentation datasets demonstrate the superiority of MedAugment. We posit that the plug-and-use and training-free MedAugment holds the potential to make a valuable contribution to the medical field, particularly benefiting medical experts lacking foundational expertise in deep learning. Code is available at //github.com/NUS-Tim/MedAugment.

UniFormer · Networking · Neural Networks · 模型評估 · Weight ·

2023 年 11 月 7 日

MINT: Multiplier-less INTeger Quantization for Energy Efficient Spiking Neural Networks

Ruokai Yin,Yuhang Li,Abhishek Moitra,Priyadarshini Panda

from arxiv, 6 pages. Accepted to 29th Asia and South Pacific Design Automation Conference (ASP-DAC 2024), nominated for best paper award

We propose Multiplier-less INTeger (MINT) quantization, a uniform quantization scheme that efficiently compresses weights and membrane potentials in spiking neural networks (SNNs). Unlike previous SNN quantization methods, MINT quantizes memory-intensive membrane potentials to an extremely low precision (2-bit), significantly reducing the memory footprint. MINT also shares the quantization scaling factor between weights and membrane potentials, eliminating the need for multipliers required in conventional uniform quantization. Experimental results show that our method matches the accuracy of full-precision models and other state-of-the-art SNN quantization techniques while surpassing them in memory footprint reduction and hardware cost efficiency at deployment. For example, 2-bit MINT VGG-16 achieves 90.6% accuracy on CIFAR-10, with roughly 93.8% reduction in memory footprint from the full-precision model and 90% reduction in computation energy compared to vanilla uniform quantization at deployment. The code is available at //github.com/Intelligent-Computing-Lab-Yale/MINT-Quantization.

Color · 點云 · Networking · 圖 · 圖注意力網絡 ·

2023 年 11 月 7 日

GQE-Net: A Graph-based Quality Enhancement Network for Point Cloud Color Attribute

Jinrui Xing,Hui Yuan,Raouf Hamzaoui,Hao Liu,Junhui Hou

from arxiv, Accepted by IEEE TIP (DOI: 10.1109/TIP.2023.3330086)

In recent years, point clouds have become increasingly popular for representing three-dimensional (3D) visual objects and scenes. To efficiently store and transmit point clouds, compression methods have been developed, but they often result in a degradation of quality. To reduce color distortion in point clouds, we propose a graph-based quality enhancement network (GQE-Net) that uses geometry information as an auxiliary input and graph convolution blocks to extract local features efficiently. Specifically, we use a parallel-serial graph attention module with a multi-head graph attention mechanism to focus on important points or features and help them fuse together. Additionally, we design a feature refinement module that takes into account the normals and geometry distance between points. To work within the limitations of GPU memory capacity, the distorted point cloud is divided into overlap-allowed 3D patches, which are sent to GQE-Net for quality enhancement. To account for differences in data distribution among different color components, three models are trained for the three color components. Experimental results show that our method achieves state-of-the-art performance. For example, when implementing GQE-Net on a recent test model of the geometry-based point cloud compression (G-PCC) standard, 0.43 dB, 0.25 dB, and 0.36 dB Bjontegaard delta (BD)-peak-signal-to-noise ratio (PSNR), corresponding to 14.0%, 9.3%, and 14.5% BD-rate savings can be achieved on dense point clouds for the Y, Cb, and Cr components, respectively. The source code of our method is available at //github.com/xjr998/GQE-Net.

簇 · 無監督 · Re-ID · MoDELS · Extensibility ·

2021 年 12 月 3 日

Mind Your Clever Neighbours: Unsupervised Person Re-identification via Adaptive Clustering Relationship Modeling

Lianjie Jia,Chenyang Yu,Xiehao Ye,Tianyu Yan,Yinjie Lei,Pingping Zhang

from arxiv, This work has been accepted by AAAI-2022. Some modifications may be performed for the final version

Unsupervised person re-identification (Re-ID) attracts increasing attention due to its potential to resolve the scalability problem of supervised Re-ID models. Most existing unsupervised methods adopt an iterative clustering mechanism, where the network was trained based on pseudo labels generated by unsupervised clustering. However, clustering errors are inevitable. To generate high-quality pseudo-labels and mitigate the impact of clustering errors, we propose a novel clustering relationship modeling framework for unsupervised person Re-ID. Specifically, before clustering, the relation between unlabeled images is explored based on a graph correlation learning (GCL) module and the refined features are then used for clustering to generate high-quality pseudo-labels.Thus, GCL adaptively mines the relationship between samples in a mini-batch to reduce the impact of abnormal clustering when training. To train the network more effectively, we further propose a selective contrastive learning (SCL) method with a selective memory bank update policy. Extensive experiments demonstrate that our method shows much better results than most state-of-the-art unsupervised methods on Market1501, DukeMTMC-reID and MSMT17 datasets. We will release the code for model reproduction.

相關系數 · AUC · 變換 · 示例 · 學成 ·

2021 年 6 月 2 日

TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classication

Zhuchen Shao,Hao Bian,Yang Chen,Yifeng Wang,Jian Zhang,Xiangyang Ji,Yongbing Zhang

Multiple instance learning (MIL) is a powerful tool to solve the weakly supervised classification in whole slide image (WSI) based pathology diagnosis. However, the current MIL methods are usually based on independent and identical distribution hypothesis, thus neglect the correlation among different instances. To address this problem, we proposed a new framework, called correlated MIL, and provided a proof for convergence. Based on this framework, we devised a Transformer based MIL (TransMIL), which explored both morphological and spatial information. The proposed TransMIL can effectively deal with unbalanced/balanced and binary/multiple classification with great visualization and interpretability. We conducted various experiments for three different computational pathology problems and achieved better performance and faster convergence compared with state-of-the-art methods. The test AUC for the binary tumor classification can be up to 93.09% over CAMELYON16 dataset. And the AUC over the cancer subtypes classification can be up to 96.03% and 98.82% over TCGA-NSCLC dataset and TCGA-RCC dataset, respectively.