国产免费一区二区三区在线能观看_一级黄色视频一区_碰碰女人公开免费视频_2020年国产精品自线在拍_日韩免费无码人妻系列_女子高潮受不了流白浆视频_手机在线视频你懂的在线

Learning an effective representation of 3D point clouds requires a good metric to measure the discrepancy between two 3D point sets, which is non-trivial due to their irregularity. Most of the previous works resort to using the Chamfer discrepancy or Earth Mover's distance, but those metrics are either ineffective in measuring the differences between point clouds or computationally expensive. In this paper, we conduct a systematic study with extensive experiments on distance metrics for 3D point clouds. From this study, we propose to use sliced Wasserstein distance and its variants for learning representations of 3D point clouds. In addition, we introduce a new algorithm to estimate sliced Wasserstein distance that guarantees that the estimated value is close enough to the true one. Experiments show that the sliced Wasserstein distance and its variants allow the neural network to learn a more efficient representation compared to the Chamfer discrepancy. We demonstrate the efficiency of the sliced Wasserstein metric and its variants on several tasks in 3D computer vision including training a point cloud autoencoder, generative modeling, transfer learning, and point cloud registration.

相關內容

點云

關注 48

根(gen)據激(ji)(ji)光(guang)測量原(yuan)理(li)得(de)(de)到的(de)(de)點(dian)(dian)云，包(bao)括(kuo)三(san)(san)維坐(zuo)標(biao)（XYZ）和激(ji)(ji)光(guang)反射(she)強度(du)(du)（Intensity）。根(gen)據攝影(ying)測量原(yuan)理(li)得(de)(de)到的(de)(de)點(dian)(dian)云，包(bao)括(kuo)三(san)(san)維坐(zuo)標(biao)（XYZ）和顏色信(xin)息（RGB）。結合(he)激(ji)(ji)光(guang)測量和攝影(ying)測量原(yuan)理(li)得(de)(de)到點(dian)(dian)云，包(bao)括(kuo)三(san)(san)維坐(zuo)標(biao)（XYZ）、激(ji)(ji)光(guang)反射(she)強度(du)(du)（Intensity）和顏色信(xin)息（RGB）。在獲(huo)取物體表(biao)面每(mei)個采樣點(dian)(dian)的(de)(de)空間坐(zuo)標(biao)后，得(de)(de)到的(de)(de)是一個點(dian)(dian)的(de)(de)集合(he)，稱之為“點(dian)(dian)云”(Point Cloud)

Processing（編程語言） · 點云 · 3D · 知識表示 · DNN ·

2021 年 11 月 5 日

Interpreting Representation Quality of DNNs for 3D Point Cloud Processing

Wen Shen,Qihan Ren,Dongrui Liu,Quanshi Zhang

In this paper, we evaluate the quality of knowledge representations encoded in deep neural networks (DNNs) for 3D point cloud processing. We propose a method to disentangle the overall model vulnerability into the sensitivity to the rotation, the translation, the scale, and local 3D structures. Besides, we also propose metrics to evaluate the spatial smoothness of encoding 3D structures, and the representation complexity of the DNN. Based on such analysis, experiments expose representation problems with classic DNNs, and explain the utility of the adversarial training.

Extensibility · 多任務學習 · 學成 · 優化器 · Networking ·

2020 年 9 月 16 日

Multi-Task Learning for Dense Prediction Tasks: A Survey

Simon Vandenhende,Stamatios Georgoulis,Wouter Van Gansbeke,Marc Proesmans,Dengxin Dai,Luc Van Gool

from arxiv, Code is available: //github.com/SimonVandenhende/Multi-Task-Learning-PyTorch

With the advent of deep learning, many dense prediction tasks, i.e. tasks that produce pixel-level predictions, have seen significant performance improvements. The typical approach is to learn these tasks in isolation, that is, a separate neural network is trained for each individual task. Yet, recent multi-task learning (MTL) techniques have shown promising results w.r.t. performance, computations and/or memory footprint, by jointly tackling multiple tasks through a learned shared representation. In this survey, we provide a well-rounded view on state-of-the-art deep learning approaches for MTL in computer vision, explicitly emphasizing on dense prediction tasks. Our contributions concern the following. First, we consider MTL from a network architecture point-of-view. We include an extensive overview and discuss the advantages/disadvantages of recent popular MTL models. Second, we examine various optimization methods to tackle the joint learning of multiple tasks. We summarize the qualitative elements of these works and explore their commonalities and differences. Finally, we provide an extensive experimental evaluation across a variety of dense prediction benchmarks to examine the pros and cons of the different methods, including both architectural and optimization based strategies.

3D · 學成 · 卷積神經網絡 · Machine Learning · Medical Image Analysis ·

2020 年 4 月 1 日

3D Deep Learning on Medical Images: A Review

Satya P. Singh,Lipo Wang,Sukrit Gupta,Haveesh Goli,Parasuraman Padmanabhan,Balázs Gulyás

from arxiv, 13 pages, 4 figures, 2 tables, submitted to IEEE Access

The rapid advancements in machine learning, graphics processing technologies and availability of medical imaging data has led to a rapid increase in use of machine learning models in the medical domain. This was exacerbated by the rapid advancements in convolutional neural network (CNN) based architectures, which were adopted by the medical imaging community to assist clinicians in disease diagnosis. Since the grand success of AlexNet in 2012, CNNs have been increasingly used in medical image analysis to improve the efficiency of human clinicians. In recent years, three-dimensional (3D) CNNs have been employed for analysis of medical images. In this paper, we trace the history of how the 3D CNN was developed from its machine learning roots, brief mathematical description of 3D CNN and the preprocessing steps required for medical images before feeding them to 3D CNNs. We review the significant research in the field of 3D medical imaging analysis using 3D CNNs (and its variants) in different medical areas such as classification, segmentation, detection, and localization. We conclude by discussing the challenges associated with the use of 3D CNNs in the medical imaging domain (and the use of deep learning models, in general) and possible future trends in the field.

點云 · 學成 · state-of-the-art · 深度學習 · 3D ·

2020 年 1 月 17 日

Review: deep learning on 3D point clouds

Saifullahi Aminu Bello,Shangshu Yu,Cheng Wang

Point cloud is point sets defined in 3D metric space. Point cloud has become one of the most significant data format for 3D representation. Its gaining increased popularity as a result of increased availability of acquisition devices, such as LiDAR, as well as increased application in areas such as robotics, autonomous driving, augmented and virtual reality. Deep learning is now the most powerful tool for data processing in computer vision, becoming the most preferred technique for tasks such as classification, segmentation, and detection. While deep learning techniques are mainly applied to data with a structured grid, point cloud, on the other hand, is unstructured. The unstructuredness of point clouds makes use of deep learning for its processing directly very challenging. Earlier approaches overcome this challenge by preprocessing the point cloud into a structured grid format at the cost of increased computational cost or lost of depth information. Recently, however, many state-of-the-arts deep learning techniques that directly operate on point cloud are being developed. This paper contains a survey of the recent state-of-the-art deep learning techniques that mainly focused on point cloud data. We first briefly discussed the major challenges faced when using deep learning directly on point cloud, we also briefly discussed earlier approaches which overcome the challenges by preprocessing the point cloud into a structured grid. We then give the review of the various state-of-the-art deep learning approaches that directly process point cloud in its unstructured form. We introduced the popular 3D point cloud benchmark datasets. And we also further discussed the application of deep learning in popular 3D vision tasks including classification, segmentation and detection.

點云 · 學成 · 3D · 深度學習 · Vision ·

2019 年 12 月 27 日

Deep Learning for 3D Point Clouds: A Survey

Yulan Guo,Hanyun Wang,Qingyong Hu,Hao Liu,Li Liu,Mohammed Bennamoun

from arxiv, //github.com/QingyongHu/SoTA-Point-Cloud

Point cloud learning has lately attracted increasing attention due to its wide applications in many areas, such as computer vision, autonomous driving, and robotics. As a dominating technique in AI, deep learning has been successfully used to solve various 2D vision problems. However, deep learning on point clouds is still in its infancy due to the unique challenges faced by the processing of point clouds with deep neural networks. Recently, deep learning on point clouds has become even thriving, with numerous methods being proposed to address different problems in this area. To stimulate future research, this paper presents a comprehensive review of recent progress in deep learning methods for point clouds. It covers three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation. It also presents comparative results on several publicly available datasets, together with insightful observations and inspiring future research directions.

Extensibility · 點云 · 隨機采樣 · 樣本 · state-of-the-art ·

2019 年 11 月 25 日

RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds

Qingyong Hu,Bo Yang,Linhai Xie,Stefano Rosa,Yulan Guo,Zhihua Wang,Niki Trigoni,Andrew Markham

from arxiv, Code and data are available at: //github.com/QingyongHu/RandLA-Net

We study the problem of efficient semantic segmentation for large-scale 3D point clouds. By relying on expensive sampling techniques or computationally heavy pre/post-processing steps, most existing approaches are only able to be trained and operate over small-scale point clouds. In this paper, we introduce RandLA-Net, an efficient and lightweight neural architecture to directly infer per-point semantics for large-scale point clouds. The key to our approach is to use random point sampling instead of more complex point selection approaches. Although remarkably computation and memory efficient, random sampling can discard key features by chance. To overcome this, we introduce a novel local feature aggregation module to progressively increase the receptive field for each 3D point, thereby effectively preserving geometric details. Extensive experiments show that our RandLA-Net can process 1 million points in a single pass with up to 200X faster than existing approaches. Moreover, our RandLA-Net clearly surpasses state-of-the-art approaches for semantic segmentation on two large-scale benchmarks Semantic3D and SemanticKITTI.

點云 · 圖卷積 · Performer · 生成式對抗網絡 · state-of-the-art ·

2019 年 5 月 16 日

3D Point Cloud Generative Adversarial Network Based on Tree Structured Graph Convolutions

Dong Wook Shu,Sung Woo Park,Junseok Kwon

from arxiv, 10 pages

In this paper, we propose a novel generative adversarial network (GAN) for 3D point clouds generation, which is called tree-GAN. To achieve state-of-the-art performance for multi-class 3D point cloud generation, a tree-structured graph convolution network (TreeGCN) is introduced as a generator for tree-GAN. Because TreeGCN performs graph convolutions within a tree, it can use ancestor information to boost the representation power for features. To evaluate GANs for 3D point clouds accurately, we develop a novel evaluation metric called Frechet point cloud distance (FPD). Experimental results demonstrate that the proposed tree-GAN outperforms state-of-the-art GANs in terms of both conventional metrics and FPD, and can generate point clouds for different semantic parts without prior knowledge.

PyTorch · 學成 · 圖 · 高數據吞吐量 · 表示學習 ·

2019 年 3 月 7 日

Fast Graph Representation Learning with PyTorch Geometric

Matthias Fey,Jan Eric Lenssen

We introduce PyTorch Geometric, a library for deep learning on irregularly structured input data such as graphs, point clouds and manifolds, built upon PyTorch. In addition to general graph data structures and processing methods, it contains a variety of recently published methods from the domains of relational learning and 3D data processing. PyTorch Geometric achieves high data throughput by leveraging sparse GPU acceleration, by providing dedicated CUDA kernels and by introducing efficient mini-batch handling for input examples of different size. In this work, we present the library in detail and perform a comprehensive comparative study of the implemented methods in homogeneous evaluation scenarios.

三維重建 · 3D · Networks · Networking · Neural Networks ·

2018 年 12 月 10 日

Occupancy Networks: Learning 3D Reconstruction in Function Space

Lars Mescheder,Michael Oechsle,Michael Niemeyer,Sebastian Nowozin,Andreas Geiger

With the advent of deep neural networks, learning-based approaches for 3D reconstruction have gained popularity. However, unlike for images, in 3D there is no canonical representation which is both computationally and memory efficient yet allows for representing high-resolution geometry of arbitrary topology. Many of the state-of-the-art learning-based 3D reconstruction approaches can hence only represent very coarse 3D geometry or are limited to a restricted domain. In this paper, we propose occupancy networks, a new representation for learning-based 3D reconstruction methods. Occupancy networks implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier. In contrast to existing approaches, our representation encodes a description of the 3D output at infinite resolution without excessive memory footprint. We validate that our representation can efficiently encode 3D structure and can be inferred from various kinds of input. Our experiments demonstrate competitive results, both qualitatively and quantitatively, for the challenging tasks of 3D reconstruction from single images, noisy point clouds and coarse discrete voxel grids. We believe that occupancy networks will become a useful tool in a wide variety of learning-based 3D tasks.

子空間 · 圖像分割 · 稀疏 · 估計/估計量 · 優化器 ·

2018 年 4 月 6 日

Image Segmentation Using Subspace Representation and Sparse Decomposition

Shervin Minaee

from arxiv, PhD Dissertation, NYU, 2018

Image foreground extraction is a classical problem in image processing and vision, with a large range of applications. In this dissertation, we focus on the extraction of text and graphics in mixed-content images, and design novel approaches for various aspects of this problem. We first propose a sparse decomposition framework, which models the background by a subspace containing smooth basis vectors, and foreground as a sparse and connected component. We then formulate an optimization framework to solve this problem, by adding suitable regularizations to the cost function to promote the desired characteristics of each component. We present two techniques to solve the proposed optimization problem, one based on alternating direction method of multipliers (ADMM), and the other one based on robust regression. Promising results are obtained for screen content image segmentation using the proposed algorithm. We then propose a robust subspace learning algorithm for the representation of the background component using training images that could contain both background and foreground components, as well as noise. With the learnt subspace for the background, we can further improve the segmentation results, compared to using a fixed subspace. Lastly, we investigate a different class of signal/image decomposition problem, where only one signal component is active at each signal element. In this case, besides estimating each component, we need to find their supports, which can be specified by a binary mask. We propose a mixed-integer programming problem, that jointly estimates the two components and their supports through an alternating optimization scheme. We show the application of this algorithm on various problems, including image segmentation, video motion segmentation, and also separation of text from textured images.