
Modern optical satellite sensors enable high-resolution stereo reconstruction from space. But the challenging imaging conditions when observing the Earth from space push stereo matching to its limits. In practice, the resulting digital surface models (DSMs) are fairly noisy and often do not attain the accuracy needed for high-resolution applications such as 3D city modeling. Arguably, stereo correspondence based on low-level image similarity is insufficient and should be complemented with a priori knowledge about the expected surface geometry beyond basic local smoothness. To that end, we introduce ResDepth, a convolutional neural network that learns such an expressive geometric prior from example data. ResDepth refines an initial, raw stereo DSM while conditioning the refinement on the images. That is, it acts as a smart, learned post-processing filter and can seamlessly complement any stereo matching pipeline. In a series of experiments, we find that the proposed method consistently improves stereo DSMs both quantitatively and qualitatively. We show that the prior encoded in the network weights captures meaningful geometric characteristics of urban design, which also generalize across different districts and even from one city to another. Moreover, we demonstrate that, by training on a variety of stereo pairs, ResDepth can acquire a sufficient degree of invariance against variations in imaging conditions and acquisition geometry.
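
The residual refinement idea can be made concrete with a short sketch. The following is a minimal, hypothetical PyTorch stand-in, not the paper's exact architecture (ResDepth's real network, layer widths, and number of image channels differ): it stacks the raw DSM with ortho-rectified image views and predicts an additive height correction.

```python
import torch
import torch.nn as nn

class ResDepthSketch(nn.Module):
    """Residual DSM refinement: predict a height correction conditioned on images.

    Layer sizes are illustrative; a minimal encoder-decoder stands in for
    the actual refinement network.
    """
    def __init__(self, n_images=2):
        super().__init__()
        in_ch = 1 + n_images  # raw DSM + ortho-rectified image views
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, dsm, images):
        x = torch.cat([dsm, images], dim=1)
        residual = self.decoder(self.encoder(x))
        return dsm + residual  # refined DSM = initial DSM + learned correction

# usage: refine a toy 64x64 DSM conditioned on two image views
refined = ResDepthSketch()(torch.randn(1, 1, 64, 64), torch.randn(1, 2, 64, 64))
```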

Related content

In computer vision, 3D reconstruction refers to the process of recovering three-dimensional information from single-view or multi-view images. Since a single view provides incomplete information, 3D reconstruction from it must rely on prior knowledge. Multi-view 3D reconstruction (analogous to human binocular perception) is comparatively easier: the cameras are first calibrated, i.e., the relationship between each camera's image coordinate system and the world coordinate system is computed, and 3D information is then reconstructed from the information in multiple 2D images. 3D object reconstruction is a shared scientific problem and core technology in computer-aided geometric design (CAGD), computer graphics (CG), computer animation, computer vision, medical image processing, scientific computing, virtual reality, digital media creation, and other fields. There are two main approaches to generating a 3D representation of an object in a computer. One is to use geometric modeling software to interactively build a human-controlled 3D geometric model; the other is to acquire the geometric shape of a real object by some measurement process. The former is technologically mature and supported by a number of software packages, such as 3DMAX, Maya, AutoCAD, and UG, which generally represent geometric shapes with curves and surfaces defined by mathematical expressions. The latter is generally called 3D reconstruction: the mathematical process and computational techniques of recovering an object's 3D information (shape, etc.) from 2D projections, including steps such as data acquisition, preprocessing, point cloud registration, and feature analysis.
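
The calibration-then-triangulation recipe described above can be illustrated with a small NumPy example: given two calibrated projection matrices (the image-to-world relationship computed during calibration), a 3D point is recovered from its two projections by linear (DLT) triangulation. The intrinsics and baseline below are made-up values for illustration.

```python
import numpy as np

# Hypothetical calibrated cameras: K maps camera to pixel coordinates,
# [R | t] maps world to camera coordinates (the calibration step in the text).
K = np.array([[800., 0., 320.], [0., 800., 240.], [0., 0., 1.]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])                 # first view at the origin
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.], [0.]])])   # second view, 0.5 m baseline

def project(P, X):
    """Project a homogeneous 3D point X (4,) to pixel coordinates."""
    x = P @ X
    return x[:2] / x[2]

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation from two calibrated views."""
    A = np.stack([x1[0] * P1[2] - P1[0],
                  x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0],
                  x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)          # null space of A holds the solution
    X = Vt[-1]
    return X[:3] / X[3]                  # dehomogenize

X_true = np.array([0.2, -0.1, 4.0, 1.0])  # homogeneous world point
X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
print(np.allclose(X_est, X_true[:3]))     # True: exact in the noise-free case
```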

In this paper, we focus on the two tasks of 3D shape abstraction and semantic analysis. This is in contrast to current methods, which focus solely on either 3D shape abstraction or semantic analysis. In addition, previous methods have had difficulty producing instance-level semantic results, which has limited their application. We present a novel method for the joint estimation of 3D shape abstraction and semantic analysis. Our approach first generates a number of 3D semantic candidate regions for a 3D shape; we then employ these candidates to directly predict the semantic categories and refine the parameters of the candidate regions simultaneously using a deep convolutional neural network. Finally, we design an algorithm to fuse the predicted results and obtain the final semantic abstraction, which is shown to be an improvement over standard non-maximum suppression. Experimental results demonstrate that our approach can produce state-of-the-art results. Moreover, we find that our results can be easily applied to instance-level semantic part segmentation and shape matching.
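
For reference, the standard non-maximum suppression baseline that the learned fusion is compared against looks roughly like the following sketch for axis-aligned 3D candidate regions; the box encoding and threshold are assumptions, and the paper's actual fusion algorithm is not reproduced here.

```python
import numpy as np

def iou_3d(a, b):
    """IoU of two axis-aligned 3D boxes (xmin, ymin, zmin, xmax, ymax, zmax)."""
    lo = np.maximum(a[:3], b[:3])
    hi = np.minimum(a[3:], b[3:])
    inter = np.prod(np.clip(hi - lo, 0.0, None))
    vol = lambda box: np.prod(box[3:] - box[:3])
    return inter / (vol(a) + vol(b) - inter)

def nms_3d(boxes, scores, thresh=0.5):
    """Greedy, score-ordered suppression; returns indices of kept candidates."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # drop every remaining candidate that overlaps the kept one too much
        mask = np.array([iou_3d(boxes[i], boxes[j]) < thresh for j in rest], dtype=bool)
        order = rest[mask]
    return keep
```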

Accurate rail localization is a crucial part of railway driver-assistance systems for safety monitoring. LiDAR can obtain point clouds that carry 3D information about the railway environment, even in darkness and severe weather conditions. In this paper, a real-time rail recognition method based on 3D point clouds is proposed to address the challenges posed by the point clouds, such as their lack of order, uneven density, and large volume. A voxel down-sampling method is first presented to balance the density of the railway point clouds, and a pyramid partition is designed to divide the 3D scanning area into voxels of different volumes. Then, a feature encoding module is developed to find the nearest neighbor points and to aggregate their local geometric features for the center point. Finally, a multi-scale neural network is proposed to generate the prediction results of each voxel and the rail location. The experiments are conducted on 9 sequences of 3D point cloud data for the railway. The results show that the method performs well in detecting straight, curved, and other topologically complex rails.
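
A minimal sketch of the voxel down-sampling step, assuming a fixed voxel size: the pyramid partition in the paper instead varies the voxel volume across the scanning area, which is omitted here.

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """Keep one centroid per occupied voxel to balance point density.

    points: (N, 3) array; voxel_size: scalar edge length. A fixed size
    stands in for the paper's pyramid partition.
    """
    idx = np.floor(points / voxel_size).astype(np.int64)
    _, inverse = np.unique(idx, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    counts = np.bincount(inverse).astype(float)
    centroids = np.empty((counts.size, 3))
    for d in range(3):  # per-voxel mean of each coordinate
        centroids[:, d] = np.bincount(inverse, weights=points[:, d]) / counts
    return centroids

cloud = np.random.rand(100000, 3) * 50.0           # synthetic 50 m scene
print(voxel_downsample(cloud, voxel_size=1.0).shape)
```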

We present MultiBodySync, a novel, end-to-end trainable multi-body motion segmentation and rigid registration framework for multiple input 3D point clouds. The two non-trivial challenges posed by this multi-scan, multi-body setting that we investigate are: (i) guaranteeing correspondence and segmentation consistency across multiple input point clouds capturing different spatial arrangements of bodies or body parts; and (ii) obtaining robust motion-based rigid body segmentation applicable to novel object categories. We propose an approach to address these issues that incorporates spectral synchronization into an iterative deep declarative network, so as to simultaneously recover consistent correspondences as well as motion segmentation. At the same time, by explicitly disentangling the correspondence and motion segmentation estimation modules, we achieve strong generalizability across different object categories. Our extensive evaluations demonstrate that our method is effective on various datasets ranging from rigid parts in articulated objects to individually moving objects in a 3D scene, whether single-view or full point clouds.
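
The spectral synchronization step can be sketched as follows, under the usual assumptions that the pairwise maps satisfy P[i][i] = I and P[j][i] = P[i][j].T so the block matrix is symmetric. This shows only the eigenvector step, not the paper's iterative deep declarative network, and the recovered maps are defined only up to an orthogonal ambiguity.

```python
import numpy as np

def spectral_sync(pairwise, n):
    """Synchronize noisy pairwise correspondences across K scans.

    pairwise: K x K nested list of (n, n) soft correspondence matrices,
    assumed symmetric as a block matrix. Returns one (n, n) map per scan
    into a shared 'universe' frame, read off the top-n eigenvectors.
    """
    K = len(pairwise)
    W = np.block([[pairwise[i][j] for j in range(K)] for i in range(K)])
    _, vecs = np.linalg.eigh(W)
    U = vecs[:, -n:]                       # top-n eigenvectors of the block matrix
    # each scan's n-row block of U maps its points to the shared universe
    return [U[i * n:(i + 1) * n] for i in range(K)]
```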

We introduce the Neural State Machine, seeking to bridge the gap between the neural and symbolic views of AI and integrate their complementary strengths for the task of visual reasoning. Given an image, we first predict a probabilistic graph that represents its underlying semantics and serves as a structured world model. Then, we perform sequential reasoning over the graph, iteratively traversing its nodes to answer a given question or draw a new inference. In contrast to most neural architectures that are designed to closely interact with the raw sensory data, our model operates instead in an abstract latent space, by transforming both the visual and linguistic modalities into semantic concept-based representations, thereby achieving enhanced transparency and modularity. We evaluate our model on VQA-CP and GQA, two recent VQA datasets that involve compositionality, multi-step inference and diverse reasoning skills, achieving state-of-the-art results in both cases. We provide further experiments that illustrate the model's strong generalization capacity across multiple dimensions, including novel compositions of concepts, changes in the answer distribution, and unseen linguistic structures, demonstrating the qualities and efficacy of our approach.
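
A toy sketch of sequential reasoning over a probabilistic graph: attention over nodes is shifted along weighted edges, gated by a per-step relevance signal. The update rule, shapes, and the name `soft_traverse` are illustrative assumptions, not the model's actual equations.

```python
import numpy as np

def soft_traverse(p0, adjacency, relevance_per_step):
    """Shift a probability distribution over graph nodes along weighted edges.

    p0: (n,) initial attention over nodes; adjacency: (n, n) weights with
    adjacency[i, j] scoring the transition i -> j; relevance_per_step:
    iterable of (n,) vectors gating each node's relevance per reasoning step.
    """
    p = np.asarray(p0, dtype=float)
    for r in relevance_per_step:
        p = adjacency.T @ (p * r)     # move mass along edges, gated by relevance
        p = p / p.sum()               # renormalize to a distribution
    return p
```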

We present a unified framework tackling two problems: class-specific 3D reconstruction from a single image, and generation of new 3D shape samples. These tasks have received considerable attention recently; however, existing approaches rely on 3D supervision, annotation of 2D images with keypoints or poses, and/or training with multiple views of each object instance. Our framework is very general: it can be trained in similar settings to these existing approaches, while also supporting weaker supervision scenarios. Importantly, it can be trained purely from 2D images, without ground-truth pose annotations, and with a single view per instance. We employ meshes as an output representation, instead of voxels used in most prior work. This allows us to exploit shading information during training, which previous 2D-supervised methods cannot. Thus, our method can learn to generate and reconstruct concave object classes. We evaluate our approach on synthetic data in various settings, showing that (i) it learns to disentangle shape from pose; (ii) using shading in the loss improves performance; (iii) our model is comparable or superior to state-of-the-art voxel-based approaches on quantitative metrics, while producing results that are visually more pleasing; (iv) it still performs well when given supervision weaker than in prior works.
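
The shading cue mentioned above can be illustrated with a simple Lambertian rendering term; this is a generic sketch under a single-directional-light assumption, not the paper's actual loss.

```python
import torch

def lambertian_shading_loss(normals, albedo, light_dir, observed):
    """Compare a Lambertian rendering against the observed image.

    normals: (..., 3) per-pixel surface normals; albedo and observed are
    broadcast-compatible intensity maps; light_dir: (3,) directional light.
    """
    light = light_dir / light_dir.norm()
    shading = torch.clamp((normals * light).sum(dim=-1), min=0.0)  # max(0, n.l)
    rendered = albedo * shading
    return ((rendered - observed) ** 2).mean()
```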

A variety of deep neural networks have been applied to medical image segmentation and achieve good performance. Unlike natural images, medical images of the same imaging modality are characterized by the same pattern, which indicates that the same normal organs or tissues are located at similar positions in the images. Thus, in this paper we try to incorporate the prior knowledge of medical images into the structure of neural networks such that the prior knowledge can be utilized for accurate segmentation. Based on this idea, we propose a novel deep network called knowledge-based fully convolutional network (KFCN) for medical image segmentation. The segmentation function and the corresponding error are analyzed. We show the existence of an asymptotically stable region for KFCN which a traditional FCN does not possess. Experiments validate our knowledge assumption about the incorporation of prior knowledge into the convolution kernels of KFCN and show that KFCN can achieve a reasonable segmentation and a satisfactory accuracy.
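
One simple way to expose the positional regularity described above to a convolutional network is to append normalized coordinate channels to the input, in the spirit of CoordConv; this is an illustration of the position-prior idea only, not KFCN's actual mechanism of embedding knowledge in the convolution kernels.

```python
import torch

def append_coord_channels(x):
    """Append normalized (row, col) coordinate channels to an image batch.

    x: (B, C, H, W) tensor; returns (B, C + 2, H, W). Lets a segmentation
    network condition on absolute position, exploiting the fact that the
    same organs appear at similar image locations.
    """
    b, _, h, w = x.shape
    ys = torch.linspace(-1, 1, h).view(1, 1, h, 1).expand(b, 1, h, w)
    xs = torch.linspace(-1, 1, w).view(1, 1, 1, w).expand(b, 1, h, w)
    return torch.cat([x, ys, xs], dim=1)
```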

Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of several anatomical structures (ranging from the large organs to thin vessels) can achieve competitive segmentation results, while avoiding the need for handcrafting features or training class-specific models. To this end, we propose a two-stage, coarse-to-fine approach that first uses a 3D FCN to roughly define a candidate region, which is then used as input to a second 3D FCN. This reduces the number of voxels the second FCN has to classify to ~10% and allows it to focus on more detailed segmentation of the organs and vessels. We utilize training and validation sets consisting of 331 clinical CT images and test our models on a completely unseen data collection acquired at a different hospital that includes 150 CT scans, targeting three anatomical organs (liver, spleen, and pancreas). In challenging organs such as the pancreas, our cascaded approach improves the mean Dice score from 68.5% to 82.2%, achieving the highest reported average score on this dataset. We compare with a 2D FCN method on a separate dataset of 240 CT scans with 18 classes and achieve a significantly higher performance in small organs and vessels. Furthermore, we explore fine-tuning our models to different datasets. Our experiments illustrate the promise and robustness of current 3D FCN based semantic segmentation of medical images, achieving state-of-the-art results. Our code and trained models are available for download: https://github.com/holgerroth/3Dunet_abdomen_cascade.
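
A minimal sketch of the coarse-to-fine cascade: the first network's foreground defines a padded bounding box, and only that crop is passed to the second network. Batch size 1, a background label of 0, a non-empty foreground, and the margin are all assumptions.

```python
import torch

def coarse_to_fine(volume, coarse_net, fine_net, margin=8):
    """Stage 1 localizes foreground; stage 2 segments only the padded crop.

    volume: (1, C, D, H, W); coarse_net/fine_net map volumes to per-voxel
    class scores. The crop offset is returned for pasting predictions back.
    """
    with torch.no_grad():
        labels = coarse_net(volume).argmax(dim=1)[0]   # (D, H, W) rough labels
    fg = labels.nonzero()                              # candidate voxel coords
    lo = (fg.min(dim=0).values - margin).clamp(min=0)
    hi = fg.max(dim=0).values + margin
    crop = volume[:, :, lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
    return fine_net(crop), (lo, hi)
```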

In this letter, we propose a pseudo-siamese convolutional neural network (CNN) architecture that enables solving the task of identifying corresponding patches in very-high-resolution (VHR) optical and synthetic aperture radar (SAR) remote sensing imagery. Using eight convolutional layers each in two parallel network streams, a fully connected layer for the fusion of the features learned in each stream, and a loss function based on binary cross-entropy, we obtain a binary indication of whether two patches correspond. The network is trained and tested on an automatically generated dataset that is based on a deterministic alignment of SAR and optical imagery via previously reconstructed and subsequently co-registered 3D point clouds. The satellite images, from which the patches comprising our dataset are extracted, show a complex urban scene containing many elevated objects (i.e., buildings), thus providing one of the most difficult experimental environments. The achieved results show that the network is able to predict corresponding patches with high accuracy, thus indicating great potential for further development towards a generalized multi-sensor key-point matching procedure.
Index Terms: synthetic aperture radar (SAR), optical imagery, data fusion, deep learning, convolutional neural networks (CNN), image matching, deep matching
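
A sketch of the pseudo-siamese layout in PyTorch: two non-weight-sharing streams of eight convolutional layers, feature fusion through a fully connected layer, and a binary cross-entropy loss, as the letter describes. The channel widths, strides, pooling, and patch sizes are assumptions.

```python
import torch
import torch.nn as nn

def stream(in_ch):
    """One of the two non-weight-sharing streams (eight conv layers)."""
    layers, ch = [], in_ch
    for i, out_ch in enumerate([32, 32, 64, 64, 128, 128, 128, 128]):
        stride = 2 if i % 2 == 1 else 1   # downsample every second layer (assumed)
        layers += [nn.Conv2d(ch, out_ch, 3, stride=stride, padding=1), nn.ReLU()]
        ch = out_ch
    return nn.Sequential(*layers, nn.AdaptiveAvgPool2d(1), nn.Flatten())

class PseudoSiamese(nn.Module):
    """Parallel optical and SAR streams, fused by a fully connected layer."""
    def __init__(self):
        super().__init__()
        self.opt_stream = stream(3)   # optical RGB patch
        self.sar_stream = stream(1)   # single-band SAR patch
        self.fuse = nn.Linear(256, 1)

    def forward(self, optical, sar):
        f = torch.cat([self.opt_stream(optical), self.sar_stream(sar)], dim=1)
        return self.fuse(f)           # logit: patches match or not

net = PseudoSiamese()
logit = net(torch.randn(2, 3, 64, 64), torch.randn(2, 1, 64, 64))
loss = nn.BCEWithLogitsLoss()(logit.squeeze(1), torch.tensor([1.0, 0.0]))
```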

The limited capture range of optimization-based 2D/3D image registration methods, and their need for high-quality initialization, can significantly degrade the performance of 3D image reconstruction and motion compensation pipelines. Challenging clinical imaging scenarios that contain significant subject motion, such as fetal in-utero imaging, complicate the 3D image and volume reconstruction process. In this paper we present a learning-based image registration method capable of predicting 3D rigid transformations of arbitrarily oriented 2D image slices, with respect to a learned canonical atlas co-ordinate system. Only image slice intensity information is used to perform registration and canonical alignment; no spatial transform initialization is required. To find image transformations we utilize a Convolutional Neural Network (CNN) architecture to learn the regression function capable of mapping 2D image slices to a 3D canonical atlas space. We extensively evaluate the effectiveness of our approach quantitatively on simulated Magnetic Resonance Imaging (MRI) fetal brain imagery with synthetic motion and further demonstrate qualitative results on real fetal MRI data where our method is integrated into a full reconstruction and motion compensation pipeline. Our learning-based registration achieves an average spatial prediction error of 7 mm on simulated data and produces qualitatively improved reconstructions for heavily moving fetuses with gestational ages of approximately 20 weeks. Our model provides a general and computationally efficient solution to the 2D/3D registration initialization problem and is suitable for real-time scenarios.
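
The regression function described above can be sketched as a CNN that maps a single 2D slice to six rigid-transform parameters; layer widths and the Euler-angle-plus-translation parameterization are assumptions.

```python
import torch
import torch.nn as nn

class SliceToAtlas(nn.Module):
    """Regress a 3D rigid transform (3 rotation + 3 translation parameters)
    for a 2D slice relative to a canonical atlas space."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.pose = nn.Linear(64, 6)   # (rx, ry, rz, tx, ty, tz)

    def forward(self, slice_2d):
        return self.pose(self.features(slice_2d))

pose = SliceToAtlas()(torch.randn(4, 1, 96, 96))   # (4, 6) rigid parameters
```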

Purpose: MR image reconstruction exploits regularization to compensate for missing k-space data. In this work, we propose to learn the probability distribution of MR image patches with neural networks and use this distribution as prior information constraining images during reconstruction, effectively employing it as regularization. Methods: We use variational autoencoders (VAE) to learn the distribution of MR image patches, which models the high-dimensional distribution by a latent parameter model of lower dimensions in a non-linear fashion. The proposed algorithm uses the learned prior in a Maximum-A-Posteriori estimation formulation. We evaluate the proposed reconstruction method with T1-weighted images and also apply our method on images with white matter lesions. Results: Visual evaluation of the samples showed that the VAE algorithm can approximate the distribution of MR patches well. The proposed reconstruction algorithm using the VAE prior produced high quality reconstructions. The algorithm achieved normalized RMSE, CNR, and CN values of 2.77%, 0.43, 0.11; 4.29%, 0.43, 0.11; 6.36%, 0.47, 0.11; and 10.00%, 0.42, 0.10 for undersampling ratios of 2, 3, 4, and 5, respectively, where it outperformed most of the alternative methods. In the experiments on images with white matter lesions, the method faithfully reconstructed the lesions. Conclusion: We introduced a novel method for MR reconstruction, which takes a new perspective on regularization by using priors learned by neural networks. Results suggest the method compares favorably against the other evaluated methods and can reconstruct lesions as well.
Keywords: Reconstruction, MRI, prior probability, MAP estimation, machine learning, variational inference, deep learning
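
The reconstruction can be sketched as MAP estimation by gradient descent: a data term on the undersampled measurements plus the learned prior as regularizer. Here `A` is the forward operator (e.g., masked Fourier sampling) and `vae_neg_log_prior` is a callable approximating -log p(x) with the trained VAE's bound; both, along with the step count, step size, and weight, are assumptions.

```python
import torch

def map_reconstruct(y, A, vae_neg_log_prior, steps=200, lr=1e-2, lam=0.1):
    """MAP recovery: argmin_x ||A(x) - y||^2 + lam * (-log p(x)).

    y: measurements; A: differentiable forward operator (assumed real-valued
    here, with x and y sharing a shape after zero-filling); vae_neg_log_prior:
    scalar-valued approximation of the negative log-prior from the VAE.
    """
    x = torch.zeros_like(y, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((A(x) - y) ** 2).sum() + lam * vae_neg_log_prior(x)
        loss.backward()
        opt.step()
    return x.detach()
```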
