亚州AV无码专区在线电影-亚洲日韩免费一二区

The availability of real-time semantics greatly improves the core geometric functionality of SLAM systems, enabling numerous robotic and AR/VR applications. We present a new methodology for real-time semantic mapping from RGB-D sequences that combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping. When segmenting a new frame we perform latent feature re-projection from previous frames based on differentiable rendering. Fusing re-projected feature maps from previous frames with current-frame features greatly improves image segmentation quality, compared to a baseline that processes images independently. For 3D map processing, we propose a novel geometric quasi-planar over-segmentation method that groups 3D map elements likely to belong to the same semantic classes, relying on surface normals. We also describe a novel neural network design for lightweight semantic map post-processing. Our system achieves state-of-the-art semantic mapping quality within 2D-3D networks-based systems and matches the performance of 3D convolutional networks on three real indoor datasets, while working in real-time. Moreover, it shows better cross-sensor generalization abilities compared to 3D CNNs, enabling training and inference with different depth sensors. Code and data will be released on project page: //jingwenwang95.github.io/SeMLaPS

相關內容

Networking

關注 22

Networking：IFIP International Conferences on Networking。 Explanation：國際網絡會議。 Publisher：IFIP。 SIT：

Extensibility · 可理解性 · 泛化理論 · Facebook AI Research · Vision ·

2023 年 11 月 30 日

JPPF: Multi-task Fusion for Consistent Panoptic-Part Segmentation

Shishir Muralidhara,Sravan Kumar Jagadeesh,René Schuster,Didier Stricker

from arxiv, Accepted for Springer Nature Computer Science. arXiv admin note: substantial text overlap with arXiv:2212.07671

Part-aware panoptic segmentation is a problem of computer vision that aims to provide a semantic understanding of the scene at multiple levels of granularity. More precisely, semantic areas, object instances, and semantic parts are predicted simultaneously. In this paper, we present our Joint Panoptic Part Fusion (JPPF) that combines the three individual segmentations effectively to obtain a panoptic-part segmentation. Two aspects are of utmost importance for this: First, a unified model for the three problems is desired that allows for mutually improved and consistent representation learning. Second, balancing the combination so that it gives equal importance to all individual results during fusion. Our proposed JPPF is parameter-free and dynamically balances its input. The method is evaluated and compared on the Cityscapes Panoptic Parts (CPP) and Pascal Panoptic Parts (PPP) datasets in terms of PartPQ and Part-Whole Quality (PWQ). In extensive experiments, we verify the importance of our fair fusion, highlight its most significant impact for areas that can be further segmented into parts, and demonstrate the generalization capabilities of our design without fine-tuning on 5 additional datasets.

線性的 · Learning · JAX · 查準率/準確率 · 模型構建 ·

2023 年 11 月 29 日

CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra

Andres Potapczynski,Marc Finzi,Geoff Pleiss,Andrew Gordon Wilson

from arxiv, Code available at //github.com/wilson-labs/cola. NeurIPS 2023

Many areas of machine learning and science involve large linear algebra problems, such as eigendecompositions, solving linear systems, computing matrix exponentials, and trace estimation. The matrices involved often have Kronecker, convolutional, block diagonal, sum, or product structure. In this paper, we propose a simple but general framework for large-scale linear algebra problems in machine learning, named CoLA (Compositional Linear Algebra). By combining a linear operator abstraction with compositional dispatch rules, CoLA automatically constructs memory and runtime efficient numerical algorithms. Moreover, CoLA provides memory efficient automatic differentiation, low precision computation, and GPU acceleration in both JAX and PyTorch, while also accommodating new objects, operations, and rules in downstream packages via multiple dispatch. CoLA can accelerate many algebraic operations, while making it easy to prototype matrix structures and algorithms, providing an appealing drop-in tool for virtually any computational effort that requires linear algebra. We showcase its efficacy across a broad range of applications, including partial differential equations, Gaussian processes, equivariant model construction, and unsupervised learning.

邊緣計算 · 邊 · Processing（編程語言） · Learning · 可約的 ·

2023 年 11 月 29 日

The AutoSPADA Platform: User-Friendly Edge Computing for Distributed Learning and Data Analytics in Connected Vehicles

Adrian Nilsson,Simon Smith,Jonas Hagmar,Magnus ?nnheim,Mats Jirstrand

from arxiv, 14 pages, 4 figures, 3 tables, 1 algorithm, 1 code listing

Contemporary connected vehicles host numerous applications, such as diagnostics and navigation, and new software is continuously being developed. However, the development process typically requires offline batch processing of large data volumes. In an edge computing approach, data analysts and developers can instead process sensor data directly on computational resources inside vehicles. This enables rapid prototyping to shorten development cycles and reduce the time to create new business values or insights. This paper presents the design, implementation, and operation of the AutoSPADA edge computing platform for distributed data analytics. The platform's design follows scalability, reliability, resource efficiency, privacy, and security principles promoted through mature and industrially proven technologies. In AutoSPADA, computational tasks are general Python scripts, and we provide a library to, for example, read signals from the vehicle and publish results to the cloud. Hence, users only need Python knowledge to use the platform. Moreover, the platform is designed to be extended to support additional programming languages.

噪聲 · 穩健性 · 白盒 · Learning · Networking ·

2023 年 11 月 29 日

Quantum Neural Networks under Depolarization Noise: Exploring White-Box Attacks and Defenses

David Winderl,Nicola Franco,Jeanette Miriam Lorenz

from arxiv, Poster at Quantum Techniques in Machine Learning (QTML) 2023

Leveraging the unique properties of quantum mechanics, Quantum Machine Learning (QML) promises computational breakthroughs and enriched perspectives where traditional systems reach their boundaries. However, similarly to classical machine learning, QML is not immune to adversarial attacks. Quantum adversarial machine learning has become instrumental in highlighting the weak points of QML models when faced with adversarial crafted feature vectors. Diving deep into this domain, our exploration shines light on the interplay between depolarization noise and adversarial robustness. While previous results enhanced robustness from adversarial threats through depolarization noise, our findings paint a different picture. Interestingly, adding depolarization noise discontinued the effect of providing further robustness for a multi-class classification scenario. Consolidating our findings, we conducted experiments with a multi-class classifier adversarially trained on gate-based quantum simulators, further elucidating this unexpected behavior.

通道 · Microsoft Surface · MoDELS · MIMO · Notability ·

2023 年 11 月 29 日

Holographic MIMO Communications with Arbitrary Surface Placements: Near-Field LoS Channel Model and Capacity Limit

Tierui Gong,Li Wei,Chongwen Huang,Zhijia Yang,Jiguang He,Mérouane Debbah,Chau Yuen

from arxiv, double column, 17 pages, 13 figures, accepted by IEEE Journal on Selected Areas in Communications

Envisioned as one of the most promising technologies, holographic multiple-input multiple-output (H-MIMO) recently attracts notable research interests for its great potential in expanding wireless possibilities and achieving fundamental wireless limits. Empowered by the nearly continuous, large and energy-efficient surfaces with powerful electromagnetic (EM) wave control capabilities, H-MIMO opens up the opportunity for signal processing in a more fundamental EM-domain, paving the way for realizing holographic imaging level communications in supporting the extremely high spectral efficiency and energy efficiency in future networks. In this article, we try to implement a generalized EM-domain near-field channel modeling and study its capacity limit of point-to-point H-MIMO systems that equips arbitrarily placed surfaces in a line-of-sight (LoS) environment. Two effective and computational-efficient channel models are established from their integral counterpart, where one is with a sophisticated formula but showcases more accurate, and another is concise with a slight precision sacrifice. Furthermore, we unveil the capacity limit using our channel model, and derive a tight upper bound based upon an elaborately built analytical framework. Our result reveals that the capacity limit grows logarithmically with the product of transmit element area, receive element area, and the combined effects of $1/{{d}_{mn}^2}$, $1/{{d}_{mn}^4}$, and $1/{{d}_{mn}^6}$ over all transmit and receive antenna elements, where $d_{mn}$ indicates the distance between each transmit and receive elements. Numerical evaluations validate the effectiveness of our channel models, and showcase the slight disparity between the upper bound and the exact capacity, which is beneficial for predicting practical system performance.

SLAM · RGB-D · Learning · Extensibility · INFORMS ·

2023 年 11 月 28 日

Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular, Stereo, and RGB-D Cameras

Huajian Huang,Longwei Li,Hui Cheng,Sai-Kit Yeung

The integration of neural rendering and the SLAM system recently showed promising results in joint localization and photorealistic view reconstruction. However, existing methods, fully relying on implicit representations, are so resource-hungry that they cannot run on portable devices, which deviates from the original intention of SLAM. In this paper, we present Photo-SLAM, a novel SLAM framework with a hyper primitives map. Specifically, we simultaneously exploit explicit geometric features for localization and learn implicit photometric features to represent the texture information of the observed environment. In addition to actively densifying hyper primitives based on geometric features, we further introduce a Gaussian-Pyramid-based training method to progressively learn multi-level features, enhancing photorealistic mapping performance. The extensive experiments with monocular, stereo, and RGB-D datasets prove that our proposed system Photo-SLAM significantly outperforms current state-of-the-art SLAM systems for online photorealistic mapping, e.g., PSNR is 30% higher and rendering speed is hundreds of times faster in the Replica dataset. Moreover, the Photo-SLAM can run at real-time speed using an embedded platform such as Jetson AGX Orin, showing the potential of robotics applications.

MoDELS · 優化器 · 穩健性 · 可約的 · 可理解性 ·

2023 年 11 月 26 日

Variational Exploration Module VEM: A Cloud-Native Optimization and Validation Tool for Geospatial Modeling and AI Workflows

Julian Kuehnert,Hiwot Tadesse,Chris Dearden,Rosie Lickorish,Paolo Fraccaro,Anne Jones,Blair Edwards,Sekou L. Remy,Peter Melling,Tim Culmer

from arxiv, Submitted to IAAI 2024: Deployed Innovative Tools for Enabling AI Applications

Geospatial observations combined with computational models have become key to understanding the physical systems of our environment and enable the design of best practices to reduce societal harm. Cloud-based deployments help to scale up these modeling and AI workflows. Yet, for practitioners to make robust conclusions, model tuning and testing is crucial, a resource intensive process which involves the variation of model input variables. We have developed the Variational Exploration Module which facilitates the optimization and validation of modeling workflows deployed in the cloud by orchestrating workflow executions and using Bayesian and machine learning-based methods to analyze model behavior. User configurations allow the combination of diverse sampling strategies in multi-agent environments. The flexibility and robustness of the model-agnostic module is demonstrated using real-world applications.

2022 年 9 月 21 日

Deep Learning for Medical Image Segmentation: Tricks, Challenges and Future Directions

Dong Zhang,Yi Lin,Hao Chen,Zhuotao Tian,Xin Yang,Jinhui Tang,Kwang Ting Cheng

from arxiv, Under consideration

Over the past few years, the rapid development of deep learning technologies for computer vision has greatly promoted the performance of medical image segmentation (MedISeg). However, the recent MedISeg publications usually focus on presentations of the major contributions (e.g., network architectures, training strategies, and loss functions) while unwittingly ignoring some marginal implementation details (also known as "tricks"), leading to a potential problem of the unfair experimental result comparisons. In this paper, we collect a series of MedISeg tricks for different model implementation phases (i.e., pre-training model, data pre-processing, data augmentation, model implementation, model inference, and result post-processing), and experimentally explore the effectiveness of these tricks on the consistent baseline models. Compared to paper-driven surveys that only blandly focus on the advantages and limitation analyses of segmentation models, our work provides a large number of solid experiments and is more technically operable. With the extensive experimental results on both the representative 2D and 3D medical image datasets, we explicitly clarify the effect of these tricks. Moreover, based on the surveyed tricks, we also open-sourced a strong MedISeg repository, where each of its components has the advantage of plug-and-play. We believe that this milestone work not only completes a comprehensive and complementary survey of the state-of-the-art MedISeg approaches, but also offers a practical guide for addressing the future medical image processing challenges including but not limited to small dataset learning, class imbalance learning, multi-modality learning, and domain adaptation. The code has been released at: //github.com/hust-linyi/MedISeg

監督 · Attention · Taxonomy · Learning · 有向 ·

2022 年 7 月 4 日

A Survey on Label-efficient Deep Segmentation: Bridging the Gap between Weak Supervision and Dense Prediction

Wei Shen,Zelin Peng,Xuehui Wang,Huayu Wang,Jiazhong Cen,Dongsheng Jiang,Lingxi Xie,Xiaokang Yang,Qi Tian

The rapid development of deep learning has made a great progress in segmentation, one of the fundamental tasks of computer vision. However, the current segmentation algorithms mostly rely on the availability of pixel-level annotations, which are often expensive, tedious, and laborious. To alleviate this burden, the past years have witnessed an increasing attention in building label-efficient, deep-learning-based segmentation algorithms. This paper offers a comprehensive review on label-efficient segmentation methods. To this end, we first develop a taxonomy to organize these methods according to the supervision provided by different types of weak labels (including no supervision, coarse supervision, incomplete supervision and noisy supervision) and supplemented by the types of segmentation problems (including semantic segmentation, instance segmentation and panoptic segmentation). Next, we summarize the existing label-efficient segmentation methods from a unified perspective that discusses an important question: how to bridge the gap between weak supervision and dense prediction -- the current methods are mostly based on heuristic priors, such as cross-pixel similarity, cross-label constraint, cross-view consistency, cross-image relation, etc. Finally, we share our opinions about the future research directions for label-efficient deep segmentation.

Learning · 圖 · Extensibility · motivation · 講稿 ·

2022 年 6 月 27 日

FederatedScope-GNN: Towards a Unified, Comprehensive and Efficient Package for Federated Graph Learning

Zhen Wang,Weirui Kuang,Yuexiang Xie,Liuyi Yao,Yaliang Li,Bolin Ding,Jingren Zhou

from arxiv, Accpeted by KDD'2022; We have released FederatedScope for users on //github.com/alibaba/FederatedScope

The incredible development of federated learning (FL) has benefited various tasks in the domains of computer vision and natural language processing, and the existing frameworks such as TFF and FATE has made the deployment easy in real-world applications. However, federated graph learning (FGL), even though graph data are prevalent, has not been well supported due to its unique characteristics and requirements. The lack of FGL-related framework increases the efforts for accomplishing reproducible research and deploying in real-world applications. Motivated by such strong demand, in this paper, we first discuss the challenges in creating an easy-to-use FGL package and accordingly present our implemented package FederatedScope-GNN (FS-G), which provides (1) a unified view for modularizing and expressing FGL algorithms; (2) comprehensive DataZoo and ModelZoo for out-of-the-box FGL capability; (3) an efficient model auto-tuning component; and (4) off-the-shelf privacy attack and defense abilities. We validate the effectiveness of FS-G by conducting extensive experiments, which simultaneously gains many valuable insights about FGL for the community. Moreover, we employ FS-G to serve the FGL application in real-world E-commerce scenarios, where the attained improvements indicate great potential business benefits. We publicly release FS-G, as submodules of FederatedScope, at //github.com/alibaba/FederatedScope to promote FGL's research and enable broad applications that would otherwise be infeasible due to the lack of a dedicated package.