
This paper proposes a novel visual simultaneous localization and mapping (SLAM) system called Hybrid Depth-augmented Panoramic Visual SLAM (HDPV-SLAM), which employs a panoramic camera and a tilted multi-beam LiDAR scanner to generate accurate and metrically scaled trajectories. HDPV-SLAM builds on the RGB-D SLAM design, adding depth information to visual features. It aims to solve two major issues that hinder the performance of similar SLAM systems. The first is the sparseness of LiDAR depth, which makes it difficult to associate depth with the visual features extracted from the RGB image. To address this issue, we propose a deep learning-based depth estimation module that iteratively densifies sparse LiDAR depth. The second issue is the difficulty of depth association caused by the lack of horizontal overlap between the panoramic camera and the tilted LiDAR sensor. To overcome this difficulty, we present a hybrid depth association module that optimally combines depth information estimated by two independent procedures: feature-based triangulation and depth estimation. During feature tracking, this module aims to use whichever depth is more accurate, choosing between the depth triangulated from tracked visual features and the depth corrected by deep learning. We evaluated the efficacy of HDPV-SLAM using the 18.95 km-long York University and Teledyne Optech (YUTO) MMS dataset. The experimental results demonstrate that the two proposed modules contribute substantially to the performance of HDPV-SLAM, which surpasses that of state-of-the-art (SOTA) SLAM systems.

Related content

LIDAR


Performer · Kernelization · Learning · Processing (programming language) · Active learning

August 15, 2023

Graph-Structured Kernel Design for Power Flow Learning using Gaussian Processes

Parikshit Pareek,Deepjyoti Deka,Sidhant Misra

from arxiv, 10 pages

This paper presents a physics-inspired graph-structured kernel designed for power flow learning using Gaussian Processes (GPs). The kernel, named the vertex-degree kernel (VDK), relies on a latent decomposition of the voltage-injection relationship based on the network graph or topology. Notably, the VDK design avoids the need to solve optimization problems for kernel search. To enhance efficiency, we also explore a graph-reduction approach to obtain a VDK representation with fewer terms. Additionally, we propose a novel network-swipe active learning scheme, which intelligently selects sequential training inputs to accelerate the learning of VDK. Leveraging the additive structure of VDK, the active learning algorithm performs a block-descent type procedure on the GP's predictive variance, which serves as a proxy for information gain. Simulations demonstrate that the proposed VDK-GP achieves a more than twofold reduction in sample complexity compared to a full GP on medium-scale 500-Bus and large-scale 1354-Bus power systems. The network-swipe algorithm outperforms the mean performance of 500 random trials on test predictions by twofold for medium-sized 500-Bus systems, and the best performance of 25 random trials for large-scale 1354-Bus systems by 10%. Moreover, we demonstrate the proposed method's performance in uncertainty quantification applications with distributionally shifted test data sets.
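To make the idea of using predictive variance as a proxy for information gain concrete, the following is a minimal sketch, not the paper's VDK or its network-swipe procedure. It assumes a generic additive kernel built from squared-exponential terms over hypothetical groups of input dimensions, and it simply picks the candidate input with the largest GP predictive variance. All function names, the `groups` argument, and the kernel choice are assumptions made for illustration.

```python
import numpy as np


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)


def additive_kernel(a, b, groups, length=1.0):
    """Sum of RBF kernels, each acting on one (hypothetical) group of input dimensions."""
    return sum(rbf(a[:, g], b[:, g], length) for g in groups)


def posterior_variance(x_train, x_cand, groups, noise=1e-6):
    """GP predictive variance at each candidate; it does not depend on observed targets."""
    k_tt = additive_kernel(x_train, x_train, groups) + noise * np.eye(len(x_train))
    k_ct = additive_kernel(x_cand, x_train, groups)
    # Each RBF term has k(x, x) = 1, so the prior variance equals the number of groups.
    prior = np.full(len(x_cand), float(len(groups)))
    solved = np.linalg.solve(k_tt, k_ct.T)
    return prior - np.einsum("ij,ji->i", k_ct, solved)


def select_next(x_train, x_cand, groups):
    """Index of the candidate with the largest predictive variance."""
    return int(np.argmax(posterior_variance(x_train, x_cand, groups)))
```

With one training point at the origin and `groups = [[0], [1]]`, a candidate far from the origin is selected over a nearby one, because the GP is least certain there. A block-descent variant, as described in the abstract, would optimize over one group of inputs at a time instead of scoring a fixed candidate list.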

Attractor · Networking · End-to-end · INFORMS · Processing (programming language)

August 15, 2023

Attention-based Encoder-Decoder Network for End-to-End Neural Speaker Diarization with Target Speaker Attractor

Zhengyang Chen,Bing Han,Shuai Wang,Yanmin Qian

from arxiv, Accepted by InterSpeech 2023

This paper proposes a novel Attention-based Encoder-Decoder network for End-to-End Neural speaker Diarization (AED-EEND). In the AED-EEND system, we incorporate the target speaker enrollment information used in target speaker voice activity detection (TS-VAD) to calculate the attractor, which mitigates the speaker permutation problem and eases model convergence. For training, we propose a teacher-forcing strategy that obtains the enrollment information from the ground-truth labels. Furthermore, we propose three heuristic decoding methods to identify the enrollment area for each speaker during evaluation. Additionally, we enhance the LSTM attractor calculation network used in the end-to-end encoder-decoder based attractor calculation (EEND-EDA) system by incorporating an attention-based model. With this attention-based attractor decoder, our proposed AED-EEND system outperforms both the EEND-EDA and TS-VAD systems with only 0.5s of enrollment data.

Microsoft Surface · Cascade · Signed distance · Cost · Fidelity

August 14, 2023

C2F2NeUS: Cascade Cost Frustum Fusion for High Fidelity and Generalizable Neural Surface Reconstruction

Luoyuan Xu,Tao Guan,Yuesong Wang,Wenkai Liu,Zhaojie Zeng,Junle Wang,Wei Yang

from arxiv, Accepted by ICCV2023

There is an emerging effort to combine two popular 3D frameworks, Multi-View Stereo (MVS) and Neural Implicit Surfaces (NIS), with a specific focus on the few-shot / sparse view setting. In this paper, we introduce a novel integration scheme that combines multi-view stereo with neural signed distance function representations, which potentially overcomes the limitations of both methods. MVS uses per-view depth estimation and cross-view fusion to generate accurate surfaces, while NIS relies on a common coordinate volume. Based on this strategy, we propose to construct a per-view cost frustum for finer geometry estimation, and then fuse cross-view frustums and estimate the implicit signed distance functions to tackle artifacts caused by noise and holes in the reconstructed surface. We further apply a cascade frustum fusion strategy to effectively capture global-local information and structural consistency. Finally, we apply cascade sampling and a pseudo-geometric loss to foster stronger integration between the two architectures. Extensive experiments demonstrate that our method reconstructs robust surfaces and outperforms existing state-of-the-art methods.

Transform · Performer · Weight · Convolution · Swin Transformer

August 14, 2023

SCSC: Spatial Cross-scale Convolution Module to Strengthen both CNNs and Transformers

Xijun Wang,Xiaojie Chu,Chunrui Han,Xiangyu Zhang

from arxiv, ICCV2023 Workshop (New Ideas in Vision Transformers)

This paper presents a module, Spatial Cross-scale Convolution (SCSC), which is verified to be effective in improving both CNNs and Transformers. CNNs and Transformers have been successful in a variety of tasks. For Transformers in particular, a growing number of works achieve state-of-the-art performance in the computer vision community. Researchers have therefore started to explore the mechanisms of these architectures. Large receptive fields, sparse connections, weight sharing, and dynamic weights have been considered key to designing effective base models. However, some issues remain: large dense kernels and self-attention are inefficient, and large receptive fields make it hard to capture local features. Motivated by this analysis, we design a general module that incorporates these design keys to enhance both CNNs and Transformers. SCSC introduces an efficient spatial cross-scale encoder and spatial embed module to capture assorted features in one layer. On the face recognition task, FaceResNet with SCSC improves by 2.7% with 68% fewer FLOPs and 79% fewer parameters. On the ImageNet classification task, Swin Transformer with SCSC achieves even better performance with 22% fewer FLOPs, and ResNet with SCSC improves by 5.3% with similar complexity. Furthermore, a traditional network (e.g., ResNet) embedded with SCSC can match Swin Transformer's performance.

Diversity · MoDELS · Analysis · Dataset · Composite data

August 14, 2023

#InsTag: Instruction Tagging for Diversity and Complexity Analysis

Keming Lu,Hongyi Yuan,Zheng Yuan,Runji Lin,Junyang Lin,Chuanqi Tan,Chang Zhou

Foundation language models obtain their instruction-following ability through supervised fine-tuning (SFT). Diversity and complexity are considered critical factors of a successful SFT dataset, yet their definitions remain obscure and lack quantitative analysis. In this work, we propose InsTag, an open-set fine-grained tagger, to tag samples within SFT datasets based on semantics and intentions, and we define instruction diversity and complexity in terms of these tags. We obtain 6.6K tags to describe comprehensive user queries. We then analyze popular open-source SFT datasets and find that model ability grows with more diverse and complex data. Based on this observation, we propose a data selector based on InsTag to select 6K diverse and complex samples from open-source datasets and fine-tune models on InsTag-selected data. The resulting models, TagLM, outperform open-source models based on considerably larger SFT data when evaluated by MT-Bench, echoing the importance of query diversity and complexity. We open-source InsTag at //github.com/OFA-Sys/InsTag.

MoDELS · Weight · Functional · Estimation/estimator · Sample

August 13, 2023

csSampling: An R Package for Bayesian Models for Complex Survey Data

Ryan Hornby,Matthew R. Williams,Terrance D. Savitsky,Mahmoud Elkasabi

from arxiv, 22 pages, 5 figures

We present csSampling, an R package for estimation of Bayesian models for data collected from complex survey samples. csSampling combines functionality from the probabilistic programming language Stan (via the rstan and brms R packages) and the handling of complex survey data from the survey R package. Under this approach, the user creates a survey-weighted model in brms or provides a custom weighted model via rstan. Survey design information is provided via the svydesign function of the survey package. The cs_sampling function of csSampling estimates the weighted Stan model and provides an asymptotic covariance correction for model mis-specification due to using survey sampling weights as plug-in values in the likelihood. This is often known as a "design effect", which is the ratio between the variance from a complex survey sample and that from a simple random sample of the same size. The resulting adjusted posterior draws can then be used for the usual Bayesian inference while also achieving frequentist properties of asymptotic consistency and correct uncertainty (e.g. coverage).
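The design effect mentioned above is a simple ratio, and a short sketch may help fix the definition. The code below is illustrative Python, not csSampling's R implementation or its covariance correction. `design_effect` follows the ratio stated in the abstract. `kish_design_effect` is a different, well-known approximation (Kish's design effect due to unequal weighting), swapped in here only to show how weights alone can inflate variance. Both function names are hypothetical.

```python
def design_effect(var_complex, var_srs):
    """Ratio of the complex-design variance to the simple-random-sample variance."""
    if var_srs <= 0:
        raise ValueError("var_srs must be positive")
    return var_complex / var_srs


def kish_design_effect(weights):
    """Kish's approximation: n * sum(w^2) / (sum(w))^2 for sampling weights w."""
    n = len(weights)
    total = sum(weights)
    if n == 0 or total == 0:
        raise ValueError("weights must be non-empty with a non-zero sum")
    return n * sum(w * w for w in weights) / (total * total)


def effective_sample_size(n, deff):
    """Sample size of a simple random sample with the same precision."""
    return n / deff
```

Equal weights give a Kish design effect of 1, meaning no loss of precision relative to simple random sampling. Unequal weights push the value above 1 and shrink the effective sample size accordingly.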

Python · Projection · Java · SimPLe · TOOLS

August 13, 2023

Py-Tetrad and RPy-Tetrad: A New Python Interface with R Support for Tetrad Causal Search

Joseph D. Ramsey,Bryan Andrews

from arxiv, Causal Analysis Workshop Series (CAWS) 2023, 12 pages, 4 Figures, 2 Tables

We give novel Python and R interfaces for the (Java) Tetrad project for causal modeling, search, and estimation. The Tetrad project is a mainstay in the literature, having been under consistent development for over 30 years. Some of its algorithms are now classics, like PC and FCI; others are recent developments. It is increasingly the case, however, that researchers need to access the underlying Java code from Python or R. Existing methods for doing this are inadequate. We provide new, up-to-date methods using the JPype Python-Java interface and the Reticulate Python-R interface, directly solving these issues. With the addition of some simple tools and the provision of working examples for both Python and R, using JPype and Reticulate to interface Python and R with Tetrad is straightforward and intuitive.

Performer · Peak · Statistic · MIMO · Reducible

August 13, 2023

Joint Beamforming and Antenna Movement Design for Moveable Antenna Systems Based on Statistical CSI

Xintai Chen,Biqian Feng,Yongpeng Wu,Derrick Wing Kwan Ng,Robert Schober

This paper studies a novel movable antenna (MA)-enhanced multiple-input multiple-output (MIMO) system that leverages the corresponding spatial degrees of freedom (DoFs) to improve the performance of wireless communications. We aim to maximize the achievable rate by jointly optimizing the MA positions and the transmit covariance matrix based on statistical channel state information (CSI). To solve the resulting design problem, we develop a constrained stochastic successive convex approximation (CSSCA) algorithm applicable to the general movement mode. Furthermore, we propose two simplified antenna movement modes, namely the linear movement mode and the planar movement mode, to facilitate efficient antenna movement and reduce the computational complexity of the CSSCA algorithm. Numerical results show that the considered MA-enhanced system can significantly improve the achievable rate compared to conventional MIMO systems employing uniform planar arrays (UPAs), and that the proposed planar movement mode performs close to the upper bound achieved by the general movement mode.

Task-oriented dialogue system · Institute for AI Industry Research, Tsinghua University · Knowledge · Identifiable · Decomposable

August 11, 2023

Dialogue Possibilities between a Human Supervisor and UAM Air Traffic Management: Route Alteration

Jeongseok Kim,Kangjin Kim

from arxiv, 18 pages, 2 figures, accepted to the Advances in Artificial Intelligence and Machine Learning (AAIML) journal

This paper introduces a novel approach to detour management in Urban Air Traffic Management (UATM) using knowledge representation and reasoning. It aims to understand the complexities and requirements of UAM detours, enabling a method that quickly identifies safe and efficient routes in a carefully sampled environment. This method, implemented in Answer Set Programming, uses non-monotonic reasoning and a two-phase conversation between a human manager and the UATM system, considering factors such as safety and potential impacts. The robustness and efficacy of the proposed method were validated through several queries from two simulation scenarios, contributing to the symbiosis of human knowledge and advanced AI techniques. The paper provides an introduction citing relevant studies, followed by the problem formulation, solution, discussion, and concluding comments.

Microsoft Surface · Optimizer · Signed distance · Representation · HTTPS

August 11, 2023

NeTO: Neural Reconstruction of Transparent Objects with Self-Occlusion Aware Refraction-Tracing

Zongcheng Li,Xiaoxiao Long,Yusen Wang,Tuo Cao,Wenping Wang,Fei Luo,Chunxia Xiao

from arxiv, Experiments involving sparse views have some flaws, mainly including Figure 1 in the introduction, Figure 7 and Table 1 in the experiments. In order to maintain correctness and fairness, we would like to retract the paper first

We present a novel method, called NeTO, for capturing the 3D geometry of solid transparent objects from 2D images via volume rendering. Reconstructing transparent objects is a very challenging task that is ill-suited to general-purpose reconstruction techniques because of specular light transport. Although existing refraction-tracing based methods, designed specifically for this task, achieve impressive results, they still suffer from unstable optimization and loss of fine details, since the explicit surface representation they adopt is difficult to optimize and the self-occlusion problem is ignored during refraction tracing. In this paper, we propose to use an implicit Signed Distance Function (SDF) as the surface representation and to optimize the SDF field via volume rendering with self-occlusion aware refractive ray tracing. The implicit representation enables our method to produce high-quality reconstructions even with a limited set of images, and the self-occlusion aware strategy makes it possible to accurately reconstruct self-occluded regions. Experiments show that our method achieves faithful reconstruction results and outperforms prior works by a large margin. Visit our project page at //www.xxlong.site/NeTO/
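As background for the refraction tracing referred to above, the sketch below shows the standard vector form of Snell's law for a single ray at a single interface. It is not NeTO's implementation and it does not model self-occlusion or the SDF. The function name `refract` and its argument conventions (unit incident direction, unit normal facing the incoming ray, `eta` as the ratio of refractive indices n1/n2) are assumptions chosen for this example.

```python
import numpy as np


def refract(direction, normal, eta):
    """Refract a unit direction at a surface with a unit normal facing the incoming ray.

    eta is n1 / n2 (index of the medium being left over the medium being entered).
    Returns the unit refracted direction, or None on total internal reflection.
    """
    d = np.asarray(direction, dtype=float)
    n = np.asarray(normal, dtype=float)
    cos_i = -float(np.dot(d, n))
    sin2_t = eta**2 * (1.0 - cos_i**2)
    if sin2_t > 1.0:
        return None  # total internal reflection: no transmitted ray
    cos_t = np.sqrt(1.0 - sin2_t)
    return eta * d + (eta * cos_i - cos_t) * n
```

For a ray entering glass (index 1.5) from air at 30 degrees, the tangential component of the result is sin(30°)/1.5, which is Snell's law. In an SDF-based pipeline the normal would typically come from the normalized SDF gradient at the hit point; that step is omitted here.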