In the field of indoor robotics, accurately navigating and mapping in dynamic environments using point clouds can be a challenging task due to the presence of dynamic points. These dynamic points are often represented by people in indoor environments, but in industrial settings with moving machinery, there can be various types of dynamic points. This study introduces DynaHull, a novel technique designed to enhance indoor mapping accuracy by effectively removing dynamic points from point clouds. DynaHull works by leveraging the observation that, over multiple scans, stationary points have a higher density than dynamic ones. Furthermore, DynaHull addresses mapping challenges related to unevenly distributed points by clustering the map into smaller sections. In each section, the density factor of each point is determined by dividing the number of neighbors by the volume these neighboring points occupy, computed using a convex hull method. The algorithm removes the dynamic points using an adaptive threshold based on the point count of each cluster, thus reducing false positives. The performance of DynaHull was compared to state-of-the-art techniques, such as ERASOR, Removert, OctoMap, and a baseline statistical outlier removal from Open3D, by comparing each method to a ground truth map created during a low-activity period in which only a few dynamic points were present. The results indicated that DynaHull outperformed these techniques in various metrics, notably in the Earth Mover's Distance. This research contributes to indoor robotics by providing efficient methods for dynamic point removal, essential for accurate mapping and localization in dynamic environments.
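The density factor described above (neighbor count divided by the space the neighbors occupy) can be illustrated with a minimal sketch. This is not the authors' implementation: it works in 2D, so it uses convex hull area instead of volume, it includes the query point itself in the hull, and it uses a fixed threshold rather than the adaptive per-cluster threshold from the abstract. The function names, `k`, and `threshold` are hypothetical choices made for illustration only.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_area(points):
    """Area of the convex hull via the shoelace formula (0.0 if degenerate)."""
    hull = convex_hull(points)
    if len(hull) < 3:
        return 0.0
    twice_area = 0.0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def density_factor(points, index, k=4):
    """Neighbor count divided by the hull area of the point and its k nearest neighbors.

    Assumption: including the query point in the hull is a choice of this sketch.
    """
    p = points[index]
    others = [q for i, q in enumerate(points) if i != index]
    others.sort(key=lambda q: math.dist(p, q))
    neighbours = others[:k]
    area = hull_area(neighbours + [p])
    return math.inf if area == 0.0 else len(neighbours) / area


def remove_sparse(points, threshold, k=4):
    """Keep points whose density factor reaches a fixed (hypothetical) threshold."""
    return [p for i, p in enumerate(points) if density_factor(points, i, k) >= threshold]


if __name__ == "__main__":
    # A dense 3x3 patch stands in for static structure; one far point stands in
    # for a dynamic observation that was seen only once.
    cloud = [(i * 0.1, j * 0.1) for i in range(3) for j in range(3)] + [(5.0, 5.0)]
    print(len(remove_sparse(cloud, threshold=50.0)))
```

In this toy setup the isolated point gets a much lower density factor than the points in the dense patch, which is the property the abstract relies on. A 3D version would replace the area computation with a convex hull volume.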

Related content

Point cloud

A point cloud obtained by laser measurement contains three-dimensional coordinates (XYZ) and laser reflection intensity (Intensity). A point cloud obtained by photogrammetry contains three-dimensional coordinates (XYZ) and color information (RGB). A point cloud obtained by combining laser measurement and photogrammetry contains three-dimensional coordinates (XYZ), laser reflection intensity (Intensity), and color information (RGB). Once the spatial coordinates of every sampled point on an object's surface have been acquired, the result is a set of points, called a "point cloud".

Round · contrastive · Robots · MoDELS · Performer

February 27, 2024

4CNet: A Confidence-Aware, Contrastive, Conditional, Consistency Model for Robot Map Prediction in Multi-Robot Environments

Aaron Hao Tan,Siddarth Narasimhan,Goldie Nejat

from arxiv, 14 pages, 10 figures

Mobile robots in unknown cluttered environments with irregularly shaped obstacles often face sensing, energy, and communication challenges which directly affect their ability to explore these environments. In this paper, we introduce a novel deep learning method, Confidence-Aware Contrastive Conditional Consistency Model (4CNet), for mobile robot map prediction during resource-limited exploration in multi-robot environments. 4CNet uniquely incorporates: 1) a conditional consistency model for map prediction in irregularly shaped unknown regions, 2) a contrastive map-trajectory pretraining framework for a trajectory encoder that extracts spatial information from the trajectories of nearby robots during map prediction, and 3) a confidence network to measure the uncertainty of map prediction for effective exploration under resource constraints. We incorporate 4CNet within our proposed robot exploration with map prediction architecture, 4CNet-E. We then conduct extensive comparison studies with 4CNet-E and state-of-the-art heuristic and learning methods to investigate both map prediction and exploration performance in environments consisting of uneven terrain and irregularly shaped obstacles. Results showed that 4CNet-E obtained statistically significantly higher prediction accuracy and area coverage with varying environment sizes, numbers of robots, energy budgets, and communication limitations. Real-world mobile robot experiments were performed and validated the feasibility and generalizability of 4CNet-E for mobile robot map prediction and exploration.

Video captioning · Multimodal · Learning · Knowledge · Performer

February 27, 2024

MCF-VC: Mitigate Catastrophic Forgetting in Class-Incremental Learning for Multimodal Video Captioning

Huiyu Xiong,Lanxiao Wang,Heqian Qiu,Taijin Zhao,Benliu Qiu,Hongliang Li

from arxiv, 13 pages

To address the problem of catastrophic forgetting caused by the invisibility of old categories in sequential input, existing work based on relatively simple categorization tasks has made some progress. In contrast, video captioning is a more complex task in a multimodal scenario, which has not been explored in the field of incremental learning. After identifying this stability-plasticity problem when analyzing video with sequential input, we propose a method to Mitigate Catastrophic Forgetting in class-incremental learning for multimodal Video Captioning (MCF-VC). To effectively maintain good performance on old tasks at the macro level, we design Fine-grained Sensitivity Selection (FgSS) based on the Mask of Linear's Parameters and Fisher Sensitivity to pick useful knowledge from old tasks. Further, to better constrain the knowledge characteristics of old and new tasks at the specific feature level, we create Two-stage Knowledge Distillation (TsKD), which is able to learn the new task well while weighing the old task. Specifically, we design two distillation losses, which constrain the cross-modal semantic information of the semantic attention feature map and the textual information of the final outputs, respectively, so that the inter-model and intra-model stylized knowledge of the old class is retained while learning the new class. To illustrate the ability of our model to resist forgetting, we design a metric, CIDER_t, to detect the stage forgetting rate. Our experiments on the public dataset MSR-VTT show that the proposed method significantly resists the forgetting of previous tasks without replaying old samples, and performs well on the new task.

Facebook AI Research · GROUP · Everything (software) · Scenario · MoDELS

February 26, 2024

FRAPPé: A Group Fairness Framework for Post-Processing Everything

Alexandru Țifrea,Preethi Lahoti,Ben Packer,Yoni Halpern,Ahmad Beirami,Flavien Prost

from arxiv, Presubmission

Despite achieving promising fairness-error trade-offs, in-processing mitigation techniques for group fairness cannot be employed in numerous practical applications with limited computation resources or no access to the training pipeline of the prediction model. In these situations, post-processing is a viable alternative. However, current methods are tailored to specific problem settings and fairness definitions and hence, are not as broadly applicable as in-processing. In this work, we propose a framework that turns any regularized in-processing method into a post-processing approach. This procedure prescribes a way to obtain post-processing techniques for a much broader range of problem settings than the prior post-processing literature. We show theoretically and through extensive experiments that our framework preserves the good fairness-error trade-offs achieved with in-processing and can improve over the effectiveness of prior post-processing methods. Finally, we demonstrate several advantages of a modular mitigation strategy that disentangles the training of the prediction model from the fairness mitigation, including better performance on tasks with partial group labels.

Round · Server · ForCES · Continuity · AIM

February 26, 2024

Triad: Trusted Timestamps in Untrusted Environments

Gabriel P. Fernandez,Andrey Brito,Christof Fetzer

We aim to provide trusted time measurement mechanisms to applications and cloud infrastructure deployed in environments that could harbor potential adversaries, including the hardware infrastructure provider. Despite Trusted Execution Environments (TEEs) providing multiple security functionalities, timestamps from the Operating System are not covered. Nevertheless, some services require time for validating permissions or ordering events. To address that need, we introduce Triad, a trusted timestamp dispatcher of time readings. The solution provides trusted timestamps enforced by mutually supportive enclave-based clock servers that create a continuous trusted timeline. We leverage enclave properties such as forced exits and CPU-based counters to mitigate attacks on the server's timestamp counters. Triad produces trusted, confidential, monotonically-increasing timestamps with bounded error and desirable, non-trivial properties. Our implementation relies on Intel SGX and SCONE, allowing transparent usage. We evaluate Triad's error and behavior in multiple dimensions.

State estimation · Estimation/estimator · Robots · SLAM · Optimizer

February 23, 2024

CoLRIO: LiDAR-Ranging-Inertial Centralized State Estimation for Robotic Swarms

Shipeng Zhong,Hongbo Chen,Yuhua Qi,Dapeng Feng,Zhiqiang Chen,Jin Wu,Weisong Wen,Ming Liu

Collaborative state estimation using heterogeneous sensors is a fundamental prerequisite for robotic swarms operating in GPS-denied environments, posing a significant research challenge. In this paper, we introduce a centralized system to facilitate collaborative LiDAR-ranging-inertial state estimation, enabling robotic swarms to operate without the need for anchor deployment. The system efficiently distributes computationally intensive tasks to a central server, thereby reducing the computational burden on individual robots for local odometry calculations. The server back-end establishes a global reference by leveraging shared data and refining joint pose graph optimization through place recognition, global optimization techniques, and removal of outlier data to ensure precise and robust collaborative state estimation. Extensive evaluations of our system, utilizing both publicly available datasets and our custom datasets, demonstrate significant enhancements in the accuracy of collaborative SLAM estimates. Moreover, our system exhibits remarkable proficiency in large-scale missions, seamlessly enabling ten robots to collaborate effectively in performing SLAM tasks. To contribute to the research community, we will make our code open-source and accessible at \url{//github.com/PengYu-team/Co-LRIO}.

MoDELS · Vision · Denoising · Forward · Identifiable

September 10, 2022

Diffusion Models in Vision: A Survey

Florinel-Alin Croitoru,Vlad Hondru,Radu Tudor Ionescu,Mubarak Shah

from arxiv, 20 pages, 3 figures

Denoising diffusion models represent a recent emerging topic in computer vision, demonstrating remarkable results in the area of generative modeling. A diffusion model is a deep generative model that is based on two stages, a forward diffusion stage and a reverse diffusion stage. In the forward diffusion stage, the input data is gradually perturbed over several steps by adding Gaussian noise. In the reverse stage, a model is tasked with recovering the original input data by learning to gradually reverse the diffusion process, step by step. Diffusion models are widely appreciated for the quality and diversity of the generated samples, despite their known computational burdens, i.e. low speeds due to the high number of steps involved during sampling. In this survey, we provide a comprehensive review of articles on denoising diffusion models applied in vision, comprising both theoretical and practical contributions in the field. First, we identify and present three generic diffusion modeling frameworks, which are based on denoising diffusion probabilistic models, noise conditioned score networks, and stochastic differential equations. We further discuss the relations between diffusion models and other deep generative models, including variational auto-encoders, generative adversarial networks, energy-based models, autoregressive models and normalizing flows. Then, we introduce a multi-perspective categorization of diffusion models applied in computer vision. Finally, we illustrate the current limitations of diffusion models and envision some interesting directions for future research.
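The forward diffusion stage mentioned above can be sketched for the denoising diffusion probabilistic model case. The sketch assumes a linear variance schedule and uses the standard closed form x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps, where abar_t is the running product of (1 - beta_s). The schedule endpoints, the number of steps, and all function names are illustrative assumptions, not values taken from the survey.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_1..beta_T, linearly spaced (assumed schedule)."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """abar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_diffuse(x0, t, alpha_bars, rng):
    """Draw x_t from N(sqrt(abar_t) * x0, (1 - abar_t) * I) in a single step."""
    a = alpha_bars[t]
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for v in x0]


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    x0 = [1.0, -0.5, 0.25]
    for t in (0, 499, 999):
        print(t, forward_diffuse(x0, t, alpha_bars, rng))
```

Early steps leave the input almost unchanged, while at the last step abar_t is close to zero, so the sample is nearly pure Gaussian noise. The reverse stage, which requires a trained network, is not covered here.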

MoDELS · Graph · Knowledge graph · entity · Link prediction

October 27, 2021

Rot-Pro: Modeling Transitivity by Projection in Knowledge Graph Embedding

Tengwei Song,Jie Luo,Lei Huang

from arxiv, 10 pages, 6 figures, to be published in NeurIPS 2021

Knowledge graph embedding models learn the representations of entities and relations in knowledge graphs for predicting missing links (relations) between entities. Their effectiveness is deeply affected by their ability to model and infer different relation patterns such as symmetry, asymmetry, inversion, composition and transitivity. Although existing models are already able to model many of these relation patterns, transitivity, a very common relation pattern, has still not been fully supported. In this paper, we first theoretically show that transitive relations can be modeled with projections. We then propose the Rot-Pro model, which combines projection and relational rotation. We prove that Rot-Pro can infer all the above relation patterns. Experimental results show that the proposed Rot-Pro model effectively learns the transitivity pattern and achieves state-of-the-art results on the link prediction task in datasets containing transitive relations.

Graph · GPU · Node · Neural Networks · Networking

February 5, 2020

MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding

Xinyu Fu,Jiani Zhang,Ziqiao Meng,Irwin King

from arxiv, To appear at WWW 2020; 11 pages, 4 figures

A large number of real-world graphs or networks are inherently heterogeneous, involving a diversity of node types and relation types. Heterogeneous graph embedding aims to embed the rich structural and semantic information of a heterogeneous graph into low-dimensional node representations. Existing models usually define multiple metapaths in a heterogeneous graph to capture the composite relations and guide neighbor selection. However, these models either omit node content features, discard intermediate nodes along the metapath, or only consider one metapath. To address these three limitations, we propose a new model named Metapath Aggregated Graph Neural Network (MAGNN) to boost the final performance. Specifically, MAGNN employs three major components, i.e., the node content transformation to encapsulate input node attributes, the intra-metapath aggregation to incorporate intermediate semantic nodes, and the inter-metapath aggregation to combine messages from multiple metapaths. Extensive experiments on three real-world heterogeneous graph datasets for node classification, node clustering, and link prediction show that MAGNN achieves more accurate prediction results than state-of-the-art baselines.

state-of-the-art · Comprehensibility · BERT · Denoising autoencoder · Performer

June 19, 2019

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Zhilin Yang,Zihang Dai,Yiming Yang,Jaime Carbonell,Ruslan Salakhutdinov,Quoc V. Le

from arxiv, Pretrained models and code are available at //github.com/zihangdai/xlnet

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, XLNet outperforms BERT on 20 tasks, often by a large margin, and achieves state-of-the-art results on 18 tasks including question answering, natural language inference, sentiment analysis, and document ranking.
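The idea of maximizing likelihood over permutations of the factorization order can be made concrete with a small sketch. For a sampled order z, the token at position z[t] is predicted from the tokens at positions z[0..t-1], so across different orders a position can be conditioned on context from both its left and its right. The helper names below are hypothetical, and the sketch only enumerates prediction targets and their visible context; it does not reproduce the model's attention mechanism or training procedure.

```python
import random


def sample_order(length, rng):
    """Sample one factorization order: a random permutation of the positions."""
    order = list(range(length))
    rng.shuffle(order)
    return order


def permutation_contexts(order):
    """For each step t, pair the target position order[t] with the positions
    already visible to it, i.e. order[:t], returned in sorted order."""
    return [(pos, sorted(order[:t])) for t, pos in enumerate(order)]


if __name__ == "__main__":
    tokens = ["New", "York", "is", "a", "city"]
    order = [2, 0, 4, 1, 3]  # one fixed order, chosen for illustration
    for target, context in permutation_contexts(order):
        visible = [tokens[i] for i in context]
        print(f"predict {tokens[target]!r} given {visible}")
    # A random order would be drawn like this:
    print(sample_order(len(tokens), random.Random(0)))
```

With the fixed order above, "York" (position 1) is predicted from positions 0, 2 and 4, which lie on both sides of it. That is the sense in which an autoregressive factorization can still capture bidirectional context.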

Networking · Base · Transfer learning · MoDELS · Feedforward network

April 20, 2018

CoNet: Collaborative Cross Networks for Cross-Domain Recommendation

Guangneng Hu,Yu Zhang,Qiang Yang

The cross-domain recommendation technique is an effective way of alleviating the data sparsity in recommender systems by leveraging the knowledge from relevant domains. Transfer learning is a class of algorithms underlying these techniques. In this paper, we propose a novel transfer learning approach for cross-domain recommendation by using neural networks as the base model. We assume that hidden layers in two base networks are connected by cross mappings, leading to the collaborative cross networks (CoNet). CoNet enables dual knowledge transfer across domains by introducing cross connections from one base network to another and vice versa. CoNet is achieved in multi-layer feedforward networks by adding dual connections and joint loss functions, which can be trained efficiently by back-propagation. The proposed model is evaluated on two real-world datasets and it outperforms baseline models by relative improvements of 3.56% in MRR and 8.94% in NDCG, respectively.