
Simultaneous localization and mapping (SLAM) algorithms are essential for the autonomous navigation of mobile robots. With the increasing demand for autonomous systems, it is crucial to evaluate and compare the performance of these algorithms in real-world environments. In this paper, we provide an evaluation strategy and real-world datasets to test and evaluate SLAM algorithms in complex and challenging indoor environments. Further, we analyzed state-of-the-art (SOTA) SLAM algorithms using various metrics, such as absolute trajectory error, scale drift, and map accuracy and consistency. Our results demonstrate that SOTA SLAM algorithms often fail in challenging environments containing dynamic objects and transparent or reflective surfaces. We also found that successful loop closures have a significant impact on an algorithm's performance. These findings highlight the need for further research to improve the robustness of these algorithms in real-world scenarios.
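Absolute trajectory error is typically reported as the RMSE of position differences after aligning the estimated trajectory to ground truth; a minimal sketch (assuming time-synchronized 3-D position sequences and a Umeyama-style similarity alignment, which also exposes scale drift) might look like:

```python
import numpy as np

def absolute_trajectory_error(est, gt):
    """RMSE of positions after similarity (Sim(3)) alignment.

    est, gt: (N, 3) arrays of estimated / ground-truth positions,
    assumed time-synchronized. Alignment follows Umeyama (1991).
    Returns the RMSE and the recovered scale (scale != 1 indicates drift).
    """
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    E, G = est - mu_e, gt - mu_g
    # Optimal rotation (and scale) from the SVD of the cross-covariance.
    U, S, Vt = np.linalg.svd(G.T @ E / len(est))
    D = np.eye(3)
    if np.linalg.det(U @ Vt) < 0:        # guard against reflections
        D[2, 2] = -1.0
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / E.var(axis=0).sum()
    aligned = s * (R @ E.T).T + mu_g
    rmse = np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1)))
    return rmse, s
```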

Related Content

Simultaneous localization and mapping (SLAM) is a technique that enables devices such as robots and autonomous vehicles to build a map of an unknown environment (i.e., without prior knowledge), or to update a map of a known environment (i.e., with prior knowledge of that map), while simultaneously keeping track of their current location within it.

Simultaneous localisation and mapping (SLAM) plays a vital role in autonomous robotics. Robotic platforms are often resource-constrained, and this limitation motivates resource-efficient SLAM implementations. While sparse visual SLAM algorithms offer good accuracy for modest hardware requirements, even these more scalable sparse approaches face limitations when applied to large-scale and long-term scenarios. A contributing factor is that the point clouds resulting from SLAM are inefficient to use and contain significant redundancy. This paper proposes the use of subset selection algorithms to reduce the map produced by sparse visual SLAM algorithms. Information-theoretic techniques have been applied to simpler related problems before, but they do not scale to the full visual SLAM problem. This paper proposes a number of novel information-theoretic utility functions for map point selection and optimises these functions using greedy algorithms. The reduced maps are evaluated using practical data alongside an existing visual SLAM implementation (ORB-SLAM 2). The approximate selection techniques proposed in this paper achieve trajectory accuracy comparable to an offline baseline while being suitable for online use. These techniques enable the practical reduction of maps for visual SLAM with competitive trajectory accuracy. The results also demonstrate that SLAM front-end performance can significantly affect the performance of map point selection, which shows the importance of testing map point selection together with a front-end implementation. To exploit this, the paper proposes an approach that includes a model of the front-end in the utility function when additional information is available. This approach outperforms alternatives on applicable datasets and highlights future research directions.
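The greedy pattern the paper relies on can be illustrated with a simple coverage-style surrogate utility (a stand-in; the paper's actual information-theoretic utility functions are not reproduced here): repeatedly pick the map point with the largest marginal gain until a budget is met.

```python
import numpy as np

def greedy_map_reduction(visibility, budget):
    """Greedily select map points under a coverage-style surrogate utility.

    visibility: (num_points, num_keyframes) boolean matrix; entry (i, j)
    is True if point i is observed in keyframe j. This surrogate rewards
    covering keyframe observations and only illustrates the greedy
    optimisation; it is not the paper's utility function.
    """
    vis = visibility.astype(float)
    covered = np.zeros(vis.shape[1])
    selected = []
    for _ in range(budget):
        # Marginal gain: observations each candidate would newly cover.
        gains = np.maximum(vis - covered, 0).sum(axis=1)
        if selected:
            gains[selected] = -1.0       # never re-pick a point
        best = int(np.argmax(gains))
        if gains[best] <= 0:             # no remaining gain: stop early
            break
        selected.append(best)
        covered = np.maximum(covered, vis[best])
    return selected
```

For monotone submodular utilities, this greedy loop carries the classic (1 - 1/e) approximation guarantee, which is why greedy selection is a natural fit here.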

Inertia drift is an aggressive transitional driving maneuver, which is challenging due to the high nonlinearity of the system and the stringent requirements on control and planning performance. This paper presents a solution for the consecutive inertia drift of an autonomous RC car based on primitive-based planning and data-driven control. The planner generates complex paths via the concatenation of path segments called primitives, and the controller eases the burden on feedback by interpolating multiple real trajectories with different initial conditions into one near-feasible reference trajectory. The proposed strategy is capable of drifting through various paths containing consecutive turns, which is validated in both simulation and reality.
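The core planning idea, chaining primitives into one continuous path, can be sketched as follows (illustrative 2-D geometry only; the paper's drift primitives also carry velocity and sideslip states):

```python
import numpy as np

def concatenate_primitives(primitives):
    """Chain 2-D path primitives into one continuous path.

    Each primitive is an (N, 3) array of (x, y, heading) states that
    starts at the origin with zero heading; every segment is rotated and
    translated so it begins at the pose where the previous one ended.
    """
    path = [np.zeros(3)]
    for prim in primitives:
        x0, y0, th0 = path[-1]
        c, s = np.cos(th0), np.sin(th0)
        R = np.array([[c, -s], [s, c]])
        xy = prim[:, :2] @ R.T + np.array([x0, y0])
        th = prim[:, 2] + th0
        path.extend(np.column_stack([xy, th])[1:])  # drop duplicated start pose
    return np.array(path)
```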

Algorithms for autonomous navigation in environments without Global Navigation Satellite System (GNSS) coverage mainly rely on onboard perception systems. These systems commonly incorporate sensors such as cameras and Light Detection and Ranging (LiDAR) sensors, whose performance may degrade in the presence of aerosol particles. There is therefore a need to fuse the data acquired from these sensors with data from Radio Detection and Ranging (RADAR) sensors, which can penetrate such particles; overall, this improves the performance of localization and collision-avoidance algorithms under such environmental conditions. This paper introduces a multimodal dataset from a harsh and unstructured underground environment with aerosol particles. A detailed description of the onboard sensors and of the environment where the dataset was collected is presented to enable a full evaluation of the acquired data. Furthermore, the dataset contains synchronized raw data measurements from all onboard sensors in Robot Operating System (ROS) format to facilitate the evaluation of navigation and localization algorithms in such environments. In contrast to existing datasets, the focus of this paper is not only to capture both temporal and spatial data diversity but also to present the impact of harsh conditions on the captured data. To validate the dataset, a preliminary comparison of odometry from the onboard LiDARs is presented.
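Because the measurements ship in ROS format, they can be iterated with the standard rosbag Python API (ROS 1); a minimal sketch follows, with placeholder bag and topic names since the actual ones are defined by the dataset:

```python
import rosbag  # ROS 1 Python API

# "underground_run.bag" and the topic names are placeholders for this
# sketch; substitute the names documented with the dataset.
with rosbag.Bag("underground_run.bag") as bag:
    for topic, msg, stamp in bag.read_messages(
            topics=["/lidar/points", "/radar/scan", "/camera/image_raw"]):
        # Each message is time-stamped, so LiDAR, RADAR, and camera
        # streams can be aligned for odometry or degradation analysis.
        print(stamp.to_sec(), topic, type(msg).__name__)
```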

Knitted sensors frequently suffer from inconsistencies due to inherent effects such as offset, relaxation, and drift. These properties, in combination, make it challenging to reliably map from sensor data to physical actuation. In this paper, we demonstrate a method for counteracting these effects by processing the signal with a minimal artificial neural network (ANN) combined with straightforward pre-processing. We apply a number of exponential smoothing filters to a re-sampled sensor signal to produce features that preserve different levels of historical sensor data and, in combination, adequately represent the state of previous sensor actuation. By training a three-layer ANN with a total of 8 neurons, we significantly improve the mapping between sensor readings and actuation force. Our findings also show that the technique transfers to sensors of reasonably different composition in terms of material and structure, and that it can furthermore be applied to related physical quantities such as strain.
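The pipeline is compact enough to sketch end to end; the smoothing factors and hidden-layer sizes below are illustrative assumptions, not the paper's exact values:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def smoothing_features(signal, alphas=(0.5, 0.1, 0.02)):
    """Stack exponentially smoothed copies of a re-sampled sensor signal.

    Each smoothing factor retains a different amount of history, giving
    the network a compact summary of past actuation alongside the raw
    reading. The alphas here are illustrative choices.
    """
    feats = [signal]
    for a in alphas:
        sm = np.empty_like(signal)
        sm[0] = signal[0]
        for i in range(1, len(signal)):
            sm[i] = a * signal[i] + (1 - a) * sm[i - 1]
        feats.append(sm)
    return np.column_stack(feats)

# Tiny regressor in the spirit of the paper's 8-neuron, three-layer ANN
# (5 + 3 hidden neurons here; the exact split is an assumption).
X = smoothing_features(np.random.rand(1000))
y = np.random.rand(1000)                     # stand-in for measured force
model = MLPRegressor(hidden_layer_sizes=(5, 3), max_iter=2000).fit(X, y)
```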

Understanding visually rich business documents to extract structured data and automate business workflows has been receiving attention in both academia and industry. Although recent multi-modal language models have achieved impressive results, we find that existing benchmarks do not reflect the complexity of real documents seen in industry. In this work, we identify the desiderata for a more comprehensive benchmark and propose one we call Visually Rich Document Understanding (VRDU). VRDU contains two datasets that represent several challenges: rich schemas including diverse data types as well as hierarchical entities, complex templates including tables and multi-column layouts, and diverse layouts (templates) within a single document type. We design few-shot and conventional experiment settings along with a carefully designed matching algorithm to evaluate extraction results. We report the performance of strong baselines and offer three observations: (1) generalizing to new document templates is still very challenging, (2) few-shot performance has a lot of headroom, and (3) models struggle with hierarchical fields such as line items in an invoice. We plan to open-source the benchmark and the evaluation toolkit. We hope this helps the community make progress on these challenging tasks of extracting structured data from visually rich documents.
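The evaluation hinges on matching predicted field-value pairs against gold annotations; a simplified stand-in (exact string match, micro-F1) conveys the shape of such a metric, while VRDU's actual matcher additionally normalizes types such as dates and prices and handles hierarchical entities:

```python
from collections import Counter

def extraction_f1(predicted, gold):
    """Micro-F1 over (field, value) pairs via exact string match.

    predicted / gold: lists of (field_name, value) tuples. This is a
    simplified illustration, not VRDU's actual matching algorithm.
    """
    pred, ref = Counter(predicted), Counter(gold)
    tp = sum((pred & ref).values())          # multiset intersection
    precision = tp / max(sum(pred.values()), 1)
    recall = tp / max(sum(ref.values()), 1)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```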

Decision-making algorithms are being used in important decisions, such as deciding who should be enrolled in health care programs and who should be hired. Even though these systems are currently deployed in high-stakes scenarios, many of them cannot explain their decisions. This limitation has prompted the Explainable Artificial Intelligence (XAI) initiative, which aims to make algorithms explainable in order to comply with legal requirements, promote trust, and maintain accountability. This paper questions whether, and to what extent, explainability can help solve the responsibility issues posed by autonomous AI systems. We suggest that XAI systems providing post-hoc explanations could come to be seen as blameworthy agents, obscuring the responsibility of developers in the decision-making process. Furthermore, we argue that XAI could result in incorrect attributions of responsibility to vulnerable stakeholders, such as those who are subjected to algorithmic decisions (e.g., patients), due to a misguided perception that they have control over explainable algorithms. This conflict between explainability and accountability can be exacerbated if designers choose to use algorithms and patients as moral and legal scapegoats. We conclude with a set of recommendations for how to approach this tension in the socio-technical process of algorithmic decision-making, and a defense of hard regulation to prevent designers from escaping responsibility.

Despite the recent progress in deep learning, most approaches still adopt a silo-like solution, focusing on learning each task in isolation: training a separate neural network for each individual task. Many real-world problems, however, call for a multi-modal approach and, therefore, for multi-tasking models. Multi-task learning (MTL) aims to leverage useful information across tasks to improve the generalization capability of a model. This thesis is concerned with multi-task learning in the context of computer vision. First, we review existing approaches for MTL. Next, we propose several methods that tackle important aspects of multi-task learning. The proposed methods are evaluated on various benchmarks. The results show several advances in the state of the art of multi-task learning. Finally, we discuss several possibilities for future work.
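The basic pattern most MTL architectures build on is hard parameter sharing: one shared encoder feeding one head per task. A minimal PyTorch sketch, with illustrative sizes (the thesis covers far richer designs, e.g., task interactions at multiple scales):

```python
import torch
import torch.nn as nn

class MultiTaskNet(nn.Module):
    """Hard parameter sharing: one shared encoder, one head per task."""

    def __init__(self, in_dim=128, hidden=64, task_dims=(10, 1)):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(hidden, d) for d in task_dims)

    def forward(self, x):
        z = self.encoder(x)                  # features shared across tasks
        return [head(z) for head in self.heads]

outputs = MultiTaskNet()(torch.randn(4, 128))  # one output tensor per task
```

Training then minimizes a (possibly weighted) sum of the per-task losses, which is where gradients from all tasks shape the shared features.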

Causality can be described in terms of a structural causal model (SCM) that carries information on the variables of interest and their mechanistic relations. For most processes of interest the underlying SCM is only partially observable, so causal inference tries to leverage any exposed information. Graph neural networks (GNNs), as universal approximators on structured input, are a viable candidate for causal learning, suggesting a tighter integration with SCMs. To this effect we present a theoretical analysis from first principles that establishes a novel connection between GNNs and SCMs while providing an extended view on general neural-causal models. We then establish a new model class for GNN-based causal inference that is necessary and sufficient for causal effect identification. Our empirical illustration on simulations and standard benchmarks validates our theoretical proofs.
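The structural analogy at the heart of this connection is that a message-passing step updates each node from its graph parents, just as each SCM variable is a function of its causal parents. A minimal, self-contained sketch of one such step (weights are arbitrary, for illustration):

```python
import numpy as np

def gnn_layer(H, A, W_self, W_nbr):
    """One message-passing step on node features H with adjacency A.

    Each node aggregates messages from its in-neighbours (its "parents")
    and combines them with its own features, mirroring how an SCM
    computes each variable from its causal parents.
    """
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1)  # avoid divide-by-zero
    messages = (A @ H) / deg                           # mean over parents
    return np.tanh(H @ W_self + messages @ W_nbr)

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))                  # 5 nodes, 8 features each
A = (rng.random((5, 5)) < 0.3).astype(float)  # random directed graph
H_next = gnn_layer(H, A, rng.normal(size=(8, 8)), rng.normal(size=(8, 8)))
```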

Detection and recognition of text in natural images are two main problems in the field of computer vision, with a wide variety of applications in the analysis of sports videos, autonomous driving, and industrial automation, to name a few. They share challenging difficulties arising from how text is represented and from the environmental conditions that affect it. The current state-of-the-art scene text detection and/or recognition methods have exploited recent advances in deep learning architectures and report superior accuracy on benchmark datasets when tackling multi-resolution and multi-oriented text. However, several challenges remain for text in the wild that cause existing methods to underperform, because their models are unable to generalize to unseen data and because labeled data are insufficient. Thus, unlike previous surveys in this field, the objectives of this survey are as follows. First, it offers the reader not only a review of the recent advances in scene text detection and recognition, but also the results of extensive experiments using a unified evaluation framework that assesses pre-trained models of the selected methods on challenging cases and applies the same evaluation criteria to all of them. Second, it identifies several existing challenges for detecting or recognizing text in the wild, namely in-plane rotation, multi-oriented and multi-resolution text, perspective distortion, illumination reflection, partial occlusion, complex fonts, and special characters. Finally, the paper presents insight into potential research directions in this field to address some of the mentioned challenges, which still confront scene text detection and recognition techniques.

We present a monocular Simultaneous Localization and Mapping (SLAM) system that uses high-level object and plane landmarks in addition to points. The resulting map is denser, more compact, and more meaningful than that of point-only SLAM. We first propose a high-order graphical model to jointly infer 3D objects and layout planes from a single image, considering occlusions and semantic constraints. The extracted cuboid objects and layout planes are further optimized in a unified SLAM framework. Compared to points, objects and planes provide more semantic constraints, such as Manhattan-world and object-supporting relationships. Experiments on various public and collected datasets, including ICL NUIM and TUM mono, show that our algorithm can improve camera localization accuracy compared to state-of-the-art SLAM and also generate dense maps in many structured environments.
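One way such a semantic constraint can enter the optimization is as a residual penalizing plane normals that stray from the Manhattan frame; the sketch below is an illustrative formulation, not necessarily the paper's actual factor:

```python
import numpy as np

def manhattan_residual(normals, axes=np.eye(3)):
    """Penalty for plane normals deviating from Manhattan-frame axes.

    normals: (N, 3) unit plane normals; axes: 3x3 orthonormal Manhattan
    frame. Each normal is scored against its closest axis (up to sign),
    so the residual is zero when a plane is perfectly axis-aligned.
    """
    align = np.abs(normals @ axes.T)   # |cos angle| to each axis
    return 1.0 - align.max(axis=1)     # residual per plane, in [0, 1]
```

In a factor-graph back-end, terms like this would be summed with point reprojection and object-plane support residuals during bundle adjustment.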
