Autonomous parking (AP) is an emering technique to navigate an intelligent vehicle to a parking space without any human intervention. Existing AP methods based on mathematical optimization or machine learning may lead to potential failures due to either excessive execution time or lack of generalization. To fill this gap, this paper proposes an integrated constrained optimization and imitation learning (iCOIL) approach to achieve efficient and reliable AP. The iCOIL method has two candidate working modes, i.e., CO and IL, and adopts a hybrid scenario analysis (HSA) model to determine the better mode under various scenarios. We implement and verify iCOIL on the Macao Car Racing Metaverse (MoCAM) platform. Results show that iCOIL properly adapts to different scenarios during the entire AP procedure, and achieves significantly larger success rates than other benchmarks.
Whereas dedicated scene representations are required for each different task in conventional robotic systems, this paper demonstrates that a unified representation can be used directly for multiple key tasks. We propose the Log-Gaussian Process Implicit Surface for Mapping, Odometry and Planning (Log-GPIS-MOP): a probabilistic framework for surface reconstruction, localisation and navigation based on a unified representation. Our framework applies a logarithmic transformation to a Gaussian Process Implicit Surface (GPIS) formulation to recover a global representation that accurately captures the Euclidean distance field with gradients and, at the same time, the implicit surface. By directly estimating the distance field and its gradient through Log-GPIS inference, the proposed incremental odometry technique computes the optimal alignment of an incoming frame and fuses it globally to produce a map. Concurrently, an optimisation-based planner computes a safe collision-free path using the same Log-GPIS surface representation. We validate the proposed framework on simulated and real datasets in 2D and 3D and benchmark against the state-of-the-art approaches. Our experiments show that Log-GPIS-MOP produces competitive results in sequential odometry, surface mapping and obstacle avoidance.
Autonomous driving technology has five levels, from L0 to L5. Currently, only the L2 level (partial automation) can be achieved, and there is a long way to go before reaching the final level of L5 (full automation). The key to crossing these levels lies in training the autonomous driving model. However, relying solely on real-world road data to train the model is far from enough and consumes a great deal of resources. Although there are already examples of training autonomous driving models through simulators that simulate real-world scenarios, these scenarios require complete manual construction. Directly converting 3D scenes from road network formats will lack a large amount of detail and cannot be used as training sets. Underground parking garage static scenario simulation is regarded as a procedural content generation (PCG) problem. This paper will use the Sarsa algorithm to solve procedural content generation on underground garage structures.
Developers interrupting their participation in a project might slowly forget critical information about the code, such as its intended purpose, structure, the impact of external dependencies, and the approach used for implementation. Forgetting the implementation details can have detrimental effects on software maintenance, comprehension, knowledge sharing, and developer productivity, resulting in bugs, and other issues that can negatively influence the software development process. Therefore, it is crucial to ensure that developers have a clear understanding of the codebase and can work efficiently and effectively even after long interruptions. This registered report proposes an empirical study aimed at investigating the impact of the developer's activity breaks duration and different code quality properties. In particular, we aim at understanding if the amount of activity in a project impact the code quality, and if developers with different activity profiles show different impacts on code quality. The results might be useful to understand if it is beneficial to promote the practice of developing multiple projects in parallel, or if it is more beneficial to reduce the number of projects each developer contributes.
Physics-informed neural networks (PINNs) are a newly emerging research frontier in machine learning, which incorporate certain physical laws that govern a given data set, e.g., those described by partial differential equations (PDEs), into the training of the neural network (NN) based on such a data set. In PINNs, the NN acts as the solution approximator for the PDE while the PDE acts as the prior knowledge to guide the NN training, leading to the desired generalization performance of the NN when facing the limited availability of training data. However, training PINNs is a non-trivial task largely due to the complexity of the loss composed of both NN and physical law parts. In this work, we propose a new PINN training framework based on the multi-task optimization (MTO) paradigm. Under this framework, multiple auxiliary tasks are created and solved together with the given (main) task, where the useful knowledge from solving one task is transferred in an adaptive mode to assist in solving some other tasks, aiming to uplift the performance of solving the main task. We implement the proposed framework and apply it to train the PINN for addressing the traffic density prediction problem. Experimental results demonstrate that our proposed training framework leads to significant performance improvement in comparison to the traditional way of training the PINN.
The majority of human detection methods rely on the sensor using visible lights (e.g., RGB cameras) but such sensors are limited in scenarios with degraded vision conditions. In this paper, we present a multimodal human detection system that combines portable thermal cameras and single-chip mmWave radars. To mitigate the noisy detection features caused by the low contrast of thermal cameras and the multi-path noise of radar point clouds, we propose a Bayesian feature extractor and a novel uncertainty-guided fusion method that surpasses a variety of competing methods, either single-modal or multi-modal. We evaluate the proposed method on real-world data collection and demonstrate that our approach outperforms the state-of-the-art methods by a large margin.
Autonomous racing is a research field gaining large popularity, as it pushes autonomous driving algorithms to their limits and serves as a catalyst for general autonomous driving. For scaled autonomous racing platforms, the computational constraint and complexity often limit the use of Model Predictive Control (MPC). As a consequence, geometric controllers are the most frequently deployed controllers. They prove to be performant while yielding implementation and operational simplicity. Yet, they inherently lack the incorporation of model dynamics, thus limiting the race car to a velocity domain where tire slip can be neglected. This paper presents Model- and Acceleration-based Pursuit (MAP) a high-performance model-based trajectory tracking algorithm that preserves the simplicity of geometric approaches while leveraging tire dynamics. The proposed algorithm allows accurate tracking of a trajectory at unprecedented velocities compared to State-of-the-Art (SotA) geometric controllers. The MAP controller is experimentally validated and outperforms the reference geometric controller four-fold in terms of lateral tracking error, yielding a tracking error of 0.055m at tested speeds up to 11m/s.
Choosing how to encode a real-world problem as a machine learning task is an important design decision in machine learning. The task of glacier calving front modeling has often been approached as a semantic segmentation task. Recent studies have shown that combining segmentation with edge detection can improve the accuracy of calving front detectors. Building on this observation, we completely rephrase the task as a contour tracing problem and propose a model for explicit contour detection that does not incorporate any dense predictions as intermediate steps. The proposed approach, called ``Charting Outlines by Recurrent Adaptation'' (COBRA), combines Convolutional Neural Networks (CNNs) for feature extraction and active contour models for the delineation. By training and evaluating on several large-scale datasets of Greenland's outlet glaciers, we show that this approach indeed outperforms the aforementioned methods based on segmentation and edge-detection. Finally, we demonstrate that explicit contour detection has benefits over pixel-wise methods when quantifying the models' prediction uncertainties. The project page containing the code and animated model predictions can be found at \url{//khdlr.github.io/COBRA/}.
The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches, and eventually to increase the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.
Autonomous driving has achieved a significant milestone in research and development over the last decade. There is increasing interest in the field as the deployment of self-operating vehicles on roads promises safer and more ecologically friendly transportation systems. With the rise of computationally powerful artificial intelligence (AI) techniques, autonomous vehicles can sense their environment with high precision, make safe real-time decisions, and operate more reliably without human interventions. However, intelligent decision-making in autonomous cars is not generally understandable by humans in the current state of the art, and such deficiency hinders this technology from being socially acceptable. Hence, aside from making safe real-time decisions, the AI systems of autonomous vehicles also need to explain how these decisions are constructed in order to be regulatory compliant across many jurisdictions. Our study sheds a comprehensive light on developing explainable artificial intelligence (XAI) approaches for autonomous vehicles. In particular, we make the following contributions. First, we provide a thorough overview of the present gaps with respect to explanations in the state-of-the-art autonomous vehicle industry. We then show the taxonomy of explanations and explanation receivers in this field. Thirdly, we propose a framework for an architecture of end-to-end autonomous driving systems and justify the role of XAI in both debugging and regulating such systems. Finally, as future research directions, we provide a field guide on XAI approaches for autonomous driving that can improve operational safety and transparency towards achieving public approval by regulators, manufacturers, and all engaged stakeholders.
Search in social networks such as Facebook poses different challenges than in classical web search: besides the query text, it is important to take into account the searcher's context to provide relevant results. Their social graph is an integral part of this context and is a unique aspect of Facebook search. While embedding-based retrieval (EBR) has been applied in eb search engines for years, Facebook search was still mainly based on a Boolean matching model. In this paper, we discuss the techniques for applying EBR to a Facebook Search system. We introduce the unified embedding framework developed to model semantic embeddings for personalized search, and the system to serve embedding-based retrieval in a typical search system based on an inverted index. We discuss various tricks and experiences on end-to-end optimization of the whole system, including ANN parameter tuning and full-stack optimization. Finally, we present our progress on two selected advanced topics about modeling. We evaluated EBR on verticals for Facebook Search with significant metrics gains observed in online A/B experiments. We believe this paper will provide useful insights and experiences to help people on developing embedding-based retrieval systems in search engines.