This paper presents a deep learning approach to aid dead-reckoning (DR) navigation using a limited sensor suite. A Recurrent Neural Network (RNN) was developed to predict the relative horizontal velocities of an Autonomous Underwater Vehicle (AUV) using data from an IMU, a pressure sensor, and control inputs. The RNN is trained on experimental data in which a Doppler velocity log (DVL) provided ground-truth velocities. The predicted relative velocities were fed into a dead-reckoning algorithm to approximate north and east positions. The studies in this paper were twofold: (I) experimental data from a Long-Range AUV were investigated, with datasets from a series of surveys in Monterey Bay, California (U.S.) used to train and test the RNN; (II) the second study explores datasets generated by a simulated autonomous underwater glider, where environmental variables, e.g., ocean currents, were implemented in the simulation to reflect real ocean conditions. The proposed neural network approach to DR navigation was compared to the on-board navigation system and to ground-truth simulated positions.
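As a rough illustration of how predicted relative velocities can feed a dead-reckoning step, the sketch below rotates body-frame surge/sway velocities into the north/east frame and integrates them over time. The function name, the fixed time step, and the assumption that the model outputs body-frame velocities while heading comes from the IMU are ours, not the paper's.

import numpy as np

def dead_reckon(v_body, headings, dt, start_north=0.0, start_east=0.0):
    """Integrate predicted body-frame horizontal velocities into north/east positions.

    v_body   : (T, 2) array of [surge, sway] velocities (m/s), e.g. RNN output
    headings : (T,) array of yaw angles in radians (e.g. from the IMU/AHRS)
    dt       : time step in seconds
    """
    north, east = start_north, start_east
    track = []
    for (u, v), psi in zip(v_body, headings):
        # Rotate body-frame velocity into the local north/east frame.
        v_north = u * np.cos(psi) - v * np.sin(psi)
        v_east = u * np.sin(psi) + v * np.cos(psi)
        north += v_north * dt
        east += v_east * dt
        track.append((north, east))
    return np.array(track)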
This paper presents a machine learning approach to model the electric consumption of electric vehicles at the macroscopic level, i.e., in the absence of a speed profile, while preserving microscopic-level accuracy. For this work, we leveraged a high-performance, agent-based transportation tool to model trips that occur in the Greater Chicago region under various scenario changes, along with physics-based modeling and simulation tools to provide high-fidelity energy consumption values. The generated results constitute a very large dataset of vehicle-route energy outcomes that captures variability in vehicle and routing settings, and in which the high-fidelity time series of vehicle speed dynamics are masked. We show that although all internal dynamics that affect energy consumption are masked, it is possible to learn aggregate-level energy consumption values quite accurately with a deep learning approach. When large-scale data are available, and with carefully tailored feature engineering, a well-designed model can recover this latent information. This model has been deployed and integrated within the POLARIS Transportation System Simulation Tool to support real-time behavioral transportation models for individual charging decision-making and rerouting of electric vehicles.
Local or reactive navigation is essential for autonomous mobile robots operating in indoor environments. Techniques such as SLAM and computer vision require significant computational power, which increases cost, while rudimentary methods make the robot susceptible to inconsistent behavior. This paper aims to develop a robot that balances cost and accuracy by using machine learning to predict the best obstacle-avoidance move based on distance inputs from four ultrasonic sensors strategically mounted on the front, front-left, front-right, and back of the robot. The underlying hardware consists of an Arduino Uno and a Raspberry Pi 3B. The machine learning model is first trained on data collected by the robot. The Arduino then continuously polls the sensors and calculates the distance values, and when avoidance is critical, the Arduino performs a suitable maneuver directly. In other scenarios, sensor data is sent to the Raspberry Pi over a USB connection, and the machine learning model generates the best move for navigation, which is sent back to the Arduino to drive the motors accordingly. The system is mounted on a 2-WD robot chassis and was tested in a cluttered indoor setting with impressive results.
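A minimal sketch of the Raspberry Pi side of such a pipeline is given below: it reads the four ultrasonic distances from the Arduino over USB serial and returns a predicted maneuver. The serial port name, message format, label set, and use of a scikit-learn classifier are assumptions for illustration, not details from the paper.

# Hypothetical Raspberry Pi-side loop (assumed port, message format, and labels).
import serial
import joblib

model = joblib.load("avoidance_model.pkl")       # e.g. a trained scikit-learn classifier
MOVES = ["forward", "left", "right", "reverse"]  # assumed label encoding

with serial.Serial("/dev/ttyACM0", 115200, timeout=1) as link:
    while True:
        line = link.readline().decode().strip()   # e.g. "34.2,120.0,55.1,80.3"
        if not line:
            continue
        distances = [float(x) for x in line.split(",")]   # front, front-left, front-right, back
        move = MOVES[int(model.predict([distances])[0])]
        link.write((move + "\n").encode())        # Arduino drives the motors accordingly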
There is a growing demand for mobile robots to operate in more variable environments, where guaranteeing safe robot navigation is a priority, in addition to time performance. To achieve this, current solutions for local planning use a specific configuration tuned to the characteristics of the application environment. In this paper, we present an approach for developing quality models that can be used by a self-adaptation framework to adapt the local planner configuration at run-time based on the perceived environment. We contribute a definition of a safety model that predicts the safety of a navigation configuration given the perceived environment. Experiments have been performed in a realistic navigation scenario for a retail application to validate the obtained models and demonstrate their integration in a self-adaptation framework.
Vision-and-Language Navigation (VLN) is a task where an agent navigates in an embodied indoor environment under human instructions. Previous works ignore the distribution of sample difficulty, and we argue that this potentially degrades their agents' performance. To tackle this issue, we propose a novel curriculum-based training paradigm for VLN tasks that can balance human prior knowledge and agent learning progress about training samples. We develop the principle of curriculum design and re-arrange the benchmark Room-to-Room (R2R) dataset to make it suitable for curriculum training. Experiments show that our method is model-agnostic and can significantly improve the performance, the generalizability, and the training efficiency of current state-of-the-art navigation agents without increasing model complexity.
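To make the curriculum idea concrete, the sketch below orders training samples by an assumed per-sample difficulty score and gradually widens the pool the agent sees as training progresses. The linear pacing function and the notion of difficulty are illustrative placeholders, not the paper's exact curriculum design.

# Minimal curriculum-sampling sketch: the training pool grows with the epoch.
def curriculum_pool(samples, difficulty, epoch, total_epochs, start_frac=0.25):
    """Return the subset of samples visible to the agent at this epoch,
    easiest samples first (assumed linear pacing schedule)."""
    ranked = [s for _, s in sorted(zip(difficulty, samples), key=lambda p: p[0])]
    frac = min(1.0, start_frac + (1.0 - start_frac) * epoch / total_epochs)
    return ranked[: max(1, int(frac * len(ranked)))]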
Breakthroughs in machine learning in the last decade have led to `digital intelligence', i.e. machine learning models capable of learning from vast amounts of labeled data to perform several digital tasks such as speech recognition, face recognition, machine translation and so on. The goal of this thesis is to make progress towards designing algorithms capable of `physical intelligence', i.e. building intelligent autonomous navigation agents capable of learning to perform complex navigation tasks in the physical world involving visual perception, natural language understanding, reasoning, planning, and sequential decision making. Despite several advances in classical navigation methods in the last few decades, current navigation agents struggle at long-term semantic navigation tasks. In the first part of the thesis, we discuss our work on short-term navigation using end-to-end reinforcement learning to tackle challenges such as obstacle avoidance, semantic perception, language grounding, and reasoning. In the second part, we present a new class of navigation methods based on modular learning and structured explicit map representations, which leverage the strengths of both classical and end-to-end learning methods, to tackle long-term navigation tasks. We show that these methods are able to effectively tackle challenges such as localization, mapping, long-term planning, exploration and learning semantic priors. These modular learning methods are capable of long-term spatial and semantic understanding and achieve state-of-the-art results on various navigation tasks.
Localizing the camera in a known indoor environment is a key building block for scene mapping, robot navigation, AR, etc. Recent advances estimate the camera pose via optimization over the 2D/3D-3D correspondences established between the coordinates in 2D/3D camera space and 3D world space. Such a mapping is estimated with either a convolutional neural network or a decision tree using only the static input image sequence, which makes these approaches vulnerable to dynamic indoor environments that are quite common yet challenging in the real world. To address the aforementioned issues, in this paper, we propose a novel outlier-aware neural tree which bridges the two worlds, deep learning and decision tree approaches. It builds on three important blocks: (a) a hierarchical space partition over the indoor scene to construct the decision tree; (b) a neural routing function, implemented as a deep classification network, employed for better 3D scene understanding; and (c) an outlier rejection module used to filter out dynamic points during the hierarchical routing process. Our proposed algorithm is evaluated on the RIO-10 benchmark developed for camera relocalization in dynamic indoor environments. It achieves robust neural routing through space partitions and outperforms the state-of-the-art approaches by around 30% on camera pose accuracy, while running comparably fast for evaluation.
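The sketch below illustrates one level of outlier-aware neural routing in PyTorch: a small classification network assigns a feature either to one of K spatial partitions (children of the current tree node) or to an extra "outlier" class that stops the descent, so dynamic points can be filtered before pose estimation. The network width and class layout are assumptions for illustration, not the paper's exact architecture.

import torch
import torch.nn as nn

class RoutingNode(nn.Module):
    def __init__(self, feat_dim, num_children):
        super().__init__()
        # The last logit is reserved for the outlier (dynamic point) class.
        self.classifier = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, num_children + 1),
        )

    def forward(self, feat):
        logits = self.classifier(feat)
        choice = logits.argmax(dim=-1)
        is_outlier = choice == logits.shape[-1] - 1
        # Non-outlier points continue down the chosen child partition;
        # outliers are dropped before camera pose optimization.
        return choice, is_outlier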
Since the DARPA Grand Challenges (rural) in 2004/05 and the Urban Challenge in 2007, autonomous driving has been the most active field of AI applications. Almost at the same time, deep learning made breakthroughs driven by several pioneers; three of them (also called the fathers of deep learning), Hinton, Bengio and LeCun, won the ACM Turing Award in 2019. This is a survey of autonomous driving technologies with deep learning methods. We investigate the major fields of self-driving systems, such as perception, mapping and localization, prediction, planning and control, simulation, V2X, and safety. Due to limited space, we focus the analysis on several key areas, i.e., 2D and 3D object detection in perception, depth estimation from cameras, multi-sensor fusion at the data, feature, and task levels respectively, and behavior modelling and prediction of vehicle driving and pedestrian trajectories.
Although deep reinforcement learning (deep RL) methods have many strengths that make them attractive for autonomous driving, real deep RL applications in autonomous driving have been slowed down by the modeling gap between the source (training) domain and the target (deployment) domain. Unlike current policy transfer approaches, which are generally limited to transferring uninterpretable neural network representations as the transferred features, we propose to transfer concrete kinematic quantities in autonomous driving. The proposed robust-control-based (RC) generic transfer architecture, which we call RL-RC, incorporates a transferable hierarchical RL trajectory planner and a robust tracking controller based on a disturbance observer (DOB). The deep RL policies trained with a known nominal dynamics model are transferred directly to the target domain, and DOB-based robust tracking control is applied to tackle the modeling gap, including vehicle dynamics errors and external disturbances such as side forces. We provide simulations validating the capability of the proposed method to achieve zero-shot transfer across multiple driving scenarios such as lane keeping, lane changing and obstacle avoidance.
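For intuition on the disturbance-observer component, a minimal sketch under assumed first-order nominal dynamics (x_dot = a*x + b*u + d) is shown below: the observer low-pass filters the residual between the measured state derivative and the nominal model, and the estimate d_hat can then be subtracted from the control command. The scalar model, gains, and discretization are illustrative assumptions, not the paper's vehicle model or controller.

# Minimal discrete-time disturbance observer (DOB) sketch with assumed dynamics.
class SimpleDOB:
    def __init__(self, a_nom, b_nom, cutoff_hz, dt):
        self.a, self.b, self.dt = a_nom, b_nom, dt
        self.alpha = dt / (dt + 1.0 / cutoff_hz)   # first-order low-pass filter gain
        self.d_hat, self.x_prev = 0.0, None

    def update(self, x, u):
        if self.x_prev is not None:
            x_dot = (x - self.x_prev) / self.dt
            d_raw = x_dot - (self.a * x + self.b * u)        # residual vs. nominal model
            self.d_hat += self.alpha * (d_raw - self.d_hat)  # filtered disturbance estimate
        self.x_prev = x
        return self.d_hat   # subtract from the command to compensate the disturbance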
As a new classification platform, deep learning has recently received increasing attention from researchers and has been successfully applied to many domains. In some domains, such as bioinformatics and robotics, it is very difficult to construct large-scale, well-annotated datasets due to the expense of data acquisition and costly annotation, which limits its development. Transfer learning relaxes the hypothesis that the training data must be independent and identically distributed (i.i.d.) with the test data, which motivates us to use transfer learning to solve the problem of insufficient training data. This survey focuses on reviewing current research on transfer learning using deep neural networks and its applications. We define deep transfer learning, categorize it, and review recent research works based on the techniques used in deep transfer learning.
Inferring other agents' mental states such as their knowledge, beliefs and intentions is thought to be essential for effective interactions with other agents. Recently, multiagent systems trained via deep reinforcement learning have been shown to succeed in solving different tasks, but it remains unclear how each agent models or represents other agents in its environment. In this work we test whether deep reinforcement learning agents explicitly represent other agents' intentions (their specific aims or goals) during a task in which the agents had to coordinate the covering of different spots in a 2D environment. In particular, we tracked over time the performance of a linear decoder trained to predict the final goal of all agents from the hidden state of each agent's neural network controller. We observed that the hidden layers of agents represented explicit information about other agents' goals, i.e. the target landmark they ended up covering. We also performed a series of experiments, in which some agents were replaced by others with fixed goals, to test the level of generalization of the trained agents. We noticed that during the training phase the agents developed a differential preference for each goal, which hindered generalization. To alleviate the above problem, we propose simple changes to the MADDPG training algorithm which lead to better generalization against unseen agents. We believe that training protocols promoting more active intention-reading mechanisms, e.g. by preventing simple symmetry-breaking solutions, are a promising direction towards achieving a more robust generalization in different cooperative and competitive tasks.
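The probing idea can be sketched as fitting a simple linear decoder from one agent's hidden state to another agent's final goal (the landmark it ends up covering) and reporting held-out decoding accuracy; above-chance accuracy suggests the goal is explicitly represented. Array names, shapes, and the use of logistic regression as the linear decoder are assumptions for illustration.

import numpy as np
from sklearn.linear_model import LogisticRegression

def probe_goal_decoding(hidden_states, final_goals):
    """
    hidden_states : (n_episodes, hidden_dim) hidden activations of one agent's
                    controller at a given time step
    final_goals   : (n_episodes,) index of the landmark another agent ends up covering
    Returns held-out decoding accuracy (chance level is roughly 1 / n_landmarks).
    """
    split = int(0.8 * len(hidden_states))
    decoder = LogisticRegression(max_iter=1000)
    decoder.fit(hidden_states[:split], final_goals[:split])
    return decoder.score(hidden_states[split:], final_goals[split:])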