This paper presents an integrated system for performing precision harvesting missions using a legged harvester. Our harvester performs the challenging task of autonomous navigation and tree grabbing in a confined, GPS-denied forest environment. Strategies for mapping, localization, planning, and control are proposed and integrated into a fully autonomous system. The mission starts with a human mapping the area of interest using a custom-made sensor module. Subsequently, a human expert selects the trees to be harvested. The sensor module is then mounted on the machine and used for localization within the given map. A planning algorithm searches for both an approach pose and a path in a single path planning problem. We design a path following controller that leverages the legged harvester's capabilities for negotiating rough terrain. Upon reaching the approach pose, the machine grabs a tree with a general-purpose gripper. This process repeats for all the trees selected by the operator. Our system has been tested on a field with tree trunks and in a natural forest. To the best of our knowledge, this is the first time this level of autonomy has been demonstrated on a full-size hydraulic machine operating in a realistic environment.
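To make the joint search concrete, here is a minimal sketch under assumptions (the paper's planner, costs, and clearances are not public): candidate approach poses are sampled on a circle of gripper reach around the target tree and scored with a shared path-cost estimate, so the pose and the path are selected in a single problem.

```python
# Illustrative sketch only: names, the gripper reach, and the straight-line
# cost proxy are assumptions, not the paper's planner.
import math

def candidate_approach_poses(tree_xy, reach=1.5, n=16):
    """Poses on a circle of gripper reach around the tree, facing it."""
    tx, ty = tree_xy
    for k in range(n):
        a = 2 * math.pi * k / n
        x, y = tx + reach * math.cos(a), ty + reach * math.sin(a)
        yaw = math.atan2(ty - y, tx - x)   # face the tree
        yield (x, y, yaw)

def collides(pose, obstacles, clearance=0.8):
    return any(math.dist(pose[:2], o) < clearance for o in obstacles)

def plan_grab(start_xy, tree_xy, obstacles):
    """Jointly select an approach pose and a (proxy) path cost."""
    best = None
    for pose in candidate_approach_poses(tree_xy):
        if collides(pose, obstacles):
            continue                          # unreachable approach pose
        cost = math.dist(start_xy, pose[:2])  # stand-in for a real path cost
        if best is None or cost < best[0]:
            best = (cost, pose)
    return best  # (cost, approach pose), or None if the tree is blocked

print(plan_grab((0.0, 0.0), (5.0, 5.0), obstacles=[(4.0, 4.0)]))
```

In the real system, the straight-line proxy would be replaced by the path planner's actual cost over rough terrain.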
Robots are being designed to communicate with people in various public and domestic venues in a helpful, discreet way. Here, we use a speculative approach to shine light on a new concept of robot steganography (RS): that a robot could seek to help vulnerable populations by discreetly warning of potential threats. We first identify some potentially useful scenarios for RS related to safety and security -- concerns that are estimated to cost the world trillions of dollars each year -- with a focus on two kinds of robots, an autonomous vehicle (AV) and a socially assistive humanoid robot (SAR). Next, we propose that existing, powerful, computer-based steganography (CS) approaches can be adopted with little effort in new contexts (SARs), while also pointing out potential benefits of human-like steganography (HS): although less efficient and robust than CS, HS represents a currently unused form of RS that could also be used to avoid requiring computers, or to evade detection by more technically advanced adversaries. This analysis also introduces some unique challenges of RS that arise from message generation, indirect perception, and effects of perspective. To this end, we explore some related theoretical and practical concerns for selecting carrier signals and generating messages, also making available some code and a video demo. Finally, we report on checking the current feasibility of the RS concept via a simplified user study, confirming that messages can be hidden in a robot's behaviors. The immediate implication is that RS could help to improve people's lives and mitigate some costly problems -- suggesting the usefulness of further discussion, ideation, and consideration by designers.
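As a purely hypothetical illustration of human-like steganography (HS) with a behavioral carrier signal, the sketch below hides a bit string in small, plausible variations of gesture durations; the timing values and encoding are assumptions, not the paper's released code or demo.

```python
# Hypothetical HS sketch: all timing values are illustrative assumptions.
BASE = 1.0      # seconds for an unremarkable gesture (assumed)
DELTA = 0.2     # small, plausible timing variation carrying one bit (assumed)

def encode(bits):
    """Map each bit to a gesture duration: 0 -> BASE, 1 -> BASE + DELTA."""
    return [BASE + DELTA * b for b in bits]

def decode(durations):
    """Recover bits from observed gesture durations."""
    return [1 if d > BASE + DELTA / 2 else 0 for d in durations]

msg = [1, 0, 1, 1, 0]              # covert warning payload
print(decode(encode(msg)) == msg)  # True: the message survives the channel
```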
This paper aims to improve the path quality and computational efficiency of kinodynamic planners used for vehicular systems. It proposes a learning framework for identifying promising controls during the expansion process of sampling-based motion planners for systems with dynamics. Offline, a model is trained to map the difference vector between the current state and a local goal state (i.e., a waypoint) to the highest-quality control that reaches the goal in the absence of obstacles. The data generation scheme provides bounds on the target dispersion and uses state space pruning to ensure high-quality controls. Because it focuses on the system's dynamics, this process is data efficient and takes place once per dynamical system, so it can be reused in different environments with modular expansion functions. This work integrates the proposed learning process with a) an exploratory expansion function that generates waypoints with biased coverage over the reachable space, and b) an exploitative expansion function for mobile robots that generates waypoints using medial axis information. This paper evaluates the learning process and the corresponding planners for first- and second-order differential drive systems. The results show that the proposed integration of learning and planning can produce better-quality paths than kinodynamic planning with random controls, in fewer iterations and less computation time.
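A minimal sketch of the offline/online split, assuming a first-order differential drive (unicycle) model and using a nearest-neighbor lookup as a stand-in for the learned regressor: controls are simulated once from the origin, indexed by the state difference they realize, and queried at expansion time.

```python
# Sketch under assumptions: the dataset and nearest-neighbour lookup stand in
# for the paper's trained model and data generation scheme.
import math, random

def simulate(v, w, dt=0.5):
    """Closed-form unicycle displacement from the origin for control (v, w)."""
    if abs(w) < 1e-6:
        return (v * dt, 0.0, 0.0)
    th = w * dt
    return (v / w * math.sin(th), v / w * (1 - math.cos(th)), th)

# Offline, once per system: sample controls, store (outcome -> control).
dataset = []
for _ in range(2000):
    v, w = random.uniform(0.0, 1.0), random.uniform(-1.0, 1.0)
    dataset.append((simulate(v, w), (v, w)))

def best_control(diff):
    """Stored control whose simulated outcome is nearest to the difference
    vector (mixing metres and radians is acceptable for a sketch)."""
    return min(dataset, key=lambda e: math.dist(e[0], diff))[1]

# Online: the expansion function asks for the control reaching a waypoint.
print(best_control((0.3, 0.1, 0.2)))
```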
The existence of real-world adversarial examples (commonly in the form of patches) poses a serious threat to the use of deep learning models in safety-critical computer vision tasks such as visual perception in autonomous driving. This paper presents an extensive evaluation of the robustness of semantic segmentation models when attacked with different types of adversarial patches, including digital, simulated, and physical ones. A novel loss function is proposed to improve the attacker's ability to induce pixel misclassifications. Also, a novel attack strategy is presented to improve the Expectation Over Transformation method for placing a patch in the scene. Finally, a state-of-the-art method for detecting adversarial patches is first extended to cope with semantic segmentation models, then improved to obtain real-time performance, and eventually evaluated in real-world scenarios. Experimental results reveal that, even though the adversarial effect is visible with both digital and real-world attacks, its impact is often spatially confined to areas of the image around the patch. This raises further questions about the spatial robustness of real-time semantic segmentation models.
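The core Expectation Over Transformation loop can be sketched as follows; the segmentation model, the plain cross-entropy loss, and the uniform placement distribution are placeholders (the paper's novel loss and placement strategy are not reproduced here).

```python
# Generic EOT-style patch optimisation sketch; `model` is an assumed semantic
# segmentation network mapping (1, 3, H, W) images to (1, C, H, W) logits.
import torch

def apply_patch(img, patch, x, y):
    out = img.clone()
    ph, pw = patch.shape[-2:]
    out[..., y:y + ph, x:x + pw] = patch
    return out

def eot_attack(model, imgs, target_cls, steps=100, size=32):
    patch = torch.rand(3, size, size, requires_grad=True)
    opt = torch.optim.Adam([patch], lr=0.01)
    for _ in range(steps):
        # Sample a transformation: here, a random image and placement.
        img = imgs[torch.randint(len(imgs), (1,)).item()]
        x = torch.randint(0, img.shape[-1] - size, (1,)).item()
        y = torch.randint(0, img.shape[-2] - size, (1,)).item()
        logits = model(apply_patch(img, patch, x, y).unsqueeze(0))
        # Push every pixel's prediction toward the attacker's target class.
        loss = torch.nn.functional.cross_entropy(
            logits, torch.full(logits.shape[-2:], target_cls).unsqueeze(0))
        opt.zero_grad(); loss.backward(); opt.step()
        patch.data.clamp_(0, 1)   # keep the patch a valid image
    return patch.detach()
```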
The manufacturing industry is currently witnessing a paradigm shift with the unprecedented adoption of industrial robots, and machine vision is a key perception technology that enables these robots to perform precise operations in unstructured environments. However, the sensitivity of conventional vision sensors to lighting conditions and high-speed motion limits the reliability and work rate of production lines. Neuromorphic vision is a recent technology with the potential to address these challenges thanks to its high temporal resolution, low latency, and wide dynamic range. In this paper, we propose, for the first time, a novel neuromorphic-vision-based controller for faster and more reliable machining operations, and present a complete robotic system capable of performing drilling tasks with sub-millimeter accuracy. Our proposed system localizes the target workpiece in 3D using two perception stages that we developed specifically for the asynchronous output of neuromorphic cameras. The first stage performs multi-view reconstruction for an initial estimate of the workpiece's pose, and the second stage refines this estimate for a local region of the workpiece using circular hole detection. The robot then precisely positions the drilling end-effector and drills the target holes on the workpiece using a combined position-based and image-based visual servoing approach. The proposed solution is validated experimentally for drilling nutplate holes on workpieces placed arbitrarily in an unstructured environment with uncontrolled lighting. Experimental results prove the effectiveness of our solution, with an average positional error of less than 0.1 mm, and demonstrate that the use of neuromorphic vision overcomes the lighting and speed limitations of conventional cameras.
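A minimal sketch of the combined servoing scheme, with gains, frames, and tolerances as assumptions: position-based visual servoing (PBVS) drives the end-effector toward the pose from multi-view reconstruction, and image-based visual servoing (IBVS) then nulls the residual pixel error of the detected circular hole.

```python
# Sketch only: gains, the switching tolerance, and the pixel-to-velocity
# mapping are illustrative assumptions, not the paper's controller.
import numpy as np

K_PBVS, K_IBVS = 0.5, 0.002   # proportional gains (assumed)

def pbvs_step(ee_pos, hole_pos_est):
    """Cartesian velocity toward the 3D pose from multi-view reconstruction."""
    return K_PBVS * (np.asarray(hole_pos_est) - np.asarray(ee_pos))

def ibvs_step(hole_px, target_px=(320, 240)):
    """Image-plane correction from the circle-detection stage."""
    err = np.asarray(target_px, float) - np.asarray(hole_px, float)
    return K_IBVS * np.array([err[0], err[1], 0.0])

def servo(ee_pos, hole_pos_est, hole_px, coarse_tol=0.01):
    """Switch from PBVS to IBVS once within coarse positioning tolerance."""
    coarse_err = np.linalg.norm(np.asarray(hole_pos_est) - np.asarray(ee_pos))
    return pbvs_step(ee_pos, hole_pos_est) if coarse_err > coarse_tol \
        else ibvs_step(hole_px)

print(servo([0.0, 0.0, 0.3], [0.02, 0.01, 0.3], (350, 250)))
```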
Several solutions have been proposed in the literature to address the Unmanned Aerial Vehicle (UAV) collision avoidance problem. Most of these solutions assume that the ground control station (GCS) determines the path of a UAV before the mission at hand starts. Furthermore, these solutions anticipate collisions based only on the GPS localization of UAVs and on object-detecting sensors placed on board the UAVs. The sensors' sensitivity to environmental disturbances, and the UAVs' influence on their accuracy, negatively impact the efficiency of these solutions. In this vein, this paper proposes a new energy- and delay-aware physical collision avoidance solution for UAVs, dubbed EDC-UAV. The primary goal of EDC-UAV is to build in-flight safe UAV trajectories while minimizing energy consumption and response time. We assume that each UAV is equipped with a global positioning system (GPS) sensor to identify its position, and we take into account the GPS margin of error when determining the position of a given UAV. The location of each UAV is gathered by a cluster head, which is the UAV with either the highest autonomy or the greatest computational capacity. The cluster head runs the EDC-UAV algorithm to control the rest of the UAVs, thus guaranteeing a collision-free mission while minimizing energy consumption and response time. The proper operation of our solution is validated through simulations, and the obtained results demonstrate the efficiency of EDC-UAV in achieving its design goals.
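A minimal sketch of the cluster head's safety check under assumptions (EDC-UAV itself is not reproduced here): pairwise separations of reported GPS positions are tested against a safety distance inflated by each receiver's margin of error.

```python
# Sketch: the safety distance and GPS error are illustrative assumptions.
import math

SAFE_DIST = 5.0   # required physical separation in metres (assumed)

def at_risk(positions, gps_error=2.5):
    """Return UAV index pairs whose worst-case separation is unsafe."""
    risky = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            # Both positions may be off by up to gps_error metres.
            worst_case = math.dist(positions[i], positions[j]) - 2 * gps_error
            if worst_case < SAFE_DIST:
                risky.append((i, j))
    return risky

fleet = [(0, 0, 50), (8, 0, 50), (40, 40, 60)]
print(at_risk(fleet))   # [(0, 1)]: the first two UAVs need new trajectories
```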
Autonomous driving is regarded as one of the most promising remedies to shield human beings from severe crashes. To this end, 3D object detection serves as a core component of such perception systems, supporting path planning, motion prediction, and collision avoidance. Generally, stereo or monocular images with corresponding 3D point clouds are the standard input layout for 3D object detection, among which point clouds are increasingly prevalent because of the accurate depth information they provide. Despite existing efforts, 3D object detection on point clouds is still in its infancy due to the inherent sparseness and irregularity of point clouds, the misalignment between the camera view and the LiDAR bird's-eye view that complicates modality synergies, occlusions and scale variations at long distances, and so on. Recently, profound progress has been made in 3D object detection, with a large body of literature investigating this vision task. As such, we present a comprehensive review of the latest progress in this field, covering all the main topics including sensors, fundamentals, and the recent state-of-the-art detection methods with their pros and cons. Furthermore, we introduce metrics and provide quantitative comparisons on popular public datasets. Avenues for future work are identified through an in-depth analysis of the surveyed works. Finally, we conclude the paper.
This paper presents a comprehensive survey on vision-based robotic grasping. We identify four key tasks in robotic grasping: object localization, pose estimation, grasp detection, and motion planning. In detail, object localization includes object detection and segmentation methods; pose estimation includes RGB-based and RGB-D-based methods; grasp detection includes traditional methods and deep learning-based methods; and motion planning includes analytical methods, imitation learning methods, and reinforcement learning methods. In addition, many methods accomplish some of these tasks jointly, such as object-detection-combined 6D pose estimation, grasp detection without pose estimation, end-to-end grasp detection, and end-to-end motion planning. These methods are reviewed in detail in this survey. Moreover, related datasets are summarized, and comparisons between state-of-the-art methods are given for each task. Open challenges in robotic grasping are presented, and future directions for addressing these challenges are also pointed out.
Safety and the reduction of road traffic accidents remain central concerns for autonomous driving. Statistics show that unintended lane departure is a leading cause of motor vehicle collisions worldwide, making lane detection one of the most important and challenging tasks in self-driving. Today, numerous groups are applying deep learning techniques to computer vision problems in order to solve self-driving tasks. In this paper, a Global Convolution Networks (GCN) model is used to address both the classification and localization issues of semantic lane segmentation. A color-based segmentation method is presented, and the usability of the model is evaluated. Residual-based boundary refinement and Adam optimization are also used to achieve state-of-the-art performance. Since ordinary cars cannot afford on-board GPUs, and the training for a particular road can be shared by several cars, we propose a framework to make the system work in the real world: we build a real-time video transfer system that streams video from the car, trains the model on a GPU-equipped edge server, and sends the trained model back to the car.
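As an illustration of the color-based segmentation step, the sketch below thresholds white and yellow lane markings in HSV space using OpenCV; the threshold values and file names are assumptions, not the paper's parameters.

```python
# Sketch only: HSV ranges are assumed values for typical white/yellow lanes.
import cv2

def lane_mask(bgr_frame):
    """Binary mask of white and yellow lane pixels in HSV colour space."""
    hsv = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HSV)
    white = cv2.inRange(hsv, (0, 0, 200), (180, 30, 255))
    yellow = cv2.inRange(hsv, (15, 80, 120), (35, 255, 255))
    return cv2.bitwise_or(white, yellow)

frame = cv2.imread("road.jpg")          # assumed sample dashcam frame
if frame is not None:
    mask = lane_mask(frame)
    cv2.imwrite("lane_mask.png", mask)  # segmentation fed to the GCN model
```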
3D vehicle detection and tracking from a monocular camera requires detecting and associating vehicles, and estimating their locations and extents together. It is challenging because vehicles are in constant motion and it is practically impossible to recover 3D positions from a single image alone. In this paper, we propose a novel framework that jointly detects and tracks 3D vehicle bounding boxes. Our approach leverages 3D pose estimation to learn 2D patch association over time and uses temporal information from tracking to obtain stable 3D estimates. Our method also leverages 3D box depth ordering and motion to link together the tracks of occluded objects. We train our system in realistic 3D virtual environments, collecting a new diverse, large-scale dataset with accurate and dense 3D trajectory annotations. Our experiments demonstrate that our method benefits from inferring 3D for both data association and tracking robustness, leveraging our dynamic 3D tracking dataset.
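A hedged sketch of how depth ordering can keep occluded tracks alive: a track whose 2D box is largely covered by a nearer object's box is marked occluded rather than terminated, so it can be re-linked later. The data structures and thresholds are illustrative, not the paper's implementation.

```python
# Sketch only: overlap threshold and track representation are assumptions.
def iou(a, b):  # boxes as (x1, y1, x2, y2)
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2]-a[0])*(a[3]-a[1]) + (b[2]-b[0])*(b[3]-b[1]) - inter
    return inter / union if union else 0.0

def mark_occlusions(tracks):
    """tracks: dicts with a '2d' box and a 'depth' (metres from camera)."""
    for t in tracks:
        t["occluded"] = any(
            o is not t and o["depth"] < t["depth"] and iou(t["2d"], o["2d"]) > 0.5
            for o in tracks)
    return tracks

cars = [{"2d": (100, 100, 200, 200), "depth": 10.0},
        {"2d": (120, 110, 210, 190), "depth": 6.0}]
print([c["occluded"] for c in mark_occlusions(cars)])  # [True, False]
```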
Querying graph structured data is a fundamental operation that enables important applications including knowledge graph search, social network analysis, and cyber-network security. However, the growing size of real-world data graphs poses severe challenges for graph databases to meet the response-time requirements of the applications. Planning the computational steps of query processing - Query Planning - is central to addressing these challenges. In this paper, we study the problem of learning to speed up query planning in graph databases, towards the goal of improving the computational efficiency of query processing via training queries. We present a Learning to Plan (L2P) framework that is applicable to a large class of query reasoners that follow the Threshold Algorithm (TA) approach. First, we define a generic search space over candidate query plans, and identify target search trajectories (query plans) corresponding to the training queries by performing an expensive search. Subsequently, we learn greedy search control knowledge to imitate the search behavior of the target query plans. We provide a concrete instantiation of our L2P framework for STAR, a state-of-the-art graph query reasoner. Our experiments on benchmark knowledge graphs including DBpedia, YAGO, and Freebase show that, using the query plans generated by the learned search control knowledge, we can significantly improve the speed of STAR with negligible loss in accuracy.
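A minimal sketch of greedy search control knowledge under assumptions (the features, weights, and actions are illustrative, not STAR's internals): a learned scoring function ranks candidate next steps, and a plan is built by repeatedly taking the best-scoring one, imitating the expensively found target plans.

```python
# Sketch only: a linear scorer stands in for the learned control knowledge.
import random

def features(state, action):
    """Toy features, e.g. estimated selectivity and cost of a candidate step."""
    return [action["selectivity"], action["cost"], len(state)]

WEIGHTS = [-2.0, -1.0, 0.1]   # would be learned by imitation in practice

def score(state, action):
    return sum(w * f for w, f in zip(WEIGHTS, features(state, action)))

def greedy_plan(actions, steps=3):
    """Build a plan by repeatedly taking the highest-scoring candidate step."""
    state, pool = [], list(actions)
    for _ in range(min(steps, len(pool))):
        best = max(pool, key=lambda a: score(state, a))
        state.append(best)
        pool.remove(best)
    return [a["name"] for a in state]

candidates = [{"name": f"op{i}", "selectivity": random.random(),
               "cost": random.random()} for i in range(5)]
print(greedy_plan(candidates))
```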