Motion planning methods like navigation functions and harmonic potential fields provide (almost) global convergence and are suitable for obstacle avoidance in dynamically changing environments due to their reactive nature. A common assumption in the control design is that the robot operates in a disjoint star world, i.e. all obstacles are strictly starshaped and mutually disjoint. However, in real-life scenarios obstacles may intersect due to expanded obstacle regions corresponding to robot radius or safety margins. To broaden the applicability of the aforementioned reactive motion planning methods, we propose a method to reshape a workspace of intersecting obstacles into a disjoint star world. The algorithm is based on two novel concepts presented here, namely admissible kernel and starshaped hull with specified kernel, which are closely related to the notion of starshaped hull. The use of the proposed method is illustrated with examples of a robot operating in a 2D workspace using a harmonic potential field approach in combination with the developed algorithm.
The Robot Operating System 2 (ROS 2) is the second generation of ROS, representing a step forward for the robotics framework. Several new types of nodes and executor models are integral to controlling where, how, and when information is processed in the computational graph. This paper explores and benchmarks one of these new node types -- the Component node -- which allows nodes to be composed manually or dynamically into processes while retaining separation of concerns in a codebase for distributed development. Composition is shown to achieve a high degree of performance optimization, particularly valuable for resource-constrained systems and sensor processing pipelines, enabling distributed tasks that would not otherwise be possible in ROS 2. In this work, we briefly introduce the significance and design of node composition, then present our benchmarking contribution, which analyzes its impact on robotic systems. Its compelling influence on performance is shown through several experiments on the latest Long Term Support (LTS) ROS 2 distribution, Humble Hawksbill.
We introduce RAMP, an open-source robotics benchmark inspired by real-world industrial assembly tasks. RAMP consists of beams that a robot must assemble into specified goal configurations using pegs as fasteners. As such it assesses planning and execution capabilities, and poses challenges in perception, reasoning, manipulation, diagnostics, fault recovery and goal parsing. RAMP has been designed to be accessible and extensible. Parts are either 3D printed or otherwise constructed from materials that are readily obtainable. The part design and detailed instructions are publicly available. In order to broaden community engagement, RAMP incorporates fixtures such as April Tags which enable researchers to focus on individual sub-tasks of the assembly challenge if desired. We provide a full digital twin as well as rudimentary baselines to enable rapid progress. Our vision is for RAMP to form the substrate for a community-driven endeavour that evolves as capability matures.
Understanding the causal relationships among the variables of a system is paramount to explain and control its behaviour. Inferring the causal graph from observational data without interventions, however, requires a lot of strong assumptions that are not always realistic. Even for domain experts it can be challenging to express the causal graph. Therefore, metrics that quantitatively assess the goodness of a causal graph provide helpful checks before using it in downstream tasks. Existing metrics provide an absolute number of inconsistencies between the graph and the observed data, and without a baseline, practitioners are left to answer the hard question of how many such inconsistencies are acceptable or expected. Here, we propose a novel consistency metric by constructing a surrogate baseline through node permutations. By comparing the number of inconsistencies with those on the surrogate baseline, we derive an interpretable metric that captures whether the DAG fits significantly better than random. Evaluating on both simulated and real data sets from various domains, including biology and cloud monitoring, we demonstrate that the true DAG is not falsified by our metric, whereas the wrong graphs given by a hypothetical user are likely to be falsified.
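The permutation idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `permutation_baseline`, the callable `count_violations`, and the toy dependence data are all hypothetical, and the toy violation counter stands in for a real consistency check based on statistical tests. Exhaustive enumeration of permutations is only feasible for very small graphs; a larger graph would need random sampling of permutations instead.

```python
import itertools


def permutation_baseline(nodes, edges, count_violations):
    """Compare a graph's violation count against all node-relabelled copies.

    Returns (observed, fraction), where fraction is the share of node
    permutations whose relabelled graph has no more violations than the
    given graph. A small fraction suggests the graph fits better than a
    randomly relabelled one.
    """
    nodes = list(nodes)
    edges = list(edges)
    observed = count_violations(frozenset(edges))
    as_good, total = 0, 0
    for perm in itertools.permutations(nodes):
        relabel = dict(zip(nodes, perm))
        permuted = frozenset((relabel[u], relabel[v]) for u, v in edges)
        total += 1
        if count_violations(permuted) <= observed:
            as_good += 1
    return observed, as_good / total


# Hypothetical toy data: pairs of variables assumed to be found dependent.
OBSERVED_DEPENDENCIES = {frozenset("AB"), frozenset("BC")}


def toy_violations(edge_set):
    # Stand-in for a real consistency check: an edge is a violation when
    # its endpoints were not found to be dependent in the toy data.
    return sum(frozenset(e) not in OBSERVED_DEPENDENCIES for e in edge_set)


observed, fraction = permutation_baseline(
    "ABC", [("A", "B"), ("B", "C")], toy_violations
)
print(observed, round(fraction, 3))
```

In this toy case the chain A -> B -> C has no violations, and only two of the six relabellings do equally well, so the graph compares favourably with the surrogate baseline.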
Electric vehicles (EVs) have become one of the promising solutions to the ever-evolving environmental and energy crisis. The key to the wide adoption of EVs is a pervasive charging infrastructure, composed of both private/home chargers and public/commercial charging stations. The security of EV charging, however, has not been thoroughly investigated. This paper investigates the communication mechanisms between chargers and EVs, and exposes the lack of authenticity protection in the SAE J1772 charging control protocol. To showcase our discoveries, we propose a new class of attacks, ChargeX, which aims to manipulate the charging states or charging rates of EV chargers with the goal of disrupting charging schedules, causing a denial of service (DoS), or degrading battery performance. ChargeX inserts a hardware attack circuit to strategically modify the charging control signals. We design and implement multiple attack systems, and evaluate the attacks on a public charging station and two home chargers using a simulated vehicle load in a lab environment. Extensive experiments on different types of chargers demonstrate the effectiveness and generality of ChargeX. Specifically, we demonstrate that ChargeX can force the switching of an EV's charging state from ``stand by'' to ``charging'', even when the vehicle is not in the charging state. We further validate the attacks on a Tesla Model 3 vehicle to demonstrate the disruptive impacts of ChargeX. If deployed, ChargeX may significantly undermine people's trust in the EV charging infrastructure.
Agile maneuvers are essential for robot-enabled complex tasks such as surgical procedures. Prior work on surgical autonomy is limited to feasibility studies of completing a single task, without systematically addressing generic manipulation safety across different tasks. We present an integrated planning and control framework for 6-DoF robotic instruments for pipeline automation of surgical tasks. We leverage the geometry of a robotic instrument and propose the nodal state space (NSS) to represent the robot state in SE(3) space. Each elementary robot motion can be encoded by regulating the state parameters via a dynamical system. This theoretically ensures that every in-process trajectory is globally feasible and stably converges to an admissible target, and that the controller is of closed form without computing 6-DoF inverse kinematics. Then, to plan the motion steps reliably, we propose an interactive (instant) goal state of the robot that transforms manipulation planning through desired path constraints into a goal-varying manipulation (GVM) problem. We detail how GVM can adaptively and smoothly plan the procedure (proceeding or rewinding the process as needed) based on on-the-fly situations in dynamic or disturbed environments. Finally, we extend the above policy to characterize complete pipelines of various surgical tasks. Simulations show that our framework can smoothly solve twisted maneuvers while avoiding collisions. Physical experiments using the da Vinci Research Kit (dVRK) validate the capability of automating individual tasks including tissue debridement, dissection, and wound suturing. The results confirm good task-level consistency and reliability compared to state-of-the-art automation algorithms.
Although Connected Vehicles (CVs) have demonstrated tremendous potential to enhance traffic operations, they can impose privacy risks on individual travelers, e.g., leaking sensitive information about their frequently visited places, routing behavior, etc. Despite the large body of literature that devises various algorithms to exploit CV information, research on privacy-preserving traffic control is still in its infancy. In this paper, we aim to fill this research gap and propose a privacy-preserving adaptive traffic signal control method using CV data. Specifically, we leverage secure Multi-Party Computation and differential privacy to devise a privacy-preserving CV data aggregation mechanism, which can calculate key traffic quantities without any CV having to reveal its private data. We further develop a linear optimization model for adaptive signal control based on the traffic variables obtained via the data aggregation mechanism. The proposed linear programming problem is further extended to a stochastic programming problem to explicitly handle the noise added by the differentially private mechanism. Evaluation results show that the linear optimization model preserves privacy with a marginal impact on control performance, and the stochastic programming model can significantly reduce residual queues compared to the linear programming model, with almost no increase in vehicle delay. Overall, our methods demonstrate the feasibility of incorporating privacy-preserving mechanisms in CV-based traffic modeling and control, which guarantees both utility and privacy.
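To make the aggregation idea concrete, the following is a minimal sketch that combines additive secret sharing (one common building block of secure multi-party computation) with the Laplace mechanism from differential privacy. The abstract does not specify the actual protocol, so everything here is an assumption for illustration: the function names, the modulus `PRIME`, the choice to add noise to the aggregated sum, and the use of Python's `random` module, which is not cryptographically secure and would need to be replaced in a real deployment.

```python
import random

PRIME = 2_147_483_647  # hypothetical modulus for share arithmetic


def make_shares(value, n_parties, rng):
    """Split a non-negative integer into n_parties additive shares mod PRIME."""
    shares = [rng.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_sum(private_values, rng):
    """Sum private values so that no party sees another party's raw value.

    Vehicle i sends share j to party j; each party publishes only the sum
    of the shares it received, and the published sums add up to the total.
    """
    n = len(private_values)
    all_shares = [make_shares(v, n, rng) for v in private_values]
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    return sum(partial_sums) % PRIME


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(private_values, epsilon, sensitivity=1.0, rng=None):
    """Return the securely aggregated sum plus Laplace noise."""
    rng = rng or random.Random()
    exact = secure_sum(private_values, rng)
    return exact + laplace_noise(sensitivity / epsilon, rng)


# Example: four vehicles each report 1 if queued at the intersection, else 0.
queued_flags = [1, 0, 1, 1]
noisy_queue_length = private_count(queued_flags, epsilon=1.0)
print(noisy_queue_length)
```

The noisy output is what a downstream signal-control model would consume, which is why an optimization model that accounts for the noise explicitly can be useful.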
In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation. The number of interactions is limited by resource constraints such as those on computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communications and synchronisation, and is difficult to scale. In this paper we present four algorithms to solve these problems. The combination of these algorithms enables each agent to improve its task allocation strategy through reinforcement learning, while changing how much it explores the system in response to how optimal it believes its current strategy is, given its past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource usage limits, limiting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, which then carry out those tasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task allocation problem to within 6.7% of the theoretical optimum for the system configurations considered. It provides 5x better performance recovery than no-knowledge-retention approaches when system connectivity is impacted, and is tested against systems of up to 100 agents with less than a 9% impact on the algorithms' performance.
Effective multi-robot teams require the ability to move to goals in complex environments in order to address real-world applications such as search and rescue. Multi-robot teams should be able to operate in a completely decentralized manner, with individual robot team members being capable of acting without explicit communication between neighbors. In this paper, we propose a novel game theoretic model that enables decentralized and communication-free navigation to a goal position. Robots each play their own distributed game by estimating the behavior of their local teammates in order to identify behaviors that move them in the direction of the goal, while also avoiding obstacles and maintaining team cohesion without collisions. We prove theoretically that generated actions approach a Nash equilibrium, which also corresponds to an optimal strategy identified for each robot. We show through extensive simulations that our approach enables decentralized and communication-free navigation by a multi-robot system to a goal position, and is able to avoid obstacles and collisions, maintain connectivity, and respond robustly to sensor noise.
Breakthroughs in machine learning in the last decade have led to `digital intelligence', i.e. machine learning models capable of learning from vast amounts of labeled data to perform several digital tasks such as speech recognition, face recognition, machine translation and so on. The goal of this thesis is to make progress towards designing algorithms capable of `physical intelligence', i.e. building intelligent autonomous navigation agents capable of learning to perform complex navigation tasks in the physical world involving visual perception, natural language understanding, reasoning, planning, and sequential decision making. Despite several advances in classical navigation methods in the last few decades, current navigation agents struggle at long-term semantic navigation tasks. In the first part of the thesis, we discuss our work on short-term navigation using end-to-end reinforcement learning to tackle challenges such as obstacle avoidance, semantic perception, language grounding, and reasoning. In the second part, we present a new class of navigation methods based on modular learning and structured explicit map representations, which leverage the strengths of both classical and end-to-end learning methods, to tackle long-term navigation tasks. We show that these methods are able to effectively tackle challenges such as localization, mapping, long-term planning, exploration and learning semantic priors. These modular learning methods are capable of long-term spatial and semantic understanding and achieve state-of-the-art results on various navigation tasks.
Search in social networks such as Facebook poses different challenges than classical web search: besides the query text, it is important to take into account the searcher's context to provide relevant results. Their social graph is an integral part of this context and is a unique aspect of Facebook search. While embedding-based retrieval (EBR) has been applied in web search engines for years, Facebook search was still mainly based on a Boolean matching model. In this paper, we discuss the techniques for applying EBR to a Facebook Search system. We introduce the unified embedding framework developed to model semantic embeddings for personalized search, and the system to serve embedding-based retrieval in a typical search system based on an inverted index. We discuss various tricks and experiences on end-to-end optimization of the whole system, including ANN parameter tuning and full-stack optimization. Finally, we present our progress on two selected advanced topics about modeling. We evaluated EBR on verticals for Facebook Search, with significant metric gains observed in online A/B experiments. We believe this paper will provide useful insights and experiences to help people develop embedding-based retrieval systems in search engines.
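As a rough illustration of what embedding-based retrieval means in contrast to Boolean term matching, the sketch below ranks documents by cosine similarity between a query embedding and document embeddings. It is not the system described above: the function names and toy vectors are hypothetical, the embeddings are assumed to be given, and exact brute-force search stands in for the approximate nearest neighbor (ANN) index that a production system would use.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, doc_vecs, k):
    """Return the ids of the k documents most similar to the query."""
    scored = ((cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items())
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]


# Hypothetical two-dimensional embeddings for three documents.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print(retrieve([1.0, 0.1], docs, k=2))
```

In a personalized setting, the query vector would be computed from both the query text and the searcher's context, so two users issuing the same text could retrieve different documents.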