With the growth of car ownership and car-sharing applications, commuters in many cities, particularly in developing countries, are shifting away from public transport. These shifts affect two key stakeholders: transit operators and first- and last-mile (FLM) services. Although most cities continue to invest heavily in bus and metro projects to make public transit attractive, ridership in these systems has often failed to reach targeted levels. FLM service providers likewise experience lower demand and revenues in the wake of shifts to other modes of transport. Effective FLM options are required to counter this trend and make public transport attractive to commuters. One possible solution is to forge partnerships between public transport and FLM providers that offer competitive joint mobility options. Such solutions require prudent allocation of supply and optimised strategies for FLM operations and ride-sharing. To this end, we build an agent- and event-based simulation model that captures interactions between passengers and FLM services using statecharts, vehicle routing models, and trip-matching rules. We also propose an optimisation model for allocating FLM vehicles at different transit stations to reduce unserved requests. Using real-world metro transit demand data from Bengaluru, India, we demonstrate the effectiveness of our approach in improving FLM connectivity and quantify the benefits of sharing trips.
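The allocation idea above can be illustrated with a toy sketch: distribute a fixed FLM fleet across transit stations in proportion to expected request arrivals, so that stations with heavier demand are less likely to leave requests unserved. The proportional (largest-remainder) rule, station names, and all numbers here are illustrative assumptions, not the paper's actual optimisation model.

```python
# Toy proportional allocation of FLM vehicles to transit stations.
# The paper couples an optimisation model with an agent-based simulation;
# this sketch only shows the flavour of demand-weighted allocation.

def allocate_fleet(expected_requests, fleet_size):
    """Largest-remainder proportional allocation of vehicles to stations."""
    total = sum(expected_requests.values())
    shares = {s: fleet_size * r / total for s, r in expected_requests.items()}
    alloc = {s: int(v) for s, v in shares.items()}
    leftover = fleet_size - sum(alloc.values())
    # Hand remaining vehicles to stations with the largest fractional share.
    for s in sorted(shares, key=lambda s: shares[s] - alloc[s], reverse=True)[:leftover]:
        alloc[s] += 1
    return alloc

demand = {"Majestic": 120, "Indiranagar": 40, "Whitefield": 90}  # requests/hour
alloc = allocate_fleet(demand, fleet_size=20)
```

A real allocation would additionally weigh ride-sharing potential and vehicle repositioning, which is what the simulation-optimisation loop in the paper evaluates.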
Virtual reality (VR) is known to cause a "time compression" effect, where time spent in VR seems to pass faster than the actual elapsed time. Our goal in this research is to investigate whether the physical realism of a VR experience reduces the time compression effect in a gas monitoring training task that requires precise time estimation. We used physical props and passive haptics in a VR task with high physical realism and compared it to an equivalent standard VR task with only virtual objects. We also used an identical real-world task as a baseline for time estimation. In each scenario, the user picks up a device, opens a door, navigates a corridor with obstacles, performs five short time estimations, and estimates the total time from task start to end. Contrary to previous work, we found a consistent time dilation effect in all conditions, including the real world. However, no significant differences were found between the estimates in the high and low physical realism conditions. We discuss the implications of these results and the limitations of the study, and propose future work that may better address this important question for virtual reality training.
Each step that results in a bit of information being ``forgotten'' by a computing device has an intrinsic energy cost. Although any Turing machine can be rewritten to be thermodynamically reversible without changing the recognized language, finite automata that are restricted to scan their input once in ``real-time'' fashion can recognize only the members of a proper subset of the class of regular languages in this reversible manner. We study the energy expenditure associated with the computations of deterministic and quantum finite automata. We prove that zero-error quantum finite automata have no advantage over their classical deterministic counterparts in terms of the maximum obligatory thermodynamic cost associated with any step during the recognition of different regular languages. We also demonstrate languages for which ``error can be traded for energy'', i.e., languages whose zero-error recognition is associated with computation steps having provably greater obligatory energy cost than their bounded-error recognition by real-time finite-memory quantum devices. We show that regular languages can be classified according to the intrinsic energy requirements they impose on the recognizing automaton as a function of input length, and prove upper and lower bounds on these requirements.
We present a method for solving the coverage problem with the objective of autonomously exploring an unknown environment under mission time constraints. Here, the robot is tasked with planning a path over a horizon such that the accumulated area swept out by its sensor footprint is maximized. Because this problem exhibits a diminishing returns property known as submodularity, we choose to formulate it as a tree-based sequential decision making process. This formulation allows us to evaluate the effects of the robot's actions on future world coverage states, while simultaneously accounting for traversability risk and the dynamic constraints of the robot. To quickly find near-optimal solutions, we propose an effective approximation to the coverage sensor model which adapts to the local environment. Our method was extensively tested across various complex environments and served as the local exploration algorithm for a competing entry in the DARPA Subterranean Challenge.
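The diminishing-returns (submodular) structure of coverage can be sketched with a toy greedy planner: each candidate sensor pose covers a set of grid cells, and the marginal gain of a pose shrinks as coverage accumulates. The grid, square footprint, and greedy selection below are illustrative assumptions, not the paper's tree-based planner.

```python
# Toy illustration of submodular coverage: greedily pick the poses whose
# sensor footprint adds the most *newly* covered cells.

def covered_cells(pose, radius=1):
    """Cells swept by a square sensor footprint centred at `pose`."""
    x, y = pose
    return {(x + dx, y + dy)
            for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)}

def greedy_coverage(candidates, horizon):
    """Pick `horizon` poses, each maximizing marginal newly covered area."""
    covered, plan = set(), []
    for _ in range(horizon):
        best = max(candidates, key=lambda p: len(covered_cells(p) - covered))
        plan.append(best)
        covered |= covered_cells(best)
    return plan, covered

plan, covered = greedy_coverage([(0, 0), (1, 0), (5, 5), (0, 1)], horizon=2)
```

Greedy maximization of a monotone submodular objective carries the classic (1 - 1/e) approximation guarantee, which is what motivates formulating coverage as a sequential decision process rather than a one-shot optimisation.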
This paper presents an approach to ensure conditions on Variable Impedance Controllers (VICs) through the off-line tuning of the parameters involved in their description. In particular, we demonstrate its application to term modulations defined by a Learning from Demonstration technique. This is performed through the assessment of conditions regarding safety and performance, which encompass heuristics and constraints in the form of Linear Matrix Inequalities (LMIs). The latter define a convex optimisation problem to analyse their fulfilment and require a polytopic description of the VIC, obtained in this case from its formulation as a discrete-time Linear Parameter Varying (LPV) system. With respect to the current state of the art, this approach only requires the term definitions obtained by the Learning from Demonstration technique to be continuous and functions of exogenous signals, i.e. variables external to the robot. A solution-search method then yields the most suitable set of parameters according to the assessment criteria. Validation and comparison against solutions with relaxed conditions are performed on a 7-DoF Kinova Gen3 manipulator. The method is applied to generate Variable Impedance Controllers for a pulley-belt looping task, inspired by the Assembly Challenge for Industrial Robotics at the World Robot Summit 2018, reducing the exerted force with respect to a standard (constant) Impedance Controller. Additionally, the agility of the method is evaluated by generating controllers for one-off modifications of the nominal belt looping task setup without new demonstrations.
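As a minimal illustration of why the polytopic LPV description makes the assessment convex (our notation and a standard textbook condition, not the paper's exact constraints): quadratic stability of a discrete-time polytopic LPV closed loop with vertex matrices $A_1,\dots,A_N$ can be certified by the LMI feasibility problem

```latex
\[
  \exists\, P = P^\top \succ 0
  \quad \text{s.t.} \quad
  A_i^\top P A_i - P \prec 0,
  \qquad i = 1,\dots,N .
\]
```

Feasibility of such vertex LMIs certifies the property for every parameter trajectory inside the polytope, which is what permits an off-line, SDP-solver-based check of the learned impedance modulations.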
The integration of discrete algorithmic components in deep learning architectures has numerous applications. Recently, Implicit Maximum Likelihood Estimation (IMLE; Niepert, Minervini, and Franceschi 2021), a class of gradient estimators for discrete exponential family distributions, was proposed by combining implicit differentiation through perturbation with the path-wise gradient estimator. However, due to its finite-difference approximation of the gradients, IMLE is especially sensitive to the choice of the finite-difference step size, which needs to be specified by the user. In this work, we present Adaptive IMLE (AIMLE), the first adaptive gradient estimator for complex discrete distributions: it adaptively identifies the target distribution for IMLE by trading off the density of gradient information against the degree of bias in the gradient estimates. We empirically evaluate our estimator on synthetic examples, as well as on Learning to Explain, Discrete Variational Auto-Encoder, and Neural Relational Inference tasks. In our experiments, we show that our adaptive gradient estimator produces faithful estimates while requiring orders of magnitude fewer samples than other gradient estimators.
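The finite-difference recipe the abstract refers to can be sketched for the simplest discrete layer, an argmax producing a one-hot vector: perturb the logits, take a MAP sample, take a second MAP sample from logits shifted toward a target distribution by `lam` times the downstream gradient, and difference the two. The function names, Gumbel noise choice, and sample counts are our illustrative assumptions; `lam` is the step-size hyperparameter that AIMLE adapts.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_hot_argmax(theta):
    z = np.zeros_like(theta)
    z[np.argmax(theta)] = 1.0
    return z

def imle_grad(theta, grad_z, lam, n_samples=100):
    """IMLE-style finite-difference gradient estimate w.r.t. logits theta."""
    g = np.zeros_like(theta)
    for _ in range(n_samples):
        eps = rng.gumbel(size=theta.shape)                   # perturb-and-MAP noise
        z_pos = one_hot_argmax(theta + eps)                  # sample from current dist.
        z_neg = one_hot_argmax(theta - lam * grad_z + eps)   # sample from target dist.
        g += (z_pos - z_neg) / lam                           # finite difference
    return g / n_samples

theta = np.array([1.0, 0.5, -0.2])
grad_z = np.array([0.0, -1.0, 0.0])   # downstream loss prefers component 1
g = imle_grad(theta, grad_z, lam=1.0)
```

A gradient-descent step `theta -= eta * g` then moves the logits toward the target distribution; the sensitivity to `lam` visible here (too small: mostly zero differences; too large: biased target) is exactly what motivates the adaptive scheme.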
Highly accurate dynamic or simulator models that are close to the real robot can facilitate model-based control (e.g., model predictive control or linear-quadratic regulators) and model-based trajectory planning (e.g., trajectory optimization), and can decrease the learning time required by reinforcement learning methods. Thus, the objective of this work is to learn the residual errors between a dynamic and/or simulator model and the real robot. This is achieved using a neural network whose parameters are updated through an Unscented Kalman Filter (UKF) formulation. Using this method, we model these residual errors with only small amounts of data -- a necessity as we improve the simulator/dynamic model by learning directly from real-world operation. We demonstrate our method on robotic hardware (a manipulator arm and a wheeled robot), and show that with the learned residual errors, we can further close the reality gap between dynamic models, simulations, and actual hardware.
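The residual-learning idea can be sketched in its simplest form: the true next state is modelled as the analytic prediction plus a learned correction, x_{k+1} ≈ f_model(x_k, u_k) + g_w(x_k, u_k), with the correction's parameters w updated recursively in Kalman fashion from each observed residual. For clarity the correction below is linear in w (so the update is exact Kalman/RLS); the paper uses a neural network with a UKF, which generalises this recursion to nonlinear parameterisations. The basis functions, noise levels, and toy dynamics are our illustrative assumptions.

```python
import numpy as np

def features(x, u):
    return np.array([1.0, x, u])             # illustrative basis

class ResidualKalman:
    def __init__(self, dim, q=1e-4, r=1e-2):
        self.w = np.zeros(dim)               # correction parameters
        self.P = np.eye(dim)                 # parameter covariance
        self.q, self.r = q, r                # process / measurement noise

    def update(self, phi, residual):
        self.P += self.q * np.eye(len(self.w))
        s = phi @ self.P @ phi + self.r      # innovation variance
        k = self.P @ phi / s                 # Kalman gain
        self.w += k * (residual - phi @ self.w)
        self.P -= np.outer(k, phi) @ self.P

def f_model(x, u):
    return x + 0.1 * u                       # crude analytic model

# The "real" robot has an unmodelled drag term; learn it from operation data.
est = ResidualKalman(dim=3)
x = 0.0
for step in range(200):
    u = np.sin(0.1 * step)
    x_next_true = x + 0.1 * u - 0.05 * x     # real dynamics (drag unmodelled)
    residual = x_next_true - f_model(x, u)   # what the model got wrong
    est.update(features(x, u), residual)
    x = x_next_true
```

After a couple of hundred samples the estimator recovers the missing drag coefficient (the weight on the `x` feature approaches -0.05), illustrating why a filter-based update can close the model-reality gap from small amounts of data.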
We describe a plausible probabilistic model for a blockchain queueing environment in which rational, profit-maximising schedulers impose adversarial disciplines on incoming messages, each containing a payload that encodes a state transition in a state machine. The model can be specialised to chains with fixed or variable block times, to traditional priority-queue disciplines with `honest' schedulers, or to adversarial public mempools. We find conditions under which the model behaves as a bulk-service queue with a priority discipline and derive practical expressions for the relative block and message number of a transaction. We study this setup in the context of orders to a constant-function market maker (CFMM) decentralised exchange (DEX), where the execution price a user receives may be quite sensitive to the order's position in the chain -- in particular, to a string of transactions scheduled for prior execution that is not knowable at the time of order creation. We derive statistical models for the price impact of this order flow both in the presence and absence of MEV extraction activity.
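The bulk-service priority-queue view can be sketched with a toy mempool: fee-bearing transactions accumulate, and at every block interval the scheduler serves the `block_size` highest-fee ones as a batch. The greedy fee ordering, fixed block interval, and all numbers are illustrative of the queueing abstraction, not of any specific chain or scheduler.

```python
import heapq

def run_blocks(arrivals, block_size):
    """arrivals: per-slot lists of (fee, tx_id).
    Returns, for each tx, the block number in which it was included."""
    mempool, included = [], {}
    for block_no, txs in enumerate(arrivals):
        for fee, tx in txs:
            heapq.heappush(mempool, (-fee, tx))   # negate fee -> max-fee priority
        for _ in range(min(block_size, len(mempool))):
            _, tx = heapq.heappop(mempool)        # bulk service: b txs per block
            included[tx] = block_no
    return included

arrivals = [[(5, "t1"), (1, "t2"), (9, "t3")],
            [(2, "t4")],
            []]
inclusion = run_blocks(arrivals, block_size=2)
```

The quantity of interest in the abstract, the relative block number of a transaction, is exactly the gap between a transaction's arrival slot and its entry in `inclusion`; a low fee can push it back several blocks behind later, higher-fee arrivals.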
In this work, we propose a stateless blockchain called CompactChain, which compacts the entire state of UTXO (Unspent Transaction Output)-based blockchain systems into two RSA accumulators. The first accumulator is the Transaction Output (TXO) commitment, which represents the TXO set. The second is the Spent Transaction Output (STXO) commitment, which represents the STXO set. We discuss three algorithms: (i) updating the TXO and STXO commitments by the miner, who also provides proofs of the correctness of the updated commitments; (ii) proving a transaction's validity by providing a membership witness in the TXO commitment and a non-membership witness against the STXO commitment for a coin being spent by a user; and (iii) updating the witness for a coin that is not yet spent. The experimental results evaluate the performance of CompactChain in terms of the time taken by a miner to update the commitments and by a validator to verify the commitments and validate transactions. We compare the performance of CompactChain with existing state-of-the-art work on stateless blockchains. CompactChain reduces the commitment-update complexity and the transaction witness size, which in turn reduces the mempool size and propagation latency without compromising system throughput (transactions per second, TPS).
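The RSA-accumulator mechanics behind such commitments can be sketched as follows: coins are mapped to primes, the accumulator is g raised to the product of the set's primes modulo N, and a membership witness for a coin x is the accumulator computed without x. The parameters below are tiny and completely insecure, purely to show the algebra; a real deployment uses a large RSA modulus whose factorisation nobody knows.

```python
from math import prod

N = 3233          # 61 * 53 -- toy modulus; the factors must be unknown in practice
g = 2             # base element

coins = [3, 5, 7, 11]                 # primes representing unspent coins

def accumulate(primes):
    return pow(g, prod(primes), N)

def membership_witness(x, primes):
    """Accumulator over every element except x."""
    return accumulate([p for p in primes if p != x])

def verify(x, witness, acc):
    # witness^x = g^(product without x)^x = g^(full product) = acc
    return pow(witness, x, N) == acc

acc = accumulate(coins)
w = membership_witness(7, coins)
assert verify(7, w, acc)              # coin 7 is in the TXO set

# Spending coin 7: the miner removes it, and other coins' witnesses
# must then be updated against the new commitment.
coins.remove(7)
acc = accumulate(coins)
```

The witness-update algorithm in the abstract exists precisely because `acc` changes with every block, so a holder's stored witness for an unspent coin must track those changes without the holder keeping the whole set.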
This study explores the problem of Multi-Agent Path Finding with continuous and stochastic travel times whose probability distribution is unknown. Our purpose is to manage a group of automated robots that provide package delivery services in buildings where pedestrians and a wide variety of robots coexist, such as delivery services in office buildings, hospitals, and apartments. In such real-world applications, the time required for a robot to traverse a corridor takes a continuous value and is randomly distributed, and prior knowledge of the probability distribution of the travel time is limited. Multi-Agent Path Finding has been widely studied and applied to robot management systems; however, automating robot operation in such environments remains difficult. We propose 1) online re-planning to update the robots' action plan while it is being executed, and 2) parameter updates that estimate the probability distribution of travel times via Bayesian inference as delays are observed. We use a greedy heuristic to obtain solutions within a limited computation time. Through simulations, we empirically compare the performance of our method to that of existing methods in terms of conflict probability and the actual travel time of the robots. The simulation results indicate that the proposed method finds travel paths with at least 50% fewer conflicts and a shorter actual total travel time than existing methods. The proposed method achieves this performance with a small number of trials because the parameter updates are prioritised on the edges that matter most for path planning, thereby satisfying the requirement of quickly deploying robust planning for automated delivery services.
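The Bayesian parameter update can be sketched per corridor (edge): the planner keeps a belief over the edge's mean travel time and refines it from each observed traversal. We assume a Normal prior over the mean with known observation variance, the simplest conjugate choice; the paper's actual parameterisation may differ, and all numbers are illustrative.

```python
class EdgeBelief:
    def __init__(self, mu0, var0, obs_var):
        self.mu, self.var = mu0, var0    # prior belief over mean travel time
        self.obs_var = obs_var           # assumed-known observation noise

    def update(self, observed_time):
        """Conjugate Normal-Normal posterior update from one traversal."""
        precision = 1 / self.var + 1 / self.obs_var
        self.mu = (self.mu / self.var + observed_time / self.obs_var) / precision
        self.var = 1 / precision

edge = EdgeBelief(mu0=10.0, var0=25.0, obs_var=4.0)
for t in [14.0, 13.5, 14.5]:             # observed traversal times (delays)
    edge.update(t)
```

After a few observations the belief concentrates near the empirical mean, which is why prioritising updates on planning-critical edges needs only a small number of trials.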
In large-scale systems, centralised techniques for task allocation face fundamental challenges: the number of interactions is limited by resource constraints such as computation, storage, and network communication. Scalability can be increased by implementing the system as a distributed task-allocation system, sharing tasks across many agents; however, this in turn raises the resource cost of communication and synchronisation. In this paper we present four algorithms to address these problems. In combination, they enable each agent to improve its task-allocation strategy through reinforcement learning, while adjusting how much it explores the system according to how optimal it believes its current strategy to be, given its past experience. We focus on distributed agent systems in which agents' behaviours are constrained by resource-usage limits, restricting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment in which agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, which then carry them out. We also simulate real-world effects such as network instability. Our solution is shown to solve the task allocation problem to within 6.7% of the theoretical optimum for the system configurations considered. It provides 5x better performance recovery than no-knowledge-retention approaches when system connectivity is impacted, and has been tested on systems of up to 100 agents with less than a 9% impact on algorithm performance.
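The confidence-scaled exploration idea can be sketched with a single agent learning which neighbour to hand a subtask to: the agent keeps empirical success estimates per neighbour and explores less as its best estimate approaches the best outcome it has ever observed. The neighbour capabilities, the confidence heuristic, and all numbers are illustrative assumptions, not the paper's four algorithms.

```python
import random

random.seed(0)

neighbours = ["a", "b", "c"]
true_success = {"a": 0.9, "b": 0.5, "c": 0.2}   # hidden capabilities
q = {n: 0.0 for n in neighbours}                # estimated success per neighbour
counts = {n: 0 for n in neighbours}
best_seen = 1e-9                                # best outcome observed anywhere

def epsilon():
    # Explore more when the best-known option looks far from the best
    # outcome the agent has ever observed in the system.
    confidence = max(q.values()) / best_seen
    return max(0.05, 1.0 - confidence)

for _ in range(3000):
    n = (random.choice(neighbours) if random.random() < epsilon()
         else max(q, key=q.get))                # explore vs. exploit
    reward = 1.0 if random.random() < true_success[n] else 0.0
    best_seen = max(best_seen, reward)
    counts[n] += 1
    q[n] += (reward - q[n]) / counts[n]         # incremental empirical mean
```

Retaining `q` and `counts` across connectivity disruptions is the "knowledge retention" that the abstract credits with the 5x faster performance recovery.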