Configuration spaces (C-spaces) are an essential component of many robot path-planning algorithms, yet calculating them is a time-consuming task, especially in spaces involving a large number of degrees of freedom (DoF). Here we explore a two-step data-driven approach to C-space approximation: (1) sample (i.e., explicitly calculate) a few configurations; (2) train a machine learning (ML) model on these configurations to predict the collision status of other points in the C-space. We studied multiple factors that impact this approximation process, including model representation, number of DoF (up to 42), collision density, sample size, training set distribution, and desired confidence of predictions. We conclude that XGBoost offers a significant time improvement over other methods, while maintaining low error rates, even in C-spaces with over 14 DoF.
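As an illustrative sketch of the two-step pipeline above (not the study's exact setup), the snippet below samples random configurations of a hypothetical 14-DoF robot, labels them with a stand-in collision checker `in_collision`, and trains an XGBoost classifier to predict the collision status of unseen points in the C-space.

```python
import numpy as np
from xgboost import XGBClassifier

# Hypothetical stand-in for an explicit (expensive) geometric collision check
# of a d-DoF configuration q; a real planner would query the robot/obstacle geometry.
def in_collision(q):
    return int(np.sum(np.sin(q)) > 0.5)

d = 14                                                    # degrees of freedom
rng = np.random.default_rng(0)
Q_train = rng.uniform(-np.pi, np.pi, size=(5000, d))      # step 1: sample configurations
y_train = np.array([in_collision(q) for q in Q_train])    # explicit collision labels

# Step 2: train an ML model to predict collision status elsewhere in the C-space.
model = XGBClassifier(n_estimators=200, max_depth=6)
model.fit(Q_train, y_train)

Q_query = rng.uniform(-np.pi, np.pi, size=(10, d))
print(model.predict_proba(Q_query)[:, 1])                 # estimated collision probabilities
```

Once trained, the classifier replaces most explicit collision checks, trading a small, quantifiable error rate for a large reduction in query time.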
Deep-learning models for traffic data prediction can achieve superior performance in modeling complex functions through their multi-layer architecture. A major drawback, however, is that most of these approaches do not provide forecasts with uncertainty estimates, which are essential for traffic operations and control. Without uncertainty estimates, it is difficult to place any level of trust in the model predictions, and operational strategies that rely on overconfident predictions can worsen traffic conditions. In this study, we propose a Bayesian recurrent neural network framework for uncertainty quantification in traffic prediction with improved generalizability, obtained by introducing spectral normalization to its hidden layers. We show that spectral normalization alters the training process of deep neural networks by controlling the model's complexity and reducing the risk of overfitting to the training data, which in turn improves the generalization performance of the model on out-of-distribution datasets. Results demonstrate that spectral normalization improves uncertainty estimates and significantly outperforms both layer normalization and a model without normalization in single-step prediction horizons. This improved performance can be attributed to the ability of spectral normalization to better localize the feature space of the data under perturbations. Our findings are especially relevant to traffic management applications in which traffic conditions must be predicted across multiple locations but training data from those locations are limited. Spectral normalization therefore provides a more generalizable approach that can effectively capture the underlying patterns in traffic data without requiring location-specific models.
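A minimal sketch of the core idea, assuming PyTorch and a toy recurrent cell rather than the authors' Bayesian architecture: spectral normalization is applied to the cell's input-to-hidden and hidden-to-hidden maps, so each linear transformation keeps a spectral norm close to one and the hidden state reacts smoothly to perturbed inputs.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

# Illustrative recurrent cell (not the paper's model): both linear maps are
# spectrally normalized, constraining the Lipschitz constant of each step.
class SNRecurrentCell(nn.Module):
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.in2h = spectral_norm(nn.Linear(input_dim, hidden_dim))
        self.h2h = spectral_norm(nn.Linear(hidden_dim, hidden_dim))

    def forward(self, x, h):
        return torch.tanh(self.in2h(x) + self.h2h(h))

cell = SNRecurrentCell(input_dim=4, hidden_dim=32)
x_seq = torch.randn(10, 8, 4)       # (time, batch, features), e.g. traffic counts
h = torch.zeros(8, 32)
for x_t in x_seq:                   # unroll over the observation window
    h = cell(x_t, h)
```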
Our goal is to develop theory and algorithms for establishing fundamental limits on performance imposed by a robot's sensors for a given task. In order to achieve this, we define a quantity that captures the amount of task-relevant information provided by a sensor. Using a novel version of the generalized Fano inequality from information theory, we demonstrate that this quantity provides an upper bound on the highest achievable expected reward for one-step decision making tasks. We then extend this bound to multi-step problems via a dynamic programming approach. We present algorithms for numerically computing the resulting bounds, and demonstrate our approach on three examples: (i) the lava problem from the literature on partially observable Markov decision processes, (ii) an example with continuous state and observation spaces corresponding to a robot catching a freely-falling object, and (iii) obstacle avoidance using a depth sensor with non-Gaussian noise. We demonstrate the ability of our approach to establish strong limits on achievable performance for these problems by comparing our upper bounds with achievable lower bounds (computed by synthesizing or learning concrete control policies).
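For context, the classical Fano inequality that the generalized version builds on takes the following standard form (this is the textbook statement, not the paper's reward-based bound): for any estimator $\hat{X}(Y)$ of a discrete state $X \in \mathcal{X}$ from an observation $Y$,

\[
H(X \mid Y) \;\le\; H_b(P_e) + P_e \log\bigl(|\mathcal{X}| - 1\bigr), \qquad P_e = \mathbb{P}\bigl(\hat{X}(Y) \neq X\bigr),
\]

where $H_b$ denotes the binary entropy function. The generalized version used here plays the analogous role with an expected-reward objective in place of the error probability, which is what allows sensor-dependent mutual-information quantities to cap achievable task performance.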
In this paper, we provide a theoretical analysis of the recently introduced weak adversarial networks (WAN) method, which is used to approximate partial differential equations in high dimensions. We address the existence and stability of the solution, as well as approximation bounds. More precisely, we prove the existence of discrete solutions, intended in a suitable weak sense, for which we prove a quasi-best approximation estimate similar to Céa's lemma, a result commonly found in finite element methods. We also propose two new stabilized WAN-based formulations that avoid the need for direct normalization. Furthermore, we analyze the method's effectiveness for the Dirichlet boundary-value problem with an implicit representation of the geometry. The key requirement for achieving the quasi-best approximation result is to ensure that the space of the test network satisfies a specific condition, known as the inf-sup condition, essentially requiring that the test network set is sufficiently rich when compared to the trial space. The method's accuracy, however, is determined only by the space of the trial network. We also devise a pseudo-time XNODE neural network class for static PDE problems, which yields significantly faster convergence than a classical DNN.
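For reference, the finite-element result alluded to above, Céa's lemma, reads as follows in its classical form (stated here for orientation; it is not the paper's WAN estimate): if the bilinear form $a(\cdot,\cdot)$ is continuous with constant $M$ and coercive with constant $\alpha$, then the Galerkin solution $u_h$ in a subspace $V_h \subset V$ satisfies

\[
\| u - u_h \|_V \;\le\; \frac{M}{\alpha} \, \inf_{v_h \in V_h} \| u - v_h \|_V ,
\]

that is, the discrete solution is, up to a constant, as accurate as the best approximation available in the trial space. The quasi-best approximation estimate above plays the analogous role for WAN, with the inf-sup condition on the test network taking the place of coercivity.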
The SHAP framework provides a principled method to explain the predictions of a model by computing feature importance. Motivated by applications in finance, we introduce the Top-k Identification Problem (TkIP), in which the objective is to identify the k features with the highest SHAP values. While any method that computes SHAP values with uncertainty estimates (such as KernelSHAP and SamplingSHAP) can be trivially adapted to solve TkIP, doing so is highly sample-inefficient. The goal of our work is to improve the sample efficiency of existing methods in the context of solving TkIP. Our key insight is that TkIP can be framed as an Explore-m problem, a well-studied problem related to multi-armed bandits (MAB). This connection enables us to improve sample efficiency by leveraging two techniques from the MAB literature: (1) a better stopping condition that identifies when PAC (Probably Approximately Correct) guarantees have been met, and (2) a greedy sampling scheme that judiciously allocates samples between different features. By adopting these methods we develop KernelSHAP@k and SamplingSHAP@k to efficiently solve TkIP, offering an average improvement of $5\times$ in sample efficiency and runtime across the most common credit-related datasets.
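The sketch below shows how the Explore-m framing might look in code; it is not the paper's algorithm, and `draw_shap_sample` is a hypothetical function returning one Monte Carlo estimate of a feature's SHAP value. It combines Hoeffding-style confidence intervals, a PAC stopping rule, and greedy allocation of samples to the two features straddling the top-k boundary.

```python
import numpy as np

def top_k_features(draw_shap_sample, n_features, k, eps=0.01, delta=0.05,
                   batch=10, max_rounds=10_000):
    samples = [[] for _ in range(n_features)]
    for j in range(n_features):                          # warm start: sample every feature
        samples[j].extend(draw_shap_sample(j) for _ in range(batch))
    for _ in range(max_rounds):
        means = np.array([np.mean(s) for s in samples])
        # Hoeffding-style half-widths (assumes bounded estimates; shrinks with more samples).
        widths = np.array([np.sqrt(np.log(2 * n_features / delta) / (2 * len(s)))
                           for s in samples])
        order = np.argsort(-means)
        top, rest = order[:k], order[k:]
        # PAC stopping rule: every kept feature beats every rejected one up to eps.
        if (means[top] - widths[top]).min() >= (means[rest] + widths[rest]).max() - eps:
            return top
        # Greedy allocation: refine the two features that straddle the top-k boundary.
        weakest_top = top[np.argmin(means[top] - widths[top])]
        strongest_rest = rest[np.argmax(means[rest] + widths[rest])]
        for j in (weakest_top, strongest_rest):
            samples[j].extend(draw_shap_sample(j) for _ in range(batch))
    return np.argsort(-np.array([np.mean(s) for s in samples]))[:k]

# Toy usage: feature j has true value 0.1 * j plus noise; the top 3 should be recovered.
rng = np.random.default_rng(0)
print(top_k_features(lambda j: 0.1 * j + rng.normal(scale=0.05), n_features=8, k=3))
```

The sample savings come from the stopping rule (no more samples once the top-k set is separated with high probability) and from concentrating samples on the features whose ordering is still ambiguous.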
We study the problem of deploying a fleet of mobile robots to service tasks that arrive stochastically over time and at random locations in an environment. This is known as the Dynamic Vehicle Routing Problem (DVRP) and requires the robots to allocate incoming tasks among themselves and to find an optimal sequence for each robot. State-of-the-art approaches only consider average wait times and focus on high-load scenarios, where the arrival rate of tasks approaches the limit of what the robots can handle while keeping the queue of unserviced tasks bounded, i.e., stable. To ensure stability, these approaches repeatedly compute minimum-distance tours over a set of newly arrived tasks. This paper addresses the lack of policies for moderate-load scenarios, where quality of service can be improved by prioritizing long-waiting tasks. We introduce a novel DVRP policy based on a cost function that takes the $p$-norm over accumulated wait times and show that it guarantees stability even in high-load scenarios. Through simulation experiments in the Euclidean plane as well as on real-world data for city-scale service requests, we demonstrate that the proposed policy outperforms the state of the art in both mean and $95^{th}$ percentile wait times in moderate-load scenarios.
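For concreteness, one plausible reading of the cost function described above is (the paper's exact definition may differ in details such as normalization)

\[
c(\pi) \;=\; \Bigl( \sum_{i \in \mathcal{T}} w_i(\pi)^{\,p} \Bigr)^{1/p},
\]

where $\mathcal{T}$ is the set of outstanding tasks and $w_i(\pi)$ is the accumulated wait time of task $i$ under a candidate tour $\pi$. Setting $p = 1$ recovers the total (hence average) wait time targeted by existing policies, while larger $p$ increasingly penalizes the longest-waiting tasks, which is what improves tail (e.g., $95^{th}$ percentile) wait times at moderate load.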
For many statistical experiments, there exists a multitude of optimal designs. If we consider models with uncorrelated observations and adopt the approximate experimental design approach, the set of all optimal designs typically forms a polytope. In this paper, we mathematically characterize this polytope of optimal designs. In particular, we show that its vertices correspond to the so-called minimal optimum designs. Consequently, we compute the vertices for several classical multifactor regression models of the first and second degree. To this end, we use software tools based on rational arithmetic; the computed list is therefore accurate and complete. The polytope of optimal experimental designs, and its vertices, can be applied in several ways. For instance, it can aid in constructing cost-efficient designs and efficient exact designs.
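The convexity underlying this polytope structure follows from a standard argument, sketched here under the usual approximate-design assumptions: the information matrix $M(\xi)$ is linear in the design $\xi$, so for a concave optimality criterion $\Phi$ and optimal designs $\xi_1, \xi_2$,

\[
\Phi\bigl(M(\alpha\xi_1 + (1-\alpha)\xi_2)\bigr)
\;=\; \Phi\bigl(\alpha M(\xi_1) + (1-\alpha) M(\xi_2)\bigr)
\;\ge\; \alpha\,\Phi\bigl(M(\xi_1)\bigr) + (1-\alpha)\,\Phi\bigl(M(\xi_2)\bigr),
\]

hence any convex combination of optimal designs is again optimal. With a finite design space, the optimal set is therefore a polytope, and its vertices are precisely those optimal designs that cannot be written as nontrivial mixtures of other optimal designs.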
The development of new manufacturing techniques such as 3D printing has enabled the creation of previously infeasible chemical reactor designs. Systematically optimizing the highly parameterized geometries involved in these new classes of reactors is vital to ensure enhanced mixing characteristics and feasible manufacturability. Here we present a framework to rapidly solve this nonlinear, computationally expensive, and derivative-free problem, enabling the fast prototyping of novel reactor parameterizations. We take advantage of Gaussian processes to adaptively learn a multi-fidelity model of reactor simulations across a number of different continuous mesh fidelities. The search space of reactor geometries is explored through an amalgam of simulations at different, potentially lower, fidelities, which are chosen for evaluation based on a weighted acquisition function that trades off information gain against simulation cost. Within our framework, we derive a novel criterion for monitoring the progress and dictating the termination of multi-fidelity Bayesian optimization, ensuring that a high-fidelity solution is returned before the experimental budget is exhausted. The class of reactors we investigate, helical-tube reactors under pulsed-flow conditions, has demonstrated outstanding mixing characteristics, has the potential to be highly parameterized, and is easily manufactured using 3D printing. We 3D print the optimal reactor geometry and experimentally confirm its mixing performance. In doing so, we demonstrate that our design framework is extensible to a broad variety of expensive simulation-based optimization problems, supporting the design of the next generation of highly parameterized chemical reactors.
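The snippet below is an illustrative sketch of a cost-weighted acquisition step for multi-fidelity Bayesian optimisation, not the paper's framework; `run_simulation` and `sim_cost` are hypothetical stand-ins for the reactor simulations and their mesh-dependent cost, and a Gaussian process is fitted jointly over geometry parameters and a continuous fidelity parameter.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def sim_cost(fidelity):
    return 1.0 + 50.0 * fidelity ** 3          # finer meshes cost far more (assumed model)

def run_simulation(x, fidelity):
    # Hypothetical objective: low-fidelity runs are cheaper but noisier.
    return -np.sum((x - 0.5) ** 2) + 0.05 * (1 - fidelity) * np.random.randn()

rng = np.random.default_rng(1)
dim = 4                                         # reactor geometry parameters
X = rng.uniform(size=(8, dim))
S = rng.uniform(0.2, 1.0, size=(8, 1))          # fidelity of each initial run
y = np.array([run_simulation(x, s) for x, s in zip(X, S.ravel())])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
gp.fit(np.hstack([X, S]), y)                    # joint model over (geometry, fidelity)

# Candidate (geometry, fidelity) pairs; pick the one maximising UCB per unit cost.
Xc = rng.uniform(size=(256, dim))
Sc = rng.uniform(0.2, 1.0, size=(256, 1))
mu, sd = gp.predict(np.hstack([Xc, Sc]), return_std=True)
acq = (mu + 2.0 * sd) / np.array([sim_cost(s) for s in Sc.ravel()])
best = np.argmax(acq)
print("next evaluation:", Xc[best], "at fidelity", Sc[best, 0])
```

In a full loop, this selection step would be repeated, with the termination criterion deciding when to spend the remaining budget on a final high-fidelity evaluation.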
Positive linear programs (LPs) model many graph and operations research problems. One can compute a $(1+\epsilon)$-approximation for positive LPs, for any chosen $\epsilon$, in polylogarithmic depth and near-linear work via variations of the multiplicative weight update (MWU) method. Despite extensive theoretical work on these algorithms over the decades, their empirical performance is not well understood. In this work, we implement and test an efficient parallel algorithm for solving positive LP relaxations, and apply it to graph problems such as densest subgraph, bipartite matching, vertex cover, and dominating set. We accelerate the algorithm via a new step-size search heuristic. Our implementation uses sparse linear algebra optimization techniques such as fusion of vector operations and sparse storage formats. Furthermore, we devise an implicit representation for graph incidence constraints. We demonstrate parallel scalability using OpenMP threading and MPI on the Stampede2 supercomputer. We compare this implementation with exact solvers and specialized libraries for the above problems in order to evaluate MWU's practical standing, in terms of both accuracy and performance, relative to other methods. Our results show that this implementation is faster than general-purpose LP solvers (IBM CPLEX, Gurobi) in all of our experiments and, in some instances, outperforms state-of-the-art specialized parallel graph algorithms.
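To make the MWU idea concrete, here is an illustrative sketch of the weight-update loop for a packing LP of the form $\max \mathbf{1}^\top x$ subject to $Ax \le b$, $x \ge 0$ with $A \ge 0$; the step sizes and stopping rule are deliberately simplified and are neither the theoretically tuned choices nor the paper's step-size search heuristic.

```python
import numpy as np

def mwu_packing(A, b, eps=0.1, iters=2000):
    m, n = A.shape
    x = np.zeros(n)
    w = np.ones(m)                            # one multiplicative weight per constraint
    for _ in range(iters):
        load = A.T @ (w / b)                  # weighted "congestion" each variable would add
        j = np.argmin(load)                   # cheapest variable to increase right now
        step = eps * np.min(b / np.maximum(A[:, j], 1e-12))
        x[j] += step
        w *= np.exp(eps * step * A[:, j] / b) # penalise constraints that are tightening
    return x / np.max(A @ x / b)              # rescale so every constraint is satisfied

A = np.array([[1.0, 2.0, 1.0],
              [2.0, 1.0, 3.0]])
b = np.array([4.0, 6.0])
x = mwu_packing(A, b)
print(x, A @ x, x.sum())
```

The per-iteration work is dominated by sparse matrix-vector products, which is why implementation details such as fused vector operations, sparse formats, and implicit incidence constraints matter so much in practice.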
The evaluation of climate models is a crucial step in climate studies. It consists of quantifying how closely model outputs resemble reference data in order to identify models with a superior capacity to replicate specific climate variables. Clearly, the choice of evaluation indicator significantly impacts the results, underscoring the importance of selecting an indicator that properly captures the characteristics of a "good model". This study examines the behavior of six indicators with respect to spatial correlation and the mean, variance, and shape of the distribution. Based on these criteria, a new multi-component measure was selected to assess the performance of 48 CMIP6 models in reproducing the annual cycle of precipitation, temperature, and teleconnection patterns in Central America. The top six models were determined using multi-criteria methods. It was found that even the best model reproduces one derived climatic variable poorly in this region. The proposed measure and selection method can help enhance the accuracy of climatological research based on climate models.
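As a minimal illustration of the kind of components such an evaluation combines (not the specific indicator selected in the study), the sketch below compares a model field against a reference field in terms of spatial pattern, mean bias, and variance.

```python
import numpy as np

def evaluation_components(model_field, ref_field):
    # Basic ingredients of a multi-component comparison between two gridded fields.
    m, r = model_field.ravel(), ref_field.ravel()
    return {
        "spatial_correlation": np.corrcoef(m, r)[0, 1],
        "mean_bias": m.mean() - r.mean(),
        "variance_ratio": m.var() / r.var(),
    }

rng = np.random.default_rng(0)
ref = rng.normal(size=(20, 30))                 # e.g. a reference climatology on a grid
model = ref + 0.3 * rng.normal(size=ref.shape)  # a model that tracks the reference with noise
print(evaluation_components(model, ref))
```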
Knowledge graph embedding (KGE) is an increasingly popular technique that aims to map the entities and relations of knowledge graphs into low-dimensional semantic spaces for a wide spectrum of applications such as link prediction, knowledge reasoning and knowledge completion. In this paper, we provide a systematic review of existing KGE techniques based on their representation spaces. In particular, we build a fine-grained classification that categorises the models according to three mathematical perspectives on the representation spaces: (1) the algebraic perspective, (2) the geometric perspective, and (3) the analytical perspective. We introduce rigorous definitions of the fundamental mathematical spaces before diving into KGE models and their mathematical properties. We then discuss KGE methods across the three categories and summarise how the advantages of each space serve different embedding needs. By collating experimental results from downstream tasks, we also explore the advantages of different mathematical spaces in different scenarios and the reasons behind them. Finally, we state some promising research directions from a representation space perspective, with which we hope to inspire researchers to design their KGE models, as well as related applications, with greater consideration of the properties of the underlying mathematical space.
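As a small, well-known example of how the representation space shapes a KGE model (chosen for illustration; it is one of many models this kind of survey covers), TransE embeds entities and relations in Euclidean space and scores a triple $(h, r, t)$ by the negative distance between the translated head $h + r$ and the tail $t$; choosing a different space (complex, hyperbolic, and so on) changes which relational patterns the model can capture.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_entities, n_relations = 16, 100, 10
E = rng.normal(size=(n_entities, dim))     # entity embeddings in Euclidean space
R = rng.normal(size=(n_relations, dim))    # relation embeddings (translations)

def transe_score(h, r, t):
    # Higher score = more plausible triple; distance-based scoring in Euclidean space.
    return -np.linalg.norm(E[h] + R[r] - E[t])

# Link prediction: rank all candidate tails for a (head, relation) query.
scores = np.array([transe_score(0, 3, t) for t in range(n_entities)])
print("top-5 predicted tails:", np.argsort(-scores)[:5])
```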