This work presents two novel approaches for the symplectic model reduction of high-dimensional Hamiltonian systems using data-driven quadratic manifolds. Classical symplectic model reduction approaches employ linear symplectic subspaces for representing the high-dimensional system states in a reduced-dimensional coordinate system. While these approximations respect the symplectic nature of Hamiltonian systems, the linearity of the approximation imposes a fundamental limitation to the accuracy that can be achieved. We propose two different model reduction methods based on recently developed quadratic manifolds, each presenting its own advantages and limitations. The addition of quadratic terms in the state approximation, which sits at the heart of the proposed methodologies, enables us to better represent intrinsic low-dimensionality in the problem at hand. Both approaches are effective for issuing predictions in settings well outside the range of their training data while providing more accurate solutions than the linear symplectic reduced-order models.
In many industrial applications, obtaining labeled observations is not straightforward as it often requires the intervention of human experts or the use of expensive testing equipment. In these circumstances, active learning can be highly beneficial in suggesting the most informative data points to be used when fitting a model. Reducing the number of observations needed for model development alleviates both the computational burden required for training and the operational expenses related to labeling. Online active learning, in particular, is useful in high-volume production processes where the decision about the acquisition of the label for a data point needs to be taken within an extremely short time frame. However, despite the recent efforts to develop online active learning strategies, the behavior of these methods in the presence of outliers has not been thoroughly examined. In this work, we investigate the performance of online active linear regression in contaminated data streams. Our study shows that the currently available query strategies are prone to sample outliers, whose inclusion in the training set eventually degrades the predictive performance of the models. To address this issue, we propose a solution that bounds the search area of a conditional D-optimal algorithm and uses a robust estimator. Our approach strikes a balance between exploring unseen regions of the input space and protecting against outliers. Through numerical simulations, we show that the proposed method is effective in improving the performance of online active learning in the presence of outliers, thus expanding the potential applications of this powerful tool.
Decentralized learning algorithms are an essential tool for designing multi-agent systems, as they enable agents to autonomously learn from their experience and past interactions. In this work, we propose a theoretical and algorithmic framework for real-time identification of the learning dynamics that govern agent behavior using a short burst of a single system trajectory. Our method identifies agent dynamics through polynomial regression, where we compensate for limited data by incorporating side-information constraints that capture fundamental assumptions or expectations about agent behavior. These constraints are enforced computationally using sum-of-squares optimization, leading to a hierarchy of increasingly better approximations of the true agent dynamics. Extensive experiments demonstrated that our approach, using only 5 samples from a short run of a single trajectory, accurately recovers the true dynamics across various benchmarks, including equilibrium selection and prediction of chaotic systems up to 10 Lyapunov times. These findings suggest that our approach has significant potential to support effective policy and decision-making in strategic multi-agent systems.
Autonomous vehicles require accurate and reliable short-term trajectory predictions for safe and efficient driving. While most commercial automated vehicles currently use state machine-based algorithms for trajectory forecasting, recent efforts have focused on end-to-end data-driven systems. Often, the design of these models is limited by the availability of datasets, which are typically restricted to generic scenarios. To address this limitation, we have developed a synthetic dataset for short-term trajectory prediction tasks using the CARLA simulator. This dataset is extensive and incorporates what is considered complex scenarios - pedestrians crossing the road, vehicles overtaking - and comprises 6000 perspective view images with corresponding IMU and odometry information for each frame. Furthermore, an end-to-end short-term trajectory prediction model using convolutional neural networks (CNN) and long short-term memory (LSTM) networks has also been developed. This model can handle corner cases, such as slowing down near zebra crossings and stopping when pedestrians cross the road, without the need for explicit encoding of the surrounding environment. In an effort to accelerate this research and assist others, we are releasing our dataset and model to the research community. Our datasets are publicly available on //github.com/sharmasushil/Navigating-Uncertainty-Trajectory-Prediction .
Uncertainty quantification is an essential task in machine learning - a task in which neural networks (NNs) have traditionally not excelled. This can be a limitation for safety-critical applications, where uncertainty-aware methods like Gaussian processes or Bayesian linear regression are often preferred. Bayesian neural networks are an approach to address this limitation. They assume probability distributions for all parameters and yield distributed predictions. However, training and inference are typically intractable and approximations must be employed. A promising approximation is NNs with Bayesian last layer (BLL). They assume distributed weights only in the last linear layer and yield a normally distributed prediction. NNs with BLL can be seen as a Bayesian linear regression model with learned nonlinear features. To approximate the intractable Bayesian neural network, point estimates of the distributed weights in all but the last layer should be obtained by maximizing the marginal likelihood. This has previously been challenging, as the marginal likelihood is expensive to evaluate in this setting and prohibits direct training through backpropagation. We present a reformulation of the log-marginal likelihood of a NN with BLL which allows for efficient training using backpropagation. Furthermore, we address the challenge of quantifying uncertainty for extrapolation points. We provide a metric to quantify the degree of extrapolation and derive a method to improve the uncertainty quantification for these points. Our methods are derived for the multivariate case and demonstrated in a simulation study, where we compare Bayesian linear regression applied to a previously trained neural network with our proposed algorithm
Achieving fairness in sequential-decision making systems within Human-in-the-Loop (HITL) environments is a critical concern, especially when multiple humans with different behavior and expectations are affected by the same adaptation decisions in the system. This human variability factor adds more complexity since policies deemed fair at one point in time may become discriminatory over time due to variations in human preferences resulting from inter- and intra-human variability. This paper addresses the fairness problem from an equity lens, considering human behavior variability, and the changes in human preferences over time. We propose FAIRO, a novel algorithm for fairness-aware sequential-decision making in HITL adaptation, which incorporates these notions into the decision-making process. In particular, FAIRO decomposes this complex fairness task into adaptive sub-tasks based on individual human preferences through leveraging the Options reinforcement learning framework. We design FAIRO to generalize to three types of HITL application setups that have the shared adaptation decision problem. Furthermore, we recognize that fairness-aware policies can sometimes conflict with the application's utility. To address this challenge, we provide a fairness-utility tradeoff in FAIRO, allowing system designers to balance the objectives of fairness and utility based on specific application requirements. Extensive evaluations of FAIRO on the three HITL applications demonstrate its generalizability and effectiveness in promoting fairness while accounting for human variability. On average, FAIRO can improve fairness compared with other methods across all three applications by 35.36%.
The state of the art related to parameter correlation in two-parameter models has been reviewed in this paper. The apparent contradictions between the different authors regarding the ability of D--optimality to simultaneously reduce the correlation and the area of the confidence ellipse in two-parameter models were analyzed. Two main approaches were found: 1) those who consider that the optimality criteria simultaneously control the precision and correlation of the parameter estimators; and 2) those that consider a combination of criteria to achieve the same objective. An analytical criterion combining in its structure both the optimality of the precision of the estimators of the parameters and the reduction of the correlation between their estimators is provided. The criterion was tested both in a simple linear regression model, considering all possible design spaces, and in a non-linear model with strong correlation of the estimators of the parameters (Michaelis--Menten) to show its performance. This criterion showed a superior behavior to all the strategies and criteria to control at the same time the precision and the correlation.
Providing a model that achieves a strong predictive performance and at the same time is interpretable by humans is one of the most difficult challenges in machine learning research due to the conflicting nature of these two objectives. To address this challenge, we propose a modification of the Radial Basis Function Neural Network model by equipping its Gaussian kernel with a learnable precision matrix. We show that precious information is contained in the spectrum of the precision matrix that can be extracted once the training of the model is completed. In particular, the eigenvectors explain the directions of maximum sensitivity of the model revealing the active subspace and suggesting potential applications for supervised dimensionality reduction. At the same time, the eigenvectors highlight the relationship in terms of absolute variation between the input and the latent variables, thereby allowing us to extract a ranking of the input variables based on their importance to the prediction task enhancing the model interpretability. We conducted numerical experiments for regression, classification, and feature selection tasks, comparing our model against popular machine learning models and the state-of-the-art deep learning-based embedding feature selection techniques. Our results demonstrate that the proposed model does not only yield an attractive prediction performance with respect to the competitors but also provides meaningful and interpretable results that potentially could assist the decision-making process in real-world applications. A PyTorch implementation of the model is available on GitHub at the following link. //github.com/dannyzx/GRBF-NNs
Many computational problems involve optimization over discrete variables with quadratic interactions. Known as discrete quadratic models (DQMs), these problems in general are NP-hard. Accordingly, there is increasing interest in encoding DQMs as quadratic unconstrained binary optimization (QUBO) models to allow their solution by quantum and quantum-inspired hardware with architectures and solution methods designed specifically for such problem types. However, converting DQMs to QUBO models often introduces invalid solutions to the solution space of the QUBO models. These solutions must be penalized by introducing appropriate constraints to the QUBO objective function that are weighted by a tunable penalty parameter to ensure that the global optimum is valid. However, selecting the strength of this parameter is non-trivial, given its influence on solution landscape structure. Here, we investigate the effects of choice of encoding and penalty strength on the structure of QUBO DQM solution landscapes and their optimization, focusing specifically on one-hot and domain-wall encodings.
Robotic manipulation of slender objects is challenging, especially when the induced deformations are large and nonlinear. Traditionally, learning-based control approaches, such as imitation learning, have been used to address deformable material manipulation. These approaches lack generality and often suffer critical failure from a simple switch of material, geometric, and/or environmental (e.g., friction) properties. This article tackles a fundamental but difficult deformable manipulation task: forming a predefined fold in paper with only a single manipulator. A data-driven framework combining physically-accurate simulation and machine learning is used to train a deep neural network capable of predicting the external forces induced on the manipulated paper given a grasp position. We frame the problem using scaling analysis, resulting in a control framework robust against material and geometric changes. Path planning is then carried out over the generated "neural force manifold" to produce robot manipulation trajectories optimized to prevent sliding, with offline trajectory generation finishing 15$\times$ faster than previous physics-based folding methods. The inference speed of the trained model enables the incorporation of real-time visual feedback to achieve closed-loop sensorimotor control. Real-world experiments demonstrate that our framework can greatly improve robotic manipulation performance compared to state-of-the-art folding strategies, even when manipulating paper objects of various materials and shapes.
We analyze the convergence properties of Fermat distances, a family of density-driven metrics defined on Riemannian manifolds with an associated probability measure. Fermat distances may be defined either on discrete samples from the underlying measure, in which case they are random, or in the continuum setting, in which they are induced by geodesics under a density-distorted Riemannian metric. We prove that discrete, sample-based Fermat distances converge to their continuum analogues in small neighborhoods with a precise rate that depends on the intrinsic dimensionality of the data and the parameter governing the extent of density weighting in Fermat distances. This is done by leveraging novel geometric and statistical arguments in percolation theory that allow for non-uniform densities and curved domains. Our results are then used to prove that discrete graph Laplacians based on discrete, sample-driven Fermat distances converge to corresponding continuum operators. In particular, we show the discrete eigenvalues and eigenvectors converge to their continuum analogues at a dimension-dependent rate, which allows us to interpret the efficacy of discrete spectral clustering using Fermat distances in terms of the resulting continuum limit. The perspective afforded by our discrete-to-continuum Fermat distance analysis leads to new clustering algorithms for data and related insights into efficient computations associated to density-driven spectral clustering. Our theoretical analysis is supported with numerical simulations and experiments on synthetic and real image data.