This paper investigates the applicability of a recently proposed nonlinear sparse Bayesian learning (NSBL) algorithm to identify and estimate the complex aerodynamics of limit cycle oscillations. NSBL provides a semi-analytical framework for determining the data-optimal sparse model nested within a (potentially) over-parameterized model. This is particularly relevant to nonlinear dynamical systems where modelling approaches combine physics-based and data-driven components. In such cases, the data-driven components, for which analytical descriptions of the physical processes are not readily available, are often prone to overfitting, meaning that the empirical aspects of these models typically involve the calibration of an unnecessarily large number of parameters. While it may be possible to fit the data well, this becomes an issue when using these models for predictions in regimes different from those in which the data were recorded. In view of this, it is desirable not only to calibrate the model parameters, but also to identify the optimal compromise between data fit and model complexity. In this paper, this is achieved for an aeroelastic system in which the structural dynamics are well known and described by a differential equation model, coupled with a semi-empirical aerodynamic model for laminar separation flutter resulting in low-amplitude limit cycle oscillations. To illustrate the benefit of the algorithm, we use synthetic data to demonstrate its ability to correctly identify the optimal model and model parameters, given a known data-generating model. The synthetic data are generated from a forward simulation of a known differential equation model with parameters selected to mimic the dynamics observed in wind-tunnel experiments.
Shapley values are ubiquitous in interpretable Machine Learning due to their strong theoretical background and efficient implementation in the SHAP library. Computing these values previously induced an exponential cost with respect to the number of input features of an opaque model. Now, with efficient implementations such as Interventional TreeSHAP, this exponential burden is alleviated, assuming one is explaining ensembles of decision trees. Although Interventional TreeSHAP has risen in popularity, it still lacks a formal proof of how and why it works. We provide such a proof with the aim not only of increasing the transparency of the algorithm but also of encouraging further development of these ideas. Notably, our proof for Interventional TreeSHAP is easily adapted to Shapley-Taylor indices and one-hot-encoded features.
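As a concrete illustration of the quantity that TreeSHAP computes efficiently, the sketch below evaluates interventional Shapley values by brute-force coalition enumeration on a toy model. It is exponential in the number of features and is not the TreeSHAP algorithm itself; the toy model, background data, and function names are illustrative assumptions.

```python
from itertools import combinations
from math import factorial

import numpy as np

def interventional_shap(model, x, background, n_features):
    """Brute-force interventional Shapley values: features outside the
    coalition S are imputed from background samples."""
    def value(S):
        # Fix the features in S to x and average the model over the background rows.
        X = np.array(background, dtype=float)
        X[:, list(S)] = np.asarray(x, dtype=float)[list(S)]
        return float(np.mean([model(row) for row in X]))

    phi = np.zeros(n_features)
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            for S in combinations(others, size):
                w = factorial(len(S)) * factorial(n_features - len(S) - 1) / factorial(n_features)
                phi[i] += w * (value(S + (i,)) - value(S))
    return phi

# Toy model: the attributions should sum to f(x) minus the mean background prediction.
f = lambda z: 2.0 * z[0] + z[1] * z[2]
x = np.array([1.0, 2.0, 3.0])
background = np.zeros((4, 3))
print(interventional_shap(f, x, background, 3))  # [2., 3., 3.] for this toy setup
```

The interaction term z[1]*z[2] is split evenly between the two features involved, while the additive term is attributed entirely to the first feature; this is the behavior an efficient tree-based implementation must reproduce.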
Modern power systems will have to face difficult challenges in the years to come: frequent blackouts in urban areas caused by high power demand peaks, grid instability exacerbated by intermittent renewable generation, and global climate change amplified by rising carbon emissions. While current practices are increasingly inadequate, the path to widespread adoption of artificial intelligence (AI) methods is hindered by missing aspects of trustworthiness. The CityLearn Challenge is an exemplary opportunity for researchers from multiple disciplines to investigate the potential of AI to tackle these pressing issues in the energy domain, collectively modeled as a reinforcement learning (RL) task. Multiple real-world challenges faced by contemporary RL techniques are embodied in the problem formulation. In this paper, we present a novel method that uses the solution function of an optimization problem as the policy for computing actions in sequential decision-making, while notably adapting the parameters of the optimization model from online observations. Algorithmically, this is achieved by an evolutionary algorithm under a novel trajectory-based guidance scheme. Formally, the global convergence property is established. Our agent ranked first in the latest 2021 CityLearn Challenge, achieving superior performance in almost all metrics while maintaining some key aspects of interpretability.
Proximal splitting-based convex optimization is a promising approach to linear inverse problems because it allows prior knowledge of the unknown variables to be exploited explicitly. Understanding the behavior of these optimization algorithms is important for tuning their parameters and for developing new algorithms. In this paper, we first analyze the asymptotic property of the proximity operator for the squared loss function, which appears in the update equations of some proximal splitting methods for linear inverse problems. The analysis shows that the output of the proximity operator can be characterized by a scalar random variable in the large system limit. Moreover, we investigate the asymptotic behavior of the Douglas-Rachford algorithm, one of the most widely used proximal splitting methods. From the resulting conjecture, we can predict the evolution of the mean squared error (MSE) of the algorithm for large-scale linear inverse problems. Simulation results demonstrate that the MSE performance of the Douglas-Rachford algorithm can be well predicted by the analytical result in compressed sensing with $\ell_{1}$ optimization.
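For readers unfamiliar with the algorithm analyzed above, the following sketch runs a standard Douglas-Rachford iteration for $\ell_{1}$-regularized least squares on a toy compressed-sensing problem, including the proximity operator of the squared loss. The problem size, step size, and regularization weight are illustrative assumptions, not the settings studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy compressed-sensing setup: y = A x0 + noise with a sparse x0.
n, m, k = 200, 80, 10
A = rng.standard_normal((m, n)) / np.sqrt(m)
x0 = np.zeros(n)
x0[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
y = A @ x0 + 0.01 * rng.standard_normal(m)

lam, gamma = 0.01, 1.0  # regularization weight and step size (illustrative)

def prox_l1(v, t):
    # Soft thresholding: proximity operator of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# Proximity operator of the squared loss g(x) = 0.5 * ||y - A x||^2:
# prox_{gamma g}(v) = (gamma A^T A + I)^{-1} (v + gamma A^T y)
M = np.linalg.inv(gamma * (A.T @ A) + np.eye(n))
def prox_sq(v):
    return M @ (v + gamma * (A.T @ y))

# Douglas-Rachford iteration for f(x) = lam * ||x||_1 and g(x) = 0.5 * ||y - A x||^2
z = np.zeros(n)
for it in range(301):
    x = prox_sq(z)
    z = z + prox_l1(2.0 * x - z, gamma * lam) - x
    if it % 100 == 0:
        print(f"iter {it:3d}  MSE = {np.mean((x - x0) ** 2):.3e}")
```

The printed MSE trajectory is the quantity whose large-system evolution the paper aims to predict analytically.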
Legged systems have many advantages compared to their wheeled counterparts. For example, they can more easily navigate extreme, uneven terrain. However, there are disadvantages as well, particularly the difficulty of modeling the nonlinearities of the system. Research has shown that using flexible components within legged locomotion systems improves performance measures such as efficiency and running velocity. Because of the difficulties encountered in modeling flexible systems, control methods such as reinforcement learning can be used to define control strategies. Furthermore, reinforcement learning can be tasked with learning the mechanical parameters of a system to match a control input. In this work, it is shown that when reinforcement learning is deployed to find design parameters for a pogo-stick jumping system, the designs the agents learn are optimal within the design space provided to them.
The compute-intensive nature of neural networks (NNs) limits their deployment in resource-constrained environments such as cell phones, drones, autonomous robots, etc. Hence, developing robust sparse models fit for safety-critical applications has been an issue of longstanding interest. Although adversarial training has been combined with model sparsification to attain this goal, conventional adversarial training approaches provide no formal guarantee that the models would be robust against any rogue samples in a restricted space around a benign sample. Recently proposed verified local robustness techniques provide such a guarantee. This is the first paper that combines the ideas of verified local robustness and dynamic sparse training to develop `SparseVLR' -- a novel framework for searching for verified locally robust sparse networks. The obtained sparse models exhibit accuracy and robustness comparable to their dense counterparts at sparsity as high as 99%. Furthermore, unlike most conventional sparsification techniques, SparseVLR does not require a pre-trained dense model, reducing the training time by 50%. We exhaustively investigated SparseVLR's efficacy and generalizability by evaluating it on various benchmark and application-specific datasets across several models.
Learning precoding policies with neural networks enables low-complexity online implementation, robustness to channel impairments, and joint optimization with channel acquisition. However, existing neural networks suffer from high training complexity and poor generalization ability when they are used to learn to optimize precoding for mitigating multi-user interference. This impedes their use in practical systems where the number of users is time-varying. In this paper, we propose a graph neural network (GNN) to learn precoding policies by harnessing both the mathematical model and the properties of the policies. We first show that a vanilla GNN cannot learn the pseudo-inverse of the channel matrix well when the numbers of antennas and users are large, and does not generalize to unseen numbers of users. Then, we design a GNN by resorting to the Taylor expansion of the matrix pseudo-inverse, which allows the GNN to capture the importance of the neighboring edges being aggregated, a property that is crucial for learning precoding policies efficiently. Simulation results show that the proposed GNN can learn spectrally efficient and energy-efficient precoding policies in single- and multi-cell multi-user multi-antenna systems with low training complexity, and generalizes well to unseen numbers of users.
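To make the pseudo-inverse discussion concrete, the sketch below compares a zero-forcing precoder built from the exact channel pseudo-inverse with one built from a truncated Neumann (Taylor-like) series expansion of the inverse. The expansion point, channel dimensions, and truncation order are assumptions for illustration, not necessarily the expansion used in the proposed GNN.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy multi-user channel: K single-antenna users, N base-station antennas (N >= K).
K, N = 4, 32
H = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)

# Zero-forcing precoding uses the channel pseudo-inverse: W = H^H (H H^H)^{-1}.
G = H @ H.conj().T
W_exact = H.conj().T @ np.linalg.inv(G)

# Truncated Neumann (Taylor-like) series for G^{-1} expanded around alpha * I:
# G^{-1} = (1/alpha) * sum_{k>=0} (I - G/alpha)^k, convergent if rho(I - G/alpha) < 1.
alpha = 1.2 * np.trace(G).real / K   # heuristic expansion point
T = np.eye(K) - G / alpha
G_inv_approx = np.zeros_like(G)
term = np.eye(K, dtype=complex)
for _ in range(20):                  # truncate after a fixed number of terms
    G_inv_approx += term / alpha
    term = term @ T

W_approx = H.conj().T @ G_inv_approx
rel_err = np.linalg.norm(W_approx - W_exact) / np.linalg.norm(W_exact)
print(f"relative error of series-based precoder: {rel_err:.2e}")
```

Each term of such a series is built from products of the channel matrix, i.e., from weighted aggregations over neighboring edges, which is the structural property the abstract argues a GNN should capture.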
Spatial perception is a key task in several robotics applications. In general, it involves the nonlinear estimation of hidden variables that represent the state of the robot and its environment. However, in the presence of outliers, the standard nonlinear least-squares formulation results in poor estimates. Several methods have been considered in the literature to improve the reliability of the estimation process. Most methods are based on heuristics, since guaranteed global robust estimation is generally impractical due to its high computational cost. Recently, general-purpose robust estimation heuristics have been proposed that leverage existing non-minimal solvers available for the outlier-free formulations without the need for an initial guess. In this work, we propose two similar heuristics backed by Bayesian theory. We evaluate these heuristics in practical scenarios to demonstrate their merits in different applications, including 3D point cloud registration, mesh registration, and pose graph optimization.
Micron-scale robots (ubots) have recently shown great promise for emerging medical applications, and accurate control of ubots is a critical next step to deploying them in real systems. In this work, we develop the idea of a nonlinear mismatch controller to compensate for the mismatch between the disturbed unicycle model of a rolling ubot and trajectory data collected during an experiment. We exploit the differential flatness property of the rolling ubot model to generate a mapping from the desired state trajectory to nominal control actions. Due to model mismatch and parameter estimation error, the nominal control actions will not exactly reproduce the desired state trajectory. We employ a Gaussian Process (GP) to learn the model mismatch as a function of the desired control actions, and correct the nominal control actions using a least-squares optimization. We demonstrate the performance of our online learning algorithm in simulation, where we show that the model mismatch makes some desired states unreachable. Finally, we validate our approach in an experiment and show that the error metrics are reduced by up to 40%.
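A minimal sketch of the mismatch-learning idea described above, assuming a scalar stand-in actuator map rather than the paper's rolling-ubot model: a Gaussian Process is fit to the discrepancy between nominal and achieved actions, and a small least-squares solve then corrects the nominal command. The dynamics, kernel, and helper names are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import least_squares
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(2)

# Hypothetical "true" actuator map with an unmodeled nonlinearity (illustrative assumption).
def true_response(u):
    return u * (1.0 - 0.3 * np.tanh(u))

# 1) Collect data: nominal actions and the observed mismatch (desired minus achieved).
u_train = np.linspace(-2.0, 2.0, 40).reshape(-1, 1)
mismatch = (u_train.ravel() - true_response(u_train.ravel())) + 0.01 * rng.standard_normal(40)

# 2) Learn the mismatch as a function of the nominal action with a Gaussian Process.
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(u_train, mismatch)

# 3) Correct a nominal action: find u such that the predicted achieved response
#    (u minus the predicted mismatch) matches the desired value, via least squares.
def corrected_action(u_desired):
    residual = lambda u: (u - gp.predict(np.atleast_2d(u))) - u_desired
    return least_squares(residual, x0=np.array([u_desired])).x[0]

u_des = 1.0
u_cmd = corrected_action(u_des)
print(f"nominal {u_des:.2f} -> corrected {u_cmd:.2f}, achieved {true_response(u_cmd):.2f}")
```

In this simplified setting the corrected command drives the achieved response close to the desired value; in the paper, the same correction is applied to the nominal actions obtained from the differentially flat trajectory mapping.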
Variational Bayes methods are a scalable estimation approach for many complex state space models. However, existing methods exhibit a trade-off between accurate estimation and computational efficiency. This paper proposes a variational approximation that mitigates this trade-off. This approximation is based on importance densities that have been proposed in the context of efficient importance sampling. By directly conditioning on the observed data, the proposed method produces an accurate approximation to the exact posterior distribution. Because the steps required for its calibration are computationally efficient, the approach is faster than existing variational Bayes methods. The proposed method can be applied to any state space model that has a closed-form measurement density function and a state transition distribution that belongs to the exponential family of distributions. We illustrate the method in numerical experiments with stochastic volatility models and a macroeconomic empirical application using a high-dimensional state space model.
Out-of-distribution (OOD) detection is critical to ensuring the reliability and safety of machine learning systems. For instance, in autonomous driving, we would like the driving system to issue an alert and hand over control to humans when it detects unusual scenes or objects that it has never seen before and cannot make a safe decision about. This problem first emerged in 2017 and has since received increasing attention from the research community, leading to a plethora of methods, ranging from classification-based to density-based to distance-based approaches. Meanwhile, several other problems are closely related to OOD detection in terms of motivation and methodology. These include anomaly detection (AD), novelty detection (ND), open set recognition (OSR), and outlier detection (OD). Despite having different definitions and problem settings, these problems often confuse readers and practitioners, and as a result, some existing studies misuse these terms. In this survey, we first present a generic framework called generalized OOD detection, which encompasses the five aforementioned problems, i.e., AD, ND, OSR, OOD detection, and OD. Under our framework, these five problems can be seen as special cases or sub-tasks and are easier to distinguish. Then, we conduct a thorough review of each of the five areas by summarizing their recent technical developments. We conclude this survey with open challenges and potential research directions.