In this paper, the problem of state estimation, in the context of both filtering and smoothing, for nonlinear state-space models is considered. Due to the nonlinear nature of the models, the state estimation problem is generally intractable as it involves integrals of general nonlinear functions and the filtered and smoothed state distributions lack closed-form solutions. As such, it is common to approximate the state estimation problem. We develop an assumed Gaussian solution based on variational inference, which offers the key advantage of a flexible, but principled, mechanism for approximating the required distributions. Our main contribution lies in a new formulation of the state estimation problem as an optimisation problem, which can then be solved using standard optimisation routines that employ exact first- and second-order derivatives. The resulting state estimation approach involves a minimal number of assumptions and applies directly to nonlinear systems with both Gaussian and non-Gaussian probabilistic models. The performance of our approach is demonstrated on several examples: a challenging scalar system, a model of a simple robotic system, and a target tracking problem using a von Mises-Fisher distribution. In these examples, it outperforms alternative assumed Gaussian approaches to state estimation.
In statistical learning and analysis from shared data, which is increasingly widely adopted in platforms such as federated learning and meta-learning, there are two major concerns: privacy and robustness. Each participating individual should be able to contribute without fear of leaking their sensitive information. At the same time, the system should be robust in the presence of malicious participants inserting corrupted data. Recent algorithmic advances in learning from shared data focus on only one of these threats, leaving the system vulnerable to the other. We bridge this gap for the canonical problem of estimating the mean from i.i.d. samples. We introduce PRIME, which is the first efficient algorithm that achieves both privacy and robustness for a wide range of distributions. We further complement this result with a novel exponential time algorithm that improves the sample complexity of PRIME, achieving a near-optimal guarantee and matching a known lower bound for (non-robust) private mean estimation. This proves that there is no extra statistical cost to simultaneously guaranteeing privacy and robustness.
This paper considers a novel multi-agent linear stochastic approximation algorithm driven by Markovian noise and general consensus-type interaction, in which each agent evolves according to its local stochastic approximation process which depends on the information from its neighbors. The interconnection structure among the agents is described by a time-varying directed graph. While the convergence of consensus-based stochastic approximation algorithms when the interconnection among the agents is described by doubly stochastic matrices (at least in expectation) has been studied, less is known about the case when the interconnection matrix is simply stochastic. For any uniformly strongly connected graph sequences whose associated interaction matrices are stochastic, the paper derives finite-time bounds on the mean-square error, defined as the deviation of the output of the algorithm from the unique equilibrium point of the associated ordinary differential equation. For the case of interconnection matrices being stochastic, the equilibrium point can be any unspecified convex combination of the local equilibria of all the agents in the absence of communication. Both the cases with constant and time-varying step-sizes are considered. In the case when the convex combination is required to be a straight average and interaction between any pair of neighboring agents may be uni-directional, so that doubly stochastic matrices cannot be implemented in a distributed manner, the paper proposes a push-sum-type distributed stochastic approximation algorithm and provides its finite-time bound for the time-varying step-size case by leveraging the analysis for the consensus-type algorithm with stochastic matrices and developing novel properties of the push-sum algorithm.
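As a rough illustration of the consensus-type update described above, the following Python sketch runs three scalar agents over a fixed row-stochastic (but not doubly stochastic) matrix. Everything concrete here is an assumption made for illustration and is not taken from the paper: the update form `theta_i <- sum_j W[i][j] * theta_j + alpha_t * (-theta_i + b_i + noise)`, the matrix `W`, the offsets `b`, the two-state Markov chain used as noise, and the step size `1 / (t + 10)`. The graph is also held fixed, whereas the paper allows time-varying graphs.

```python
import random


def consensus_sa(W, b, steps=20000, sigma=0.1, seed=0):
    """Toy consensus-type linear stochastic approximation (illustrative only).

    Each agent mixes its neighbours' iterates with the row-stochastic W and
    then takes a local step towards its own equilibrium b[i].
    """
    rng = random.Random(seed)
    n = len(b)
    theta = [0.0] * n
    x = 1  # shared two-state Markov chain in {-1, +1} (assumed noise model)
    for t in range(steps):
        alpha = 1.0 / (t + 10)   # time-varying step size (assumed schedule)
        if rng.random() < 0.1:   # the chain flips state with probability 0.1
            x = -x
        mixed = [sum(W[i][j] * theta[j] for j in range(n)) for i in range(n)]
        theta = [mixed[i] + alpha * (-theta[i] + b[i] + sigma * x)
                 for i in range(n)]
    return theta


def left_perron_vector(W, iters=500):
    """Left eigenvector of W for eigenvalue 1, found by power iteration."""
    n = len(W)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * W[i][j] for i in range(n)) for j in range(n)]
    return pi


# Rows sum to 1, columns do not: stochastic but not doubly stochastic.
W = [[0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5],
     [0.5, 0.25, 0.25]]
b = [1.0, 2.0, 6.0]  # local equilibria of the three agents

theta = consensus_sa(W, b)
pi = left_perron_vector(W)
target = sum(p * bi for p, bi in zip(pi, b))  # convex combination of the b[i]
print(theta, target)
```

In this toy setting each agent's local equilibrium is `b[i]`, and the agents settle near the convex combination weighted by the left Perron vector of `W` rather than near the straight average of `b`.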
In this paper, novel simulation methods are provided for the generalised inverse Gaussian (GIG) L\'{e}vy process. Such processes are intractable for simulation except in certain special edge cases, since the L\'{e}vy density associated with the GIG process is expressed as an integral involving certain Bessel functions, known as the Jaeger integral in diffusive transport applications. We here show for the first time how to solve the problem indirectly, using generalised shot-noise methods to simulate the underlying point processes and constructing an auxiliary variables approach that avoids any direct calculation of the integrals involved. The resulting augmented bivariate process is still intractable and so we propose a novel thinning method based on upper bounds on the intractable integrand. Moreover, our approach leads to lower and upper bounds on the Jaeger integral itself, which may be compared with other approximation methods. The shot-noise method involves a truncated infinite series of decreasing random variables, and as such is approximate, although the series are found to be rapidly convergent in most cases. We note that the GIG process is the required Brownian motion subordinator for the generalised hyperbolic (GH) L\'{e}vy process and so our simulation approach will straightforwardly extend also to the simulation of these intractable processes. Our new methods will find application in forward simulation of processes of GIG and GH type, in financial and engineering data, for example, as well as inference for states and parameters of stochastic processes driven by GIG and GH L\'{e}vy processes.
This thesis is mainly concerned with state-space approaches for solving deep (temporal) Gaussian process (DGP) regression problems. More specifically, we represent DGPs as hierarchically composed systems of stochastic differential equations (SDEs), and we consequently solve the DGP regression problem by using state-space filtering and smoothing methods. The resulting state-space DGP (SS-DGP) models generate a rich class of priors compatible with modelling a number of irregular signals/functions. Moreover, due to their Markovian structure, SS-DGP regression problems can be solved efficiently by using Bayesian filtering and smoothing methods. The second contribution of this thesis is that we solve continuous-discrete Gaussian filtering and smoothing problems by using the Taylor moment expansion (TME) method. This induces a class of filters and smoothers that can be asymptotically exact in predicting the mean and covariance of SDE solutions. Moreover, the TME method and TME filters and smoothers are compatible with simulating SS-DGPs and solving their regression problems. Lastly, this thesis features a number of applications of state-space (deep) GPs. These applications mainly include (i) estimation of unknown drift functions of SDEs from partially observed trajectories and (ii) estimation of spectro-temporal features of signals.
We study the Bayesian inverse problem for inferring the log-normal slowness function of the eikonal equation given noisy observation data on its solution at a set of spatial points. We study approximation of the posterior probability measure by solving the truncated eikonal equation, which contains only a finite number of terms in the Karhunen-Loeve expansion of the slowness function, by the Fast Marching Method. The error of this approximation in the Hellinger metric is deduced in terms of the truncation level of the slowness and the grid size in the Fast Marching Method resolution. It is well known that the plain Markov Chain Monte Carlo procedure for sampling the posterior probability is highly expensive. We develop and justify the convergence of a Multilevel Markov Chain Monte Carlo method. Using the heap sort procedure in solving the forward eikonal equation by the Fast Marching Method, our Multilevel Markov Chain Monte Carlo method achieves a prescribed level of accuracy for approximating the posterior expectation of quantities of interest, requiring only an essentially optimal level of complexity. Numerical examples confirm the theoretical results.
Sufficient conditions are provided under which the log-likelihood ratio test statistic fails to have a limiting chi-squared distribution under the null hypothesis when testing between one and two components under a general two-component mixture model, but rather tends to infinity in probability. These conditions are verified when the component densities describe continuous-time, discrete-state-space Markov chains, and the results are illustrated via a parametric bootstrap simulation on an analysis of the migrations over time of a set of corporate bond ratings. The precise limiting distribution is derived in a simple case with two states, one of which is absorbing, which leads to a right-censored exponential scale mixture model. In that case, when centred by a function growing logarithmically in the sample size, the statistic has a limiting distribution of Gumbel extreme-value type rather than chi-squared.
Regression models are used in a wide range of applications providing a powerful scientific tool for researchers from different fields. Linear, or simple parametric, models are often not sufficient to describe complex relationships between input variables and a response. Such relationships can be better described through flexible approaches such as neural networks, but this results in less interpretable models and potential overfitting. Alternatively, specific parametric nonlinear functions can be used, but the specification of such functions is in general complicated. In this paper, we introduce a flexible approach for the construction and selection of highly flexible nonlinear parametric regression models. Nonlinear features are generated hierarchically, similarly to deep learning, but have additional flexibility on the possible types of features to be considered. This flexibility, combined with variable selection, allows us to find a small set of important features and thereby more interpretable models. Within the space of possible functions, a Bayesian approach, introducing priors for functions based on their complexity, is considered. A genetically modified mode jumping Markov chain Monte Carlo algorithm is adopted to perform Bayesian inference and estimate posterior probabilities for model averaging. In various applications, we illustrate how our approach is used to obtain meaningful nonlinear models. Additionally, we compare its predictive performance with several machine learning algorithms.
Traditional quantile estimators that are based on one or two order statistics are a common way to estimate distribution quantiles from a given sample. These estimators are robust, but their statistical efficiency is not always good enough. A more efficient alternative is the Harrell-Davis quantile estimator, which uses a weighted sum of all order statistics. Whereas this approach provides more accurate estimates for light-tailed distributions, it is not robust. To be able to customize the trade-off between statistical efficiency and robustness, we could consider a trimmed modification of the Harrell-Davis quantile estimator. In this approach, we discard order statistics with low weights according to the highest density interval of the beta distribution.
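The trimming idea can be sketched in a few lines of standard-library Python. The Harrell-Davis weights are increments of the Beta((n+1)p, (n+1)(1-p)) CDF at the points i/n. The rest is an assumption made for illustration: the trimming rule used here (renormalise the CDF inside a fixed-width highest density interval and give zero weight outside it), the width `1 / sqrt(n)` in the usage lines, the midpoint-rule approximation of the beta CDF, and all function names. The sketch is intended for small samples only.

```python
import math


def beta_cdf_grid(a, b, m):
    """Approximate the Beta(a, b) CDF at k/m, k = 0..m, with the midpoint rule."""
    cum = [0.0]
    for k in range(m):
        t = (k + 0.5) / m
        cum.append(cum[-1] + t ** (a - 1) * (1 - t) ** (b - 1))
    total = cum[-1]
    return [c / total for c in cum]


def hd_weights(n, p, width=None, per_cell=200):
    """Harrell-Davis weights; if width is given, trim to a beta HDI of that width."""
    a, b = p * (n + 1), (1 - p) * (n + 1)
    m = n * per_cell  # grid chosen so that every i/n is a grid point
    cdf = beta_cdf_grid(a, b, m)
    if width is not None:
        w = max(1, min(m, round(width * m)))  # interval width in grid cells
        # The highest density interval of fixed width is the window
        # that carries the most probability mass.
        lo = max(range(m - w + 1), key=lambda k: cdf[k + w] - cdf[k])
        base, mass = cdf[lo], cdf[lo + w] - cdf[lo]
        # Truncated CDF: 0 left of the interval, 1 right of it.
        cdf = [min(max((c - base) / mass, 0.0), 1.0) for c in cdf]
    return [cdf[i * per_cell] - cdf[(i - 1) * per_cell] for i in range(1, n + 1)]


def hd_quantile(xs, p, width=None):
    """Weighted sum of order statistics with (optionally trimmed) HD weights."""
    xs = sorted(xs)
    return sum(w * x for w, x in zip(hd_weights(len(xs), p, width), xs))


sample = [1, 2, 3, 4, 5, 6, 7, 8, 1000]  # one gross outlier
classic = hd_quantile(sample, 0.5)
trimmed = hd_quantile(sample, 0.5, width=1 / math.sqrt(len(sample)))
print(classic, trimmed)
```

With the outlier present, the classic estimate of the median is pulled upwards because every order statistic carries some weight, while the trimmed version assigns zero weight to the extreme order statistics and stays close to the middle of the sample.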
We present R-LINS, a lightweight robocentric lidar-inertial state estimator, which estimates robot ego-motion using a 6-axis IMU and a 3D lidar in a tightly-coupled scheme. To achieve robustness and computational efficiency even in challenging environments, an iterated error-state Kalman filter (ESKF) is designed, which recursively corrects the state by repeatedly generating new corresponding feature pairs. Moreover, a novel robocentric formulation is adopted in which we reformulate the state estimator with respect to a moving local frame, rather than a fixed global frame as in standard world-centric lidar-inertial odometry (LIO), in order to prevent filter divergence and lower computational cost. To validate generalizability and long-time practicability, extensive experiments are performed in indoor and outdoor scenarios. The results indicate that R-LINS outperforms lidar-only and loosely-coupled algorithms, and achieves performance competitive with state-of-the-art LIO with close to an order-of-magnitude improvement in speed.
We develop an approach to risk minimization and stochastic optimization that provides a convex surrogate for variance, allowing near-optimal and computationally efficient trading between approximation and estimation error. Our approach builds off of techniques for distributionally robust optimization and Owen's empirical likelihood, and we provide a number of finite-sample and asymptotic results characterizing the theoretical performance of the estimator. In particular, we show that our procedure comes with certificates of optimality, achieving (in some scenarios) faster rates of convergence than empirical risk minimization by virtue of automatically balancing bias and variance. We give corroborating empirical evidence showing that in practice, the estimator indeed trades between variance and absolute performance on a training sample, improving out-of-sample (test) performance over standard empirical risk minimization for a number of classification problems.
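To make the link between distributionally robust optimization and a variance penalty concrete, here is a small numerical sketch in Python. It assumes a chi-square divergence ball of radius `rho / n` around the empirical distribution. This choice, the function name, and the sample losses are illustrative assumptions and are not stated in the abstract. When the worst-case weights stay nonnegative, the worst-case reweighted mean equals the empirical mean plus `sqrt(2 * rho * Var / n)`, which follows from the Cauchy-Schwarz inequality.

```python
import math


def chi_square_robust_risk(losses, rho):
    """Worst-case mean of `losses` over a chi-square ball of radius rho / n.

    Returns (robust value, mean + sqrt(2 * rho * var / n), worst-case weights).
    Valid only while all worst-case weights stay nonnegative (small rho)
    and the empirical variance is positive.
    """
    n = len(losses)
    mean = sum(losses) / n
    var = sum((l - mean) ** 2 for l in losses) / n  # empirical variance
    scale = math.sqrt(2 * rho / (n * var))
    # Tilt the uniform weights in the direction of the centred losses.
    weights = [(1 + scale * (l - mean)) / n for l in losses]
    if min(weights) < 0:
        raise ValueError("rho too large for the closed form used in this sketch")
    robust = sum(w * l for w, l in zip(weights, losses))
    surrogate = mean + math.sqrt(2 * rho * var / n)
    return robust, surrogate, weights


losses = [0.1, 0.4, 0.35, 0.8, 0.25]  # hypothetical per-example losses
robust, surrogate, weights = chi_square_robust_risk(losses, rho=0.5)
print(robust, surrogate)
```

The two returned values agree up to floating-point error in this regime, which is the sense in which a robust objective of this kind can stand in for a mean-plus-standard-deviation penalty. As a supremum of functions that are linear in the losses, the robust objective is convex in the model parameters whenever each loss is.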