Any experiment with climate models relies on a potentially large set of spatio-temporal boundary conditions. These can represent the initial state of the system, the forcings driving the model output throughout the experiment, or both. Whilst these boundary conditions are typically fixed using available reconstructions in climate modelling studies, they are highly uncertain, that uncertainty is unquantified, and the effect on the output of the experiment can be considerable. We develop efficient quantification of these uncertainties that combines relevant data from multiple models and observations. Starting from the coexchangeability model, we develop a coexchangeable process model to capture multiple correlated spatio-temporal fields of variables. We demonstrate that further exchangeability judgements over the parameters within this representation lead to a Bayes linear analogue of a hierarchical model. We use the framework to provide a joint reconstruction of sea-surface temperature and sea-ice concentration boundary conditions at the last glacial maximum (19-23 ka) and use it to force an ensemble of ice-sheet simulations using the FAMOUS-Ice coupled atmosphere and ice-sheet model. We show that existing boundary conditions typically used in these experiments are implausible given our uncertainties, and we demonstrate the impact of using more plausible boundary conditions on ice-sheet simulation.
This paper considers the problem of measure estimation under the barycentric coding model (BCM), in which an unknown measure is assumed to belong to the set of Wasserstein-2 barycenters of a finite set of known measures. Estimating a measure under this model is equivalent to estimating the unknown barycentric coordinates. We provide novel geometrical, statistical, and computational insights for measure estimation under the BCM, consisting of three main results. Our first main result leverages the Riemannian geometry of Wasserstein-2 space to provide a procedure for recovering the barycentric coordinates as the solution to a quadratic optimization problem assuming access to the true reference measures. The essential geometric insight is that the parameters of this quadratic problem are determined by inner products between the optimal displacement maps from the given measure to the reference measures defining the BCM. Our second main result then establishes an algorithm for solving for the coordinates in the BCM when all the measures are observed empirically via i.i.d. samples. We prove precise rates of convergence for this algorithm -- determined by the smoothness of the underlying measures and their dimensionality -- thereby guaranteeing its statistical consistency. Finally, we demonstrate the utility of the BCM and associated estimation procedures in three application areas: (i) covariance estimation for Gaussian measures; (ii) image processing; and (iii) natural language processing.
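The quadratic problem described above can be illustrated in the simplest possible setting. The following is a minimal sketch, not the paper's algorithm: it assumes one-dimensional Gaussian measures (where the optimal transport map is affine and the barycenter has the weighted mean and weighted standard deviation) and only two reference measures, so the problem has a closed-form solution. The function names and the example measures are hypothetical.

```python
def gaussian_displacement_gram(target, refs):
    """Gram matrix A[i][j] = E[(T_i(x) - x) * (T_j(x) - x)], x ~ target.

    target and refs are (mean, std) pairs of 1D Gaussians. T_i is the
    optimal transport map from the target to reference i; in 1D it is
    T_i(x) = m_i + (s_i / s_0) * (x - m_0), which gives the closed form below.
    """
    m0, s0 = target
    return [[(mi - m0) * (mj - m0) + (si - s0) * (sj - s0)
             for (mj, sj) in refs]
            for (mi, si) in refs]


def two_reference_coordinate(A):
    """Minimise l^T A l over l = (t, 1 - t) with 0 <= t <= 1."""
    denom = A[0][0] - 2.0 * A[0][1] + A[1][1]
    if denom == 0.0:
        raise ValueError("the two reference measures coincide")
    t = (A[1][1] - A[0][1]) / denom
    # Clipping the unconstrained minimiser is exact for a 1D convex quadratic.
    return min(1.0, max(0.0, t))


# Hypothetical reference measures, given as (mean, std).
refs = [(0.0, 1.0), (4.0, 3.0)]
true_t = 0.25
# Build the barycenter with coordinates (true_t, 1 - true_t).
target = (true_t * refs[0][0] + (1 - true_t) * refs[1][0],
          true_t * refs[0][1] + (1 - true_t) * refs[1][1])

A = gaussian_displacement_gram(target, refs)
t_hat = two_reference_coordinate(A)
print(t_hat)
```

With more than two reference measures the same Gram matrix would be passed to a quadratic-programming solver over the probability simplex; that step is omitted here.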
A key challenge for robotic systems is to figure out the behavior of another agent. The capability to draw correct inferences is crucial to derive human behavior from examples. Drawing correct inferences is especially challenging when (confounding) factors are not controlled experimentally (observational evidence). For this reason, robots that rely on correlational inferences risk a biased interpretation of the evidence. We propose equipping robots with the necessary tools to conduct observational studies on people. Specifically, we propose and explore the feasibility of structural causal models with non-parametric estimators to derive empirical estimates of hand behavior in the context of object manipulation in a virtual kitchen scenario. In particular, we focus on inferences under (the weaker) conditions of partial confounding (the model covering only some factors) and confront estimators with hundreds of samples instead of the typical order of thousands. Studying these conditions explores the boundaries of the approach and its viability. Despite the challenging conditions, the estimates inferred from the validation data are correct. Moreover, these estimates are stable against three refutation strategies where four estimators are in agreement. Furthermore, the causal quantity for two individuals reveals the sensitivity of the approach to detect positive and negative effects. The validity, stability and explainability of the approach are encouraging and serve as the foundation for further research.
Monitoring machine learning models once they are deployed is challenging. It is even more challenging to decide when to retrain models in real-case scenarios when labeled data is beyond reach, and monitoring performance metrics becomes infeasible. In this work, we use non-parametric bootstrapped uncertainty estimates and SHAP values to provide explainable uncertainty estimation as a technique that aims to monitor the deterioration of machine learning models in deployment environments, as well as determine the source of model deterioration when target labels are not available. Classical methods are purely aimed at detecting distribution shift, which can lead to false positives in the sense that the model has not deteriorated despite a shift in the data distribution. To estimate model uncertainty we construct prediction intervals using a novel bootstrap method, which improves upon the work of Kumar & Srivastava (2012). We show that both our model deterioration detection system and our uncertainty estimation method achieve better performance than the current state-of-the-art. Finally, we use explainable AI techniques to gain an understanding of the drivers of model deterioration. We release an open source Python package, doubt, which implements our proposed methods, as well as the code used to reproduce our experiments.
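To make the idea of a bootstrapped prediction interval concrete, the following is a minimal sketch of a generic percentile bootstrap for a least-squares line. It is an assumption-laden illustration only: it is neither the novel bootstrap method proposed in the paper nor the API of the doubt package, and the function names and synthetic data are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def bootstrap_interval(xs, ys, x_new, n_boot=500, alpha=0.1, seed=0):
    """Percentile prediction interval at x_new from resampled (x, y) pairs."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if max(bx) == min(bx):
            continue  # degenerate resample: the slope is undefined
        intercept, slope = fit_line(bx, by)
        # Add one residual of the refitted model to mimic observation noise.
        j = rng.randrange(n)
        residual = by[j] - (intercept + slope * bx[j])
        preds.append(intercept + slope * x_new + residual)
    preds.sort()
    lower = preds[int((alpha / 2) * len(preds))]
    upper = preds[int((1 - alpha / 2) * len(preds)) - 1]
    return lower, upper


# Synthetic data (an assumption for illustration): y = 1 + 2x + unit noise.
data_rng = random.Random(1)
xs = [float(i) for i in range(40)]
ys = [1.0 + 2.0 * x + data_rng.gauss(0.0, 1.0) for x in xs]
lo, hi = bootstrap_interval(xs, ys, x_new=20.0)
print(lo, hi)
```

In a monitoring setting, the width of such an interval on unlabeled deployment data could serve as a label-free uncertainty signal, which is the role uncertainty estimates play in the abstract above.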
In this thesis a probabilistic framework is developed and proposed for Dynamic Object Recognition in 3D Environments. A software package is developed using C++ and Python in ROS that performs the detection and tracking task. Furthermore, a novel Gaussian Process Regression (GPR) based method is developed to detect ground points in regular, sloped and rough urban scenarios. The ground surface behavior is assumed to only demonstrate local input-dependent smoothness. The kernel's length-scales are obtained. Bayesian inference is implemented using the \textit{Maximum a Posteriori} criterion. The log-marginal likelihood function is assumed to be a multi-task objective function, to represent a whole-frame unbiased view of the ground at each frame, because adjacent segments may not have similar ground structure in an uneven scene while having shared hyper-parameter values. Simulation results show the effectiveness of the proposed method in uneven and rough scenes, where it outperforms similar Gaussian process based ground segmentation methods.
We consider a network of agents. Associated with each agent are her covariate and outcome. Agents influence each other's outcomes according to a certain connection/influence structure. A subset of the agents participate on a platform, and hence, are observable to it. The rest are not observable to the platform and are called the latent agents. The platform does not know the influence structure of the observable or the latent parts of the network. It only observes the data on past covariates and decisions of the observable agents. Observable agents influence each other both directly and indirectly through the influence they exert on the latent agents. We investigate how the platform can estimate the dependence of the observable agents' outcomes on their covariates, taking the latent agents into account. First, we show that this relationship can be succinctly captured by a matrix and provide an algorithm for estimating it under a suitable approximate sparsity condition using historical data of covariates and outcomes for the observable agents. We also obtain convergence rates for the proposed estimator despite the high dimensionality that allows more agents than observations. Second, we show that the approximate sparsity condition holds under the standard conditions used in the literature. Hence, our results apply to a large class of networks. Finally, we apply our results to two practical settings: targeted advertising and promotional pricing. We show that by using the available historical data with our estimator, it is possible to obtain asymptotically optimal advertising/pricing decisions, despite the presence of latent agents.
We introduce a novel, probabilistic binary latent variable model to detect noisy or approximate repeats of patterns in sparse binary data. The model is based on the "Noisy-OR model" (Heckerman, 1990), used previously for disease and topic modelling. The model's capability is demonstrated by extracting structure in recordings from retinal neurons, but it can be widely applied to discover and model latent structure in noisy binary data. In the context of spiking neural data, the task is to "explain" spikes of individual neurons in terms of groups of neurons, "Cell Assemblies" (CAs), that often fire together, due to mutual interactions or other causes. The model infers sparse activity in a set of binary latent variables, each describing the activity of a cell assembly. When the latent variable of a cell assembly is active, it reduces the probability that neurons belonging to this assembly are inactive. The conditional probability kernels of the latent components are learned from the data in an expectation maximization scheme, involving inference of latent states and parameter adjustments to the model. We thoroughly validate the model on synthesized spike trains constructed to statistically resemble recorded retinal responses to white-noise and natural-movie stimuli. We also apply our model to spiking responses recorded in retinal ganglion cells (RGCs) during stimulation with a movie and discuss the structure found.
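The way an active latent variable lowers a neuron's probability of being inactive can be sketched with the standard noisy-OR conditional. This is a minimal illustration under the textbook parameterisation; the exact parameterisation used in the paper is not given in the abstract, and the names `weights` and `leak` are hypothetical.

```python
def noisy_or_active_prob(z, weights, leak=0.0):
    """P(neuron spikes | latent states z) under a standard noisy-OR.

    z       : binary latent states, one per cell assembly
    weights : weights[j] is the probability that assembly j alone
              activates the neuron when z[j] == 1
    leak    : probability that the neuron spikes with no assembly active
    """
    p_inactive = 1.0 - leak
    for z_j, w_j in zip(z, weights):
        if z_j:
            # Each active assembly multiplies down the inactive probability.
            p_inactive *= 1.0 - w_j
    return 1.0 - p_inactive


weights = [0.8, 0.5]  # hypothetical membership strengths
p_none = noisy_or_active_prob([0, 0], weights, leak=0.01)
p_one = noisy_or_active_prob([1, 0], weights, leak=0.01)
p_both = noisy_or_active_prob([1, 1], weights, leak=0.01)
print(p_none, p_one, p_both)
```

Because the inactive probability is a product over active assemblies, adding an active assembly can only increase the spiking probability, which is the "OR" behaviour the model name refers to.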
Gaussian processes are among the most useful tools in modeling continuous processes in machine learning and statistics. If the value of a process is known at a finite collection of points, one may use Gaussian processes to construct a surface which interpolates these values to be used for prediction and uncertainty quantification in other locations. However, it is not always the case that the available information is in the form of a finite collection of points. For example, boundary value problems contain information on the boundary of a domain, which is an uncountable collection of points that cannot be incorporated into typical Gaussian process techniques. In this paper we construct a Gaussian process model which utilizes reproducing kernel Hilbert spaces to unify the typical finite case with the case of having uncountable information by exploiting the equivalence of conditional expectation and orthogonal projections. We discuss this construction in statistical models, including numerical considerations and a proof of concept.
The $k$-center problem is to choose a subset of size $k$ from a set of $n$ points such that the maximum distance from each point to its nearest center is minimized. Let $Q=\{Q_1,\ldots,Q_n\}$ be a set of polygons or segments in the region-based uncertainty model, in which each $Q_i$ is an uncertain point, where the exact locations of the points in $Q_i$ are unknown. The geometric objects segments and polygons can be models of a point set. We define the uncertain version of the $k$-center problem as a generalization in which the objective is to find $k$ points from $Q$ to cover the remaining regions of $Q$ with minimum or maximum radius of the cluster to cover at least one or all exact instances of each $Q_i$, respectively. We modify the region-based model to allow multiple points to be chosen from a region and call the resulting model the aggregated uncertainty model. All these problems contain the point version as a special case, so they are all NP-hard with a lower bound of 1.822. We give approximation algorithms for uncertain $k$-center of a set of segments and polygons. We have also implemented some of our algorithms on a dataset to show that our theoretical performance guarantees can be achieved in practice.
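For reference, the point version of the problem stated at the start of this abstract admits a classical greedy 2-approximation (farthest-first traversal, due to Gonzalez). The sketch below shows that standard algorithm only; it is not one of the paper's algorithms for uncertain segments or polygons, and the example points are hypothetical.

```python
import math


def greedy_k_center(points, k):
    """Farthest-first traversal: a 2-approximation for point k-center.

    Assumes 1 <= k <= len(points). Returns the chosen centers and the
    covering radius (the largest distance from a point to its nearest center).
    """
    centers = [points[0]]  # an arbitrary first center
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        # The next center is the point farthest from all current centers.
        i = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[i])
        dist = [min(d, math.dist(p, points[i])) for d, p in zip(dist, points)]
    return centers, max(dist)


# Two well-separated clusters of three points each.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
centers, radius = greedy_k_center(points, k=2)
print(centers, radius)
```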
We present a hybrid sampling-surrogate approach for reducing the computational expense of uncertainty quantification in nonlinear dynamical systems. Our motivation is to enable rapid uncertainty quantification in complex mechanical systems such as automotive propulsion systems. Our approach is to build upon ideas from multifidelity uncertainty quantification to leverage the benefits of both sampling and surrogate modeling, while mitigating their downsides. In particular, the surrogate model is selected to exploit problem structure, such as smoothness, and offers a highly correlated information source to the original nonlinear dynamical system. We utilize an intrusive generalized Polynomial Chaos surrogate because it avoids any statistical errors in its construction and provides analytic estimates of output statistics. We then leverage a Monte Carlo-based Control Variate technique to correct the bias caused by the surrogate approximation error. The primary theoretical contribution of this work is the analysis and solution of an estimator design strategy that optimally balances the computational effort needed to adapt a surrogate compared with sampling the original expensive nonlinear system. While previous works have similarly combined surrogates and sampling, to the best of our knowledge this work is the first to provide rigorous analysis of estimator design. We deploy our approach on multiple examples stemming from the simulation of mechanical automotive propulsion system models. We show that, in some cases, the estimator achieves orders of magnitude reduction in the mean squared error of statistics estimation at costs comparable to those of purely sampling or purely surrogate approaches.
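The bias-correction step can be illustrated with a toy control-variate estimator. This is a minimal sketch under strong assumptions: a scalar function `exp(x)` stands in for the expensive system, a second-order Taylor polynomial stands in for the polynomial chaos surrogate, the control-variate weight is fixed at 1 rather than optimised, and all names are hypothetical. It does not implement the estimator design analysed in the paper.

```python
import math
import random


def expensive_model(x):
    """Stand-in for the original nonlinear system."""
    return math.exp(x)


def surrogate(x):
    """Cheap polynomial surrogate whose mean is known analytically."""
    return 1.0 + x + 0.5 * x * x


# E[surrogate(X)] for X ~ Uniform(0, 1): 1 + 1/2 + 1/6.
SURROGATE_MEAN = 5.0 / 3.0


def control_variate_estimate(n, rng):
    """Analytic surrogate mean plus a Monte Carlo estimate of the discrepancy."""
    total = 0.0
    for _ in range(n):
        x = rng.random()
        total += expensive_model(x) - surrogate(x)
    return SURROGATE_MEAN + total / n


rng = random.Random(0)
cv_estimate = control_variate_estimate(2000, rng)
exact_mean = math.e - 1.0  # E[exp(X)] for X ~ Uniform(0, 1)
print(cv_estimate, exact_mean)
```

The surrogate mean alone is biased (5/3 versus e - 1), while the sampled discrepancy has much smaller variance than the model output itself, so the combined estimator is unbiased and needs far fewer expensive evaluations for a given accuracy.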
Despite the state-of-the-art performance for medical image segmentation, deep convolutional neural networks (CNNs) have rarely provided uncertainty estimations regarding their segmentation outputs, e.g., model (epistemic) and image-based (aleatoric) uncertainties. In this work, we analyze these different types of uncertainties for CNN-based 2D and 3D medical image segmentation tasks. We additionally propose a test-time augmentation-based aleatoric uncertainty to analyze the effect of different transformations of the input image on the segmentation output. Test-time augmentation has been previously used to improve segmentation accuracy, but has not yet been formulated in a consistent mathematical framework. Hence, we also propose a theoretical formulation of test-time augmentation, where a distribution of the prediction is estimated by Monte Carlo simulation with prior distributions of parameters in an image acquisition model that involves image transformations and noise. We compare and combine our proposed aleatoric uncertainty with model uncertainty. Experiments with segmentation of fetal brains and brain tumors from 2D and 3D Magnetic Resonance Images (MRI) showed that 1) the test-time augmentation-based aleatoric uncertainty provides a better uncertainty estimation than calculating the test-time dropout-based model uncertainty alone and helps to reduce overconfident incorrect predictions, and 2) our test-time augmentation outperforms a single-prediction baseline and dropout-based multiple predictions.
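The Monte Carlo formulation of test-time augmentation can be sketched on a toy problem. Everything below is an assumption made for illustration: a one-dimensional "image", a threshold rule standing in for the CNN, a random flip plus Gaussian noise as the acquisition model, and per-pixel binary entropy as the uncertainty measure. It is not the paper's implementation.

```python
import math
import random


def toy_segmenter(signal, threshold=0.5):
    """Stand-in for a trained network: label a pixel 1 if above threshold."""
    return [1 if v > threshold else 0 for v in signal]


def tta_foreground_prob(signal, n_samples=200, noise_std=0.1, seed=0):
    """Monte Carlo estimate of the per-pixel foreground probability."""
    rng = random.Random(seed)
    counts = [0] * len(signal)
    for _ in range(n_samples):
        flip = rng.random() < 0.5            # sample a spatial transformation
        augmented = [v + rng.gauss(0.0, noise_std) for v in signal]
        if flip:
            augmented.reverse()
        prediction = toy_segmenter(augmented)
        if flip:
            prediction.reverse()             # map back to the original frame
        for i, label in enumerate(prediction):
            counts[i] += label
    return [c / n_samples for c in counts]


def binary_entropy(p):
    """Entropy of a Bernoulli(p) prediction, in nats."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


signal = [0.0, 0.1, 0.5, 0.9, 1.0]  # the middle pixel sits on the boundary
probs = tta_foreground_prob(signal)
uncertainty = [binary_entropy(p) for p in probs]
print(probs, uncertainty)
```

Pixels far from the decision boundary receive consistent labels across augmentations and therefore low entropy, while the boundary pixel flips between labels and receives high entropy. Because this toy segmenter is pointwise, the flip has no effect on the result; it is included only to show where the inverse transformation is applied.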