Linear partial differential equations (PDEs) are an important, widely applied class of mechanistic models, describing physical processes such as heat transfer, electromagnetism, and wave propagation. In practice, specialized numerical methods based on discretization are used to solve PDEs. They generally use an estimate of the unknown model parameters and, if available, physical measurements for initialization. Such solvers are often embedded into larger scientific models or analyses with a downstream application such that error quantification plays a key role. However, by entirely ignoring parameter and measurement uncertainty, classical PDE solvers may fail to produce consistent estimates of their inherent approximation error. In this work, we approach this problem in a principled fashion by interpreting solving linear PDEs as physics-informed Gaussian process (GP) regression. Our framework is based on a key generalization of a widely-applied theorem for conditioning GPs on a finite number of direct observations to observations made via an arbitrary bounded linear operator. Crucially, this probabilistic viewpoint allows to (1) quantify the inherent discretization error; (2) propagate uncertainty about the model parameters to the solution; and (3) condition on noisy measurements. Demonstrating the strength of this formulation, we prove that it strictly generalizes methods of weighted residuals, a central class of PDE solvers including collocation, finite volume, pseudospectral, and (generalized) Galerkin methods such as finite element and spectral methods. This class can thus be directly equipped with a structured error estimate and the capability to incorporate uncertain model parameters and observations. In summary, our results enable the seamless integration of mechanistic models as modular building blocks into probabilistic models.
We present a method to approximate Gaussian process regression models for large datasets by considering only a subset of the data. Our approach is novel in that the size of the subset is selected on the fly during exact inference with little computational overhead. From an empirical observation that the log-marginal likelihood often exhibits a linear trend once a sufficient subset of a dataset has been observed, we conclude that many large datasets contain redundant information that only slightly affects the posterior. Based on this, we provide probabilistic bounds on the full model evidence that can identify such subsets. Remarkably, these bounds are largely composed of terms that appear in intermediate steps of the standard Cholesky decomposition, allowing us to modify the algorithm to adaptively stop the decomposition once enough data have been observed.
This paper presents $\Psi$-GNN, a novel Graph Neural Network (GNN) approach for solving the ubiquitous Poisson PDE problems with mixed boundary conditions. By leveraging the Implicit Layer Theory, $\Psi$-GNN models an ''infinitely'' deep network, thus avoiding the empirical tuning of the number of required Message Passing layers to attain the solution. Its original architecture explicitly takes into account the boundary conditions, a critical prerequisite for physical applications, and is able to adapt to any initially provided solution. $\Psi$-GNN is trained using a ''physics-informed'' loss, and the training process is stable by design, and insensitive to its initialization. Furthermore, the consistency of the approach is theoretically proven, and its flexibility and generalization efficiency are experimentally demonstrated: the same learned model can accurately handle unstructured meshes of various sizes, as well as different boundary conditions. To the best of our knowledge, $\Psi$-GNN is the first physics-informed GNN-based method that can handle various unstructured domains, boundary conditions and initial solutions while also providing convergence guarantees.
While Gaussian processes are a mainstay for various engineering and scientific applications, the uncertainty estimates don't satisfy frequentist guarantees, and can be miscalibrated in practice. State-of-the-art approaches for designing calibrated models rely on inflating the Gaussian process posterior variance, which yields confidence intervals that are potentially too coarse. To remedy this, we present a calibration approach that generates predictive quantiles using a computation inspired by the vanilla Gaussian process posterior variance, but using a different set of hyperparameters, chosen to satisfy an empirical calibration constraint. This results in a calibration approach that is considerably more flexible than existing approaches. Our approach is shown to yield a calibrated model under reasonable assumptions. Furthermore, it outperforms existing approaches not only when employed for calibrated regression, but also to inform the design of Bayesian optimization algorithms.
Grid-free Monte Carlo methods based on the \emph{walk on spheres (WoS)} algorithm solve fundamental partial differential equations (PDEs) like the Poisson equation without discretizing the problem domain, nor approximating functions in a finite basis. Such methods hence avoid aliasing in the solution, and evade the many challenges of mesh generation. Yet for problems with complex geometry, practical grid-free methods have been largely limited to basic Dirichlet boundary conditions. This paper introduces the \emph{walk on stars (WoSt)} method, which solves linear elliptic PDEs with arbitrary mixed Neumann and Dirichlet boundary conditions. The key insight is that one can efficiently simulate reflecting Brownian motion (which models Neumann conditions) by replacing the balls used by WoS with \emph{star-shaped} domains; we identify such domains by locating the closest visible point on the geometric silhouette. Overall, WoSt retains many attractive features of other grid-free Monte Carlo methods, such as progressive evaluation, trivial parallel implementation, and logarithmic scaling relative to geometric complexity.
Precision medicine is an emerging field that takes into account individual heterogeneity to inform better clinical practice. In clinical trials, the evaluation of treatment effect heterogeneity is an important component, and recently, many statistical methods have been proposed for stratifying patients into different subgroups based on such heterogeneity. However, the majority of existing methods developed for this purpose focus on the case with a dichotomous treatment and are not directly applicable to multi-arm trials. In this paper, we consider the problem of patient stratification in multi-arm trial settings and propose a two-stage procedure within the Bayesian nonparametric framework. Specifically, we first use Bayesian additive regression trees (BART) to predict potential outcomes (treatment responses) under different treatment options for each patient, and then we leverage Bayesian profile regression to cluster patients into subgroups according to their baseline characteristics and predicted potential outcomes. We further embed a variable selection procedure into our proposed framework to identify the patient characteristics that actively "drive" the clustering structure. We conduct simulation studies to examine the performance of our proposed method and demonstrate the method by applying it to a UK-based multi-arm blood donation trial, wherein our method uncovers five clinically meaningful donor subgroups.
This work introduces a refinement of the Parsimonious Model for fitting a Gaussian Mixture. The improvement is based on the consideration of groupings of the covariance matrices according to a criterion, such as sharing Principal Directions. This and other similarity criteria that arise from the spectral decomposition of a matrix are the bases of the Parsimonious Model. The classification can be achieved with simple modifications of the CEM (Classification Expectation Maximization) algorithm, using in the M step suitable estimation methods known for parsimonious models. This approach leads to propose Gaussian Mixture Models for model-based clustering and discriminant analysis, in which covariance matrices are clustered according to a parsimonious criterion, creating intermediate steps between the fourteen widely known parsimonious models. The added versatility not only allows us to obtain models with fewer parameters for fitting the data, but also provides greater interpretability. We show its usefulness for model-based clustering and discriminant analysis, providing algorithms to find approximate solutions verifying suitable size, shape and orientation constraints, and applying them to both simulation and real data examples.
Kernel methods are learning algorithms that enjoy solid theoretical foundations while suffering from important computational limitations. Sketching, which consists in looking for solutions among a subspace of reduced dimension, is a well studied approach to alleviate these computational burdens. However, statistically-accurate sketches, such as the Gaussian one, usually contain few null entries, such that their application to kernel methods and their non-sparse Gram matrices remains slow in practice. In this paper, we show that sparsified Gaussian (and Rademacher) sketches still produce theoretically-valid approximations while allowing for important time and space savings thanks to an efficient \emph{decomposition trick}. To support our method, we derive excess risk bounds for both single and multiple output kernel problems, with generic Lipschitz losses, hereby providing new guarantees for a wide range of applications, from robust regression to multiple quantile regression. Our theoretical results are complemented with experiments showing the empirical superiority of our approach over SOTA sketching methods.
In this work, we formulate and analyze a geometric multigrid method for the iterative solution of the discrete systems arising from the finite element discretization of symmetric second-order linear elliptic diffusion problems. We show that the iterative solver contracts the algebraic error robustly with respect to the polynomial degree $p \ge 1$ and the (local) mesh size $h$. We further prove that the built-in algebraic error estimator which comes with the solver is $hp$-robustly equivalent to the algebraic error. The application of the solver within the framework of adaptive finite element methods with quasi-optimal computational cost is outlined. Numerical experiments confirm the theoretical findings.
Spatial Gaussian process regression models typically contain finite dimensional covariance parameters that need to be estimated from the data. We study the Bayesian estimation of covariance parameters including the nugget parameter in a general class of stationary covariance functions under fixed-domain asymptotics, which is theoretically challenging due to the increasingly strong dependence among spatial observations. We propose a novel adaptation of the Schwartz's consistency theorem for showing posterior contraction rates of the covariance parameters including the nugget. We derive a new polynomial evidence lower bound, and propose consistent higher-order quadratic variation estimators that satisfy concentration inequalities with exponentially small tails. Our Bayesian fixed-domain asymptotics theory leads to explicit posterior contraction rates for the microergodic and nugget parameters in the isotropic Matern covariance function under a general stratified sampling design. We verify our theory and the Bayesian predictive performance in simulation studies and an application to sea surface temperature data.
Oceanographers are interested in predicting ocean currents and identifying divergences in a current vector field based on sparse observations of buoy velocities. Since we expect current dynamics to be smooth but highly non-linear, Gaussian processes (GPs) offer an attractive model. But we show that applying a GP with a standard stationary kernel directly to buoy data can struggle at both current prediction and divergence identification -- due to some physically unrealistic prior assumptions. To better reflect known physical properties of currents, we propose to instead put a standard stationary kernel on the divergence and curl-free components of a vector field obtained through a Helmholtz decomposition. We show that, because this decomposition relates to the original vector field just via mixed partial derivatives, we can still perform inference given the original data with only a small constant multiple of additional computational expense. We illustrate the benefits of our method on synthetic and real ocean data.