In many applications that involve the inference of an unknown smooth function, the inference of its derivatives will often be just as important as that of the function itself. To make joint inferences of the function and its derivatives, a class of Gaussian processes called the $p^{\text{th}}$ order Integrated Wiener Process (IWP) is considered. Methods for constructing a finite element method (FEM) approximation of an IWP exist, but they have focused only on the order $p = 2$ case, which does not allow appropriate inference for derivatives, and their computational feasibility relies on an additional approximation to the FEM itself. In this article, we propose an alternative FEM approximation, called overlapping splines (O-spline), which pursues computational feasibility directly through the choice of test functions and mirrors the construction of an IWP, as the O-spline results from multiple integrations of these same test functions. The O-spline approximation applies for any order $p \in \mathbb{Z}^+$, is computationally efficient, and provides consistent inference for all derivatives up to order $p-1$. It is shown both theoretically, and empirically through simulation, that the O-spline approximation converges to the true IWP as the number of knots increases. We further provide a unified and interpretable way to define priors for the smoothing parameter based on the notion of predictive standard deviation (PSD), which is invariant to the order $p$ and the placement of the knots. Finally, we demonstrate the practical use of the O-spline approximation through simulation studies and an analysis of COVID death rates, where inference is carried out on both the function and its derivatives, the latter having an important interpretation in terms of the course of the pandemic.
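As a rough illustration of the process being approximated (not the paper's O-spline construction), the sketch below samples a path of a $p^{\text{th}}$ order IWP by repeatedly integrating a Brownian motion; the grid size, order, and time horizon are illustrative assumptions. The intermediate paths are exactly the derivative processes, which is why joint inference on the function and its derivatives is natural for this prior.

```python
# A minimal sketch: sample a p-th order Integrated Wiener Process on [0, 1]
# by repeatedly integrating a Brownian path. The k-th derivative of the
# order-p IWP is the order-(p - k) IWP, so the intermediate paths below are
# exactly the derivative processes.
import numpy as np

rng = np.random.default_rng(0)
n, p = 2000, 3                      # grid size and IWP order (assumed values)
t = np.linspace(0.0, 1.0, n)
dt = t[1] - t[0]

# Brownian motion: cumulative sum of N(0, dt) increments, starting at 0.
paths = [np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n - 1))])]

# Integrate p - 1 times; paths[k] is the (p - 1 - k)-th derivative of paths[-1].
for _ in range(p - 1):
    paths.append(np.concatenate([[0.0], np.cumsum(paths[-1][:-1] * dt)]))

iwp = paths[-1]                     # the order-p IWP itself
first_deriv = paths[-2]             # its first derivative (an order p-1 IWP)
print(iwp[-1], first_deriv[-1])
```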
In this paper we consider an approach to improve the performance of exponential integrators/Lawson schemes in cases where the solution of a related, but usually much simpler, problem can be computed efficiently. While for implicit methods such an approach is common (e.g. by using preconditioners), for exponential integrators it has proven more challenging. Here we propose to extract a constant-coefficient differential operator from advection-diffusion-reaction equations, for which we are then able to compute the required matrix functions efficiently. Both a linear stability analysis and numerical experiments show that the resulting schemes can be unconditionally stable. In fact, we find that exponential integrators and Lawson schemes can have better stability properties than similarly constructed implicit-explicit schemes. We also propose new Lawson-type integrators that further improve on these stability properties. The effectiveness of the approach is highlighted by a number of numerical examples in two and three space dimensions.
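A minimal sketch of the underlying idea, under simplifying assumptions: for a 1D periodic advection-diffusion-reaction equation $u_t = d\,u_{xx} - a\,u_x + f(u)$, the constant-coefficient operator is diagonalized by the FFT, so $\exp(\Delta t A)$ is cheap to apply; the Lawson-Euler step and all parameter values below are illustrative, not the schemes proposed in the paper.

```python
# Lawson-Euler for u_t = d*u_xx - a*u_x + f(u) with periodic BCs:
# extract the constant-coefficient operator A = d*d_xx - a*d_x, apply
# exp(dt*A) spectrally, and step u_{n+1} = exp(dt*A)(u_n + dt*f(u_n)).
import numpy as np

n, L, d, a, dt = 256, 2 * np.pi, 1e-2, 1.0, 1e-2     # assumed parameters
x = np.linspace(0.0, L, n, endpoint=False)
k = np.fft.fftfreq(n, d=L / n) * 2 * np.pi            # angular wavenumbers

symbol = -d * k**2 - 1j * a * k                       # Fourier symbol of A
expA = np.exp(dt * symbol)                            # exp(dt*A), diagonal in Fourier space

f = lambda u: u - u**3                                # example reaction term
u = np.exp(-10 * (x - np.pi) ** 2)                    # initial condition

for _ in range(500):                                  # Lawson-Euler time stepping
    u = np.real(np.fft.ifft(expA * np.fft.fft(u + dt * f(u))))

print(u.max(), u.min())
```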
A class of generative models that unifies flow-based and diffusion-based methods is introduced. These models extend the framework proposed in Albergo & Vanden-Eijnden (2023), enabling the use of a broad class of continuous-time stochastic processes called `stochastic interpolants' to bridge two arbitrary probability density functions exactly in finite time. These interpolants are built by combining data from the two prescribed densities with an additional latent variable that shapes the bridge in a flexible way. The time-dependent probability density function of the stochastic interpolant is shown to satisfy a first-order transport equation as well as a family of forward and backward Fokker-Planck equations with tunable diffusion. Upon consideration of the time evolution of an individual sample, this viewpoint immediately leads to both deterministic and stochastic generative models based on probability flow equations or stochastic differential equations with an adjustable level of noise. The drift coefficients entering these models are time-dependent velocity fields characterized as the unique minimizers of simple quadratic objective functions, one of which is a new objective for the score of the interpolant density. Remarkably, we show that minimization of these quadratic objectives leads to control of the likelihood for any of our generative models built upon stochastic dynamics. By contrast, we establish that generative models based upon deterministic dynamics must, in addition, control the Fisher divergence between the target and the model. We also construct estimators for the likelihood and the cross-entropy of interpolant-based generative models, discuss connections with other stochastic bridges, and demonstrate that such models recover the Schr\"odinger bridge between the two target densities when one explicitly optimizes over the interpolant.
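The sketch below shows one concrete instance of the construction: a stochastic interpolant between two 1D densities and a Monte Carlo estimate of the quadratic objective whose minimizer is the velocity field. The trigonometric interpolant, the choice $\gamma(t) = \sqrt{2t(1-t)}$, and all sample sizes are illustrative assumptions, not the paper's exact setup.

```python
# Stochastic interpolant x_t = cos(pi t/2) x0 + sin(pi t/2) x1 + gamma(t) z,
# and the quadratic regression objective E[|b(t, x_t) - dx_t/dt|^2] for a
# candidate velocity field b.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x0 = rng.normal(-2.0, 0.5, n)          # samples from rho_0
x1 = rng.normal(+2.0, 0.5, n)          # samples from rho_1
z = rng.normal(0.0, 1.0, n)            # latent variable shaping the bridge
t = rng.uniform(0.05, 0.95, n)         # avoid endpoint singularity of gamma'

gamma = np.sqrt(2 * t * (1 - t))
xt = np.cos(np.pi * t / 2) * x0 + np.sin(np.pi * t / 2) * x1 + gamma * z

# Time derivative of the interpolant: the regression target for the velocity.
dgamma = (1 - 2 * t) / gamma
target = (np.pi / 2) * (-np.sin(np.pi * t / 2) * x0
                        + np.cos(np.pi * t / 2) * x1) + dgamma * z

def loss(b):
    """Quadratic objective; its unique minimizer is the drift b(t, x)."""
    return np.mean((b(t, xt) - target) ** 2)

print(loss(lambda t, x: np.zeros_like(x)))   # baseline: zero velocity field
```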
We consider the problem of performing Bayesian inference for stochastic epidemic compartment models, using incomplete time-course data consisting of incidence counts that are either the number of new infections or removals in time intervals of fixed length. We eschew the most natural Markov jump process representation for reasons of computational efficiency, and focus on a stochastic differential equation representation. This is further approximated to give a tractable Gaussian process, namely the linear noise approximation (LNA). Unless the observation model linking the LNA to data is both linear and Gaussian, the observed data likelihood remains intractable. It is in this setting that we consider two approaches for marginalising over the latent process: a correlated pseudo-marginal method and analytic marginalisation via a Gaussian approximation of the observation model. We compare and contrast these approaches using synthetic data before applying the best-performing method to real data consisting of removal incidence of oak processionary moth nests in Richmond Park, London. Our approach further allows comparison between various competing compartment models.
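As a minimal sketch of the LNA idea (not the paper's inference scheme), the code below integrates the mean and variance ODEs for a scalar SIS epidemic, $m' = a(m)$ and $v' = 2a'(m)v + b(m)$, where $a$ and $b$ are the drift and diffusion of the SDE representation; all parameter values are illustrative.

```python
# Linear noise approximation (LNA) for a scalar SIS epidemic model, using
# Euler integration of the mean/variance ODEs.
import numpy as np

beta, gamma, N = 0.5, 0.25, 1000.0                          # assumed rates
a  = lambda x: beta * x * (N - x) / N - gamma * x           # drift
da = lambda x: beta * (N - 2 * x) / N - gamma               # drift derivative
b  = lambda x: beta * x * (N - x) / N + gamma * x           # diffusion rate

dt, steps = 0.01, 5000
m, v = 10.0, 0.0                                            # initial mean/variance
for _ in range(steps):
    m, v = m + dt * a(m), v + dt * (2 * da(m) * v + b(m))

# Under the LNA the state at time t is approximately N(m, v), which is why a
# linear-Gaussian observation model yields a tractable likelihood.
print(f"mean {m:.1f}, sd {np.sqrt(v):.1f}")
```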
Bayesian inference tasks continue to pose a computational challenge. This especially holds for spatial-temporal modeling, where high-dimensional latent parameter spaces are ubiquitous. The methodology of integrated nested Laplace approximations (INLA) provides a framework for performing Bayesian inference applicable to a large subclass of additive Bayesian hierarchical models. In combination with the stochastic partial differential equations (SPDE) approach, it gives rise to an efficient method for spatial-temporal modeling. In this work we build on the INLA-SPDE approach by putting forward a performant distributed-memory variant, INLA-DIST, for large-scale applications. To perform the arising computational kernel operations, consisting of Cholesky factorizations, solving linear systems, and selected matrix inversions, we present two numerical solver options: a sparse CPU-based library and a novel blocked GPU-accelerated approach. We leverage the recurring nonzero block structure in the arising precision (inverse covariance) matrices, which allows us to employ dense subroutines within a sparse setting. Both versions of INLA-DIST are highly scalable, capable of performing inference on models with millions of latent parameters. We demonstrate their accuracy and performance on synthetic as well as real-world climate dataset applications.
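The sketch below illustrates the kind of sparse kernel operations at stake, on a toy tridiagonal precision matrix: a factorization, a linear solve for a posterior mean, and marginal variances from the diagonal of the inverse. The naive column-by-column inversion stands in for the selected inversion that INLA-DIST implements efficiently; the model and hyperparameters are illustrative assumptions.

```python
# Sparse precision-matrix kernels: factorize Q, solve for the posterior mean,
# and extract diag(Q^{-1}) for marginal variances (naively here).
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, tau, sigma2 = 1000, 10.0, 0.5**2                     # assumed hyperparameters
# Tridiagonal (RW1-like) prior precision plus Gaussian-noise data precision.
Q = tau * sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
Q = Q + sp.identity(n, format="csc") / sigma2

rng = np.random.default_rng(0)
y = np.sin(np.linspace(0, 4 * np.pi, n)) + rng.normal(0, 0.5, n)

lu = spla.splu(Q)                                       # sparse LU factorization
post_mean = lu.solve(y / sigma2)                        # posterior mean solve
# Marginal variances = diag(Q^{-1}); naive here, "selected" in INLA-DIST.
marg_var = np.array([lu.solve(e)[i] for i, e in enumerate(np.eye(n))])
print(post_mean[:3], marg_var[:3])
```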
Measurement error (ME) and missing values in covariates are often unavoidable in disciplines that deal with data, and both problems have separately received considerable attention during the past decades. However, while most researchers are familiar with methods for treating missing data, accounting for ME in covariates of regression models is less common. In addition, ME and missing data are typically treated as two separate problems, despite practical and theoretical similarities. Here, we exploit the fact that missing data in a continuous covariate is an extreme case of classical ME, allowing us to use existing methodology that accounts for ME via a Bayesian framework that employs integrated nested Laplace approximations (INLA), and thus to simultaneously account for both ME and missing data in the same covariate. As a useful by-product, we present an approach to handle missing data in INLA, since this corresponds to the special case when no ME is present. In addition, we show how to account for Berkson ME in the same framework. In its broadest generality, the proposed joint Bayesian framework can thus account for Berkson ME, classical ME, and missing data, or for any combination of these in the same or different continuous covariates of the family of regression models that are feasible with INLA. The approach is exemplified using both simulated and real data. We provide extensive and fully reproducible Supplementary Material with thoroughly documented examples using R-INLA and inlabru.
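A minimal illustration of the key fact exploited here (not the paper's INLA workflow): classical ME $w = x + u$ attenuates a naive regression slope toward zero, and a missing value can be viewed as the limit where the error variance is infinite, so that $w$ carries no information about $x$. All parameter values are assumptions.

```python
# Attenuation under classical measurement error, with missingness as the
# infinite-error-variance limit.
import numpy as np

rng = np.random.default_rng(1)
n, beta = 5000, 1.0
x = rng.normal(0.0, 1.0, n)                  # true covariate
y = beta * x + rng.normal(0.0, 0.5, n)       # response

for tau2 in [0.0, 0.5, 2.0, 1e6]:            # increasing classical ME variance
    w = x + rng.normal(0.0, np.sqrt(tau2), n)
    c = np.cov(w, y)
    slope = c[0, 1] / c[0, 0]                # naive OLS slope using w
    # Attenuation: slope -> beta * var(x) / (var(x) + tau2), which tends to 0
    # as tau2 -> inf, matching the "missing = extreme ME" viewpoint.
    print(f"tau2={tau2:g}: naive slope {slope:.3f}")
```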
The sum-rank metric is a hybrid between the Hamming metric and the rank metric, and is suitable for error correction in multishot network coding and distributed storage, as well as for the design of quantum-resistant cryptosystems. In this work, we consider the construction and decoding of folded linearized Reed-Solomon (FLRS) codes, which are shown to be maximum sum-rank distance (MSRD) for appropriate parameter choices. We derive an efficient interpolation-based decoding algorithm for FLRS codes that can be used as a list decoder or as a probabilistic unique decoder. The proposed decoding scheme can correct sum-rank errors beyond the unique decoding radius with a computational complexity that is quadratic in the length of the unfolded code. We show how the error-correction capability can be optimized for high-rate codes by an alternative choice of interpolation points. We derive a heuristic upper bound on the decoding failure probability of the probabilistic unique decoder and verify its tightness by Monte Carlo simulations. Further, we study the construction and decoding of folded skew Reed-Solomon codes in the skew metric. To the best of our knowledge, FLRS codes are the first MSRD codes with different block sizes that come with an efficient decoding algorithm.
We present a fast iterative solver for scattering problems in 2D, where a penetrable object with compact support is considered. By representing the scattered field as a volume potential in terms of the Green's function, we arrive at the Lippmann-Schwinger equation in integral form, which is then discretized using an appropriate quadrature technique. The discretized linear system is solved using an iterative solver accelerated by the Directional Algebraic Fast Multipole Method (DAFMM). The DAFMM presented here relies on the directional admissibility condition of the 2D Helmholtz kernel, and the construction of low-rank factorizations of the appropriate low-rank matrix sub-blocks is based on our new Nested Cross Approximation (NCA) (arXiv:2203.14832 [math.NA]). The advantage of our new NCA is that the search space of so-called far-field pivots is smaller than that of existing NCAs. Another significant contribution of this work is the use of an HODLR-based direct solver as a preconditioner to further accelerate the iterative solver. In one of our numerical experiments, the iterative solver does not converge without a preconditioner; we show that the HODLR preconditioner is capable of handling problems that the unpreconditioned iterative solver cannot. We also perform a comparative study of the HODLR-based fast direct solver, the DAFMM-based fast iterative solver, and the HODLR-preconditioned DAFMM-based fast iterative solver for the discretized Lippmann-Schwinger problem. To the best of our knowledge, this work is one of the first to provide a systematic study and comparison of these different solvers for various problem sizes and contrast functions. In the spirit of reproducible computational science, the implementation of the algorithms developed in this article is made available at \url{https://github.com/vaishna77/Lippmann_Schwinger_Solver}.
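For orientation, here is a minimal dense sketch of the problem being solved (no DAFMM, no HODLR): a Nystrom discretization of the 2D Lippmann-Schwinger equation $u = u^{\mathrm{inc}} + k^2 \int G(x,y) q(y) u(y)\,dy$ with $G = \tfrac{i}{4}H_0^{(1)}(k|x-y|)$, solved with GMRES. The equal-area-disk treatment of the weakly singular diagonal and all parameters are crude illustrative choices, not the paper's quadrature.

```python
# Dense Nystrom discretization of the 2D Lippmann-Schwinger equation,
# solved iteratively with GMRES.
import numpy as np
from scipy.special import hankel1
from scipy.sparse.linalg import gmres, LinearOperator

k, n = 20.0, 40                                    # wavenumber, grid per side
h = 1.0 / n
g = (np.arange(n) + 0.5) * h
X, Y = np.meshgrid(g, g, indexing="ij")
pts = np.column_stack([X.ravel(), Y.ravel()])
q = np.exp(-100 * ((pts[:, 0] - 0.5) ** 2 + (pts[:, 1] - 0.5) ** 2))  # contrast

R = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
G = 0.25j * hankel1(0, k * np.where(R > 0, R, 1.0)) * h * h           # off-diagonal
a = h / np.sqrt(np.pi)                             # equal-area disk radius
# Self-interaction: exact integral of (i/4) H0(kr) over a disk of radius a.
np.fill_diagonal(G, 0.5j * np.pi * a * hankel1(1, k * a) / k - 1.0 / k**2)

u_inc = np.exp(1j * k * pts[:, 0])                 # incident plane wave
A = LinearOperator((n * n, n * n), matvec=lambda u: u - k**2 * (G @ (q * u)),
                   dtype=complex)
u, info = gmres(A, u_inc)
print("GMRES converged" if info == 0 else f"info={info}", np.abs(u).max())
```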
We present a study of a kernel-based two-sample test statistic related to the Maximum Mean Discrepancy (MMD) in the manifold data setting, assuming that high-dimensional observations are close to a low-dimensional manifold. We characterize the test level and power in relation to the kernel bandwidth, the number of samples, and the intrinsic dimensionality of the manifold. Specifically, we show that when data densities are supported on a $d$-dimensional sub-manifold $\mathcal{M}$ embedded in an $m$-dimensional space, the kernel two-sample test for data sampled from a pair of distributions $p$ and $q$ that are H\"older with order $\beta$ (up to 2) is powerful when the number of samples $n$ is large enough that $\Delta_2 \gtrsim n^{- { 2 \beta/( d + 4 \beta ) }}$, where $\Delta_2$ is the squared $L^2$-divergence between $p$ and $q$ on the manifold. We establish a lower bound on the test power for finite but sufficiently large $n$, where the kernel bandwidth parameter $\gamma$ scales as $n^{-1/(d+4\beta)}$. The analysis extends to cases where the manifold has a boundary and the data samples contain high-dimensional additive noise. Our results indicate that the kernel two-sample test does not suffer from the curse of dimensionality when the data lie on or near a low-dimensional manifold. We validate our theory and the properties of the kernel test for manifold data through a series of numerical experiments.
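The test statistic under study can be sketched concretely: below is an unbiased Gaussian-kernel MMD$^2$ estimate with a permutation test for the level. The median-heuristic bandwidth, sample sizes, and mean-shift alternative are illustrative assumptions standing in for the theoretical bandwidth scaling $\gamma \sim n^{-1/(d+4\beta)}$.

```python
# Unbiased MMD^2 two-sample statistic with a permutation test.
import numpy as np

rng = np.random.default_rng(0)

def mmd2_unbiased(X, Y, gamma):
    """Unbiased MMD^2 estimate with kernel k(x,y) = exp(-|x-y|^2 / gamma^2)."""
    def K(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / gamma**2)
    n, m = len(X), len(Y)
    Kxx, Kyy, Kxy = K(X, X), K(Y, Y), K(X, Y)
    return ((Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
            - 2 * Kxy.mean())

n, m = 200, 200
X = rng.normal(0.0, 1.0, (n, 2))
Y = rng.normal(0.3, 1.0, (m, 2))           # mean-shifted alternative
Z = np.vstack([X, Y])
gamma = np.median(np.linalg.norm(Z[:, None] - Z[None, :], axis=-1))

stat = mmd2_unbiased(X, Y, gamma)
perm = []
for _ in range(200):                       # permutation null distribution
    idx = rng.permutation(n + m)
    perm.append(mmd2_unbiased(Z[idx[:n]], Z[idx[n:]], gamma))
print("p-value:", np.mean(np.array(perm) >= stat))
```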
The solution of the governing equation representing the drawdown in a horizontal confined aquifer, where groundwater flow is unsteady, is given in terms of the exponential integral, famously known as the Well function. For the computation of this function in practical applications, it is important to develop not only an accurate but also a simple approximation that requires evaluation of the fewest possible terms. To that end, this work proposes a full-range approximation to the exponential integral that uses Ramanujan's series for the small argument ($u \leq 1$) and an approximation based on the bound of the integral for the remaining range ($u \in (1,100]$). The proposed approximation is more accurate than the existing formulae, with a maximum percentage error of 0.05\%. Further, it is much simpler to apply, as it contains just the product of exponential and logarithm functions. To further check the efficiency of the proposed approximation, we consider a practical example of evaluating the discrete pumping kernel, which shows the superiority of this approximation over the others. Finally, the authors hope that the proposed efficient approximation will be useful for groundwater and hydrogeological applications.
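A minimal sketch of the small-argument ingredient: the well function $W(u) = E_1(u)$ via Ramanujan's series, $E_1(u) = -\gamma - \ln u + e^{-u/2} \sum_{n \geq 1} \frac{u^n}{n!\,2^{n-1}} \sum_{k=0}^{\lfloor (n-1)/2 \rfloor} \frac{1}{2k+1}$, checked against scipy.special.exp1. For the large-argument regime, the classical bound $e^{-u}\ln(1+1/u)$ is shown only as a stand-in; it is not necessarily the paper's exact bound-based formula.

```python
# Ramanujan-series evaluation of the well function W(u) = E1(u) for u <= 1,
# plus a classical exponential-times-logarithm bound for larger u.
import numpy as np
from scipy.special import exp1

def well_ramanujan(u, terms=30):
    s, fact = 0.0, 1.0
    for n_ in range(1, terms + 1):
        fact *= n_                                  # running n!
        inner = sum(1.0 / (2 * k + 1) for k in range((n_ - 1) // 2 + 1))
        s += u**n_ / (fact * 2 ** (n_ - 1)) * inner
    return -np.euler_gamma - np.log(u) + np.exp(-u / 2) * s

for u in [0.01, 0.1, 0.5, 1.0]:
    print(f"u={u}: series {well_ramanujan(u):.8f}, exp1 {exp1(u):.8f}")

u = 5.0                                             # large-argument regime
print(f"u={u}: bound {np.exp(-u) * np.log(1 + 1 / u):.8f}, exp1 {exp1(u):.8f}")
```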
Classic algorithms and machine learning systems like neural networks are both abundant in everyday life. While classic computer science algorithms are suitable for precise execution of exactly defined tasks such as finding the shortest path in a large graph, neural networks allow learning from data to predict the most likely answer in more complex tasks such as image classification, which cannot be reduced to an exact algorithm. To get the best of both worlds, this thesis explores combining both concepts, leading to more robust, better-performing, more interpretable, more computationally efficient, and more data-efficient architectures. The thesis formalizes the idea of algorithmic supervision, which allows a neural network to learn from or in conjunction with an algorithm. When integrating an algorithm into a neural architecture, it is important that the algorithm is differentiable, such that the architecture can be trained end-to-end and gradients can be propagated back through the algorithm in a meaningful way. To make algorithms differentiable, this thesis proposes a general method for continuously relaxing algorithms by perturbing variables and approximating the expectation value in closed form, i.e., without sampling. In addition, this thesis proposes differentiable algorithms, such as differentiable sorting networks, differentiable renderers, and differentiable logic gate networks. Finally, this thesis presents alternative training strategies for learning with algorithms.
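A minimal sketch of one of these ideas: a differentiable sorting network built from soft compare-exchange blocks, where the hard comparator is relaxed with a logistic sigmoid of temperature tau. The logistic relaxation and temperature values are illustrative choices, not necessarily the thesis's exact construction; as tau approaches 0 the network recovers exact odd-even transposition sort.

```python
# Differentiable odd-even transposition sorting network with relaxed comparators.
import numpy as np

def soft_cswap(a, b, tau):
    """Differentiable compare-exchange: returns (soft min, soft max)."""
    s = 1.0 / (1.0 + np.exp(-(b - a) / tau))      # sigmoid((b - a) / tau)
    return s * a + (1 - s) * b, s * b + (1 - s) * a

def soft_sort(x, tau=0.1):
    """Odd-even transposition network; every operation is differentiable."""
    x = np.array(x, dtype=float)
    n = len(x)
    for layer in range(n):
        for i in range(layer % 2, n - 1, 2):
            x[i], x[i + 1] = soft_cswap(x[i], x[i + 1], tau)
    return x

print(soft_sort([3.0, 1.0, 2.0], tau=0.01))   # close to [1, 2, 3]
print(soft_sort([3.0, 1.0, 2.0], tau=1.0))    # smoothed, gradient-friendly output
```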