We provide a novel dimension-free uniform concentration bound for the empirical risk function of constrained logistic regression. Our bound yields a milder sufficient condition for a uniform law of large numbers than conditions derived by the Rademacher complexity argument and McDiarmid's inequality. The derivation is based on the PAC-Bayes approach with second-order expansion and Rademacher-complexity-based bounds for the residual term of the expansion.
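For concreteness, the setting can be sketched as follows, where the Euclidean norm ball is an illustrative choice of constraint set. With data $(x_i, y_i)\in\mathbb{R}^d\times\{-1,+1\}$, the empirical risk of constrained logistic regression is
\[
\widehat{R}_n(\theta) = \frac{1}{n}\sum_{i=1}^{n}\log\bigl(1+\exp(-y_i\langle x_i,\theta\rangle)\bigr),
\qquad \theta \in \Theta = \{\theta\in\mathbb{R}^d : \|\theta\|_2\le B\},
\]
and a uniform law of large numbers asserts that $\sup_{\theta\in\Theta}\bigl|\widehat{R}_n(\theta)-\mathbb{E}\,\widehat{R}_n(\theta)\bigr|\to 0$ in probability; dimension-free means that the concentration bound does not involve the ambient dimension $d$.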
We explore a scaled spectral preconditioner for the efficient solution of sequences of symmetric and positive-definite linear systems. We design the scaled preconditioner not only as an approximation of the inverse of the system matrix but also with its use within the conjugate gradient (CG) method in mind. We propose three different strategies for selecting the scaling parameter, each aiming to position the eigenvalues of the preconditioned matrix so as to reduce the energy norm of the error, the quantity that CG monotonically decreases at each iteration. Our focus is on accelerating convergence, especially in the early iterations, which is particularly important when CG is truncated due to computational cost constraints. Numerical experiments in data assimilation confirm that the scaled spectral preconditioner can significantly improve early CG convergence at negligible computational cost.
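As a rough illustration of how such a preconditioner is applied inside CG, consider the following sketch. The spectral limited-memory form $H(\mu) = \mu\,(I + U(\Lambda^{-1} - I)U^\top)$ built from approximate leading eigenpairs, and the particular value of $\mu$ used below, are illustrative assumptions; the paper's three selection strategies for the scaling parameter are more principled.
\begin{verbatim}
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg, eigsh

rng = np.random.default_rng(0)

# A symmetric positive-definite test matrix with condition number 1e4.
n = 500
Q = np.linalg.qr(rng.standard_normal((n, n)))[0]
A = (Q * np.logspace(0, 4, n)) @ Q.T
b = rng.standard_normal(n)

# Approximate the k largest eigenpairs of A (in practice these would be
# recycled from a Lanczos/CG run on an earlier system in the sequence).
k = 20
lam, U = eigsh(A, k=k, which='LM')

# Scaled spectral limited-memory preconditioner
#   H(mu) = mu * (I + U (diag(1/lam) - I) U^T),
# so H A has eigenvalue mu in the k captured directions and mu*sigma in
# the remaining ones.  The crude choice mu ~ 1/lambda_k below stands in
# for the paper's selection strategies.
mu = 1.0 / lam.min()

def apply_H(v):
    return mu * (v + U @ ((1.0 / lam - 1.0) * (U.T @ v)))

M = LinearOperator((n, n), matvec=apply_H)
x, info = cg(A, b, M=M, maxiter=50)  # truncated CG: early iterations matter
print("residual norm:", np.linalg.norm(A @ x - b))
\end{verbatim}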
We propose a new step-wise approach to proving observational equivalence, and in particular to reasoning about the fragility of observational equivalence. Our approach is based on what we call local reasoning, which exploits the graphical concept of a neighbourhood and extracts a new, formal concept of robustness as a key sufficient condition for observational equivalence. Moreover, our proof methodology is capable of proving a generalised notion of observational equivalence, which can be quantified over syntactically restricted contexts instead of all contexts, and also quantitatively constrained in terms of the number of reduction steps. The operational machinery we use is given by a hypergraph-rewriting abstract machine inspired by Girard's Geometry of Interaction, in which the behaviour of language features, including function abstraction and application, is provided by hypergraph-rewriting rules. We demonstrate our proof methodology using the call-by-value lambda-calculus equipped with (higher-order) state.
We present novel model reduction methods for the rapid solution of parametrized nonlinear partial differential equations (PDEs) in real-time or many-query contexts. Our approach combines a reduced basis (RB) space for rapidly convergent approximation of the parametric solution manifold, Galerkin projection of the underlying PDEs onto the RB space for dimensionality reduction, and high-order empirical interpolation for efficient treatment of the nonlinear terms. We propose a class of high-order empirical interpolation methods that derive basis functions and interpolation points from high-order partial derivatives of the nonlinear terms. Because these methods can generate high-quality basis functions and interpolation points from a snapshot set of full-order model (FOM) solutions, they significantly improve the approximation accuracy. We develop an effective a posteriori estimator to quantify the interpolation errors and construct a parameter sample via greedy sampling. Furthermore, we implement two hyperreduction schemes to construct efficient reduced-order models: one that applies the empirical interpolation before Newton's method and another that applies it after. The latter scheme offers flexibility in controlling hyperreduction errors. Numerical results demonstrate the accuracy and efficiency of the proposed methods.
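For background, the classical first-order empirical interpolation point selection that the high-order methods generalize can be sketched as follows. The greedy rule shown is the standard DEIM-style selection, not the paper's high-order variant, and the snapshot set is a toy example.
\begin{verbatim}
import numpy as np

def deim_points(U):
    """Greedy interpolation-point selection for the columns of U
    (basis of nonlinear-term snapshots, e.g. POD modes)."""
    n, m = U.shape
    p = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, m):
        # Interpolate the j-th basis vector at the points chosen so far,
        c = np.linalg.solve(U[np.ix_(p, range(j))], U[p, j])
        # then put the next point where the interpolation residual peaks.
        r = U[:, j] - U[:, :j] @ c
        p.append(int(np.argmax(np.abs(r))))
    return np.array(p)

# Usage: snapshots of a parametrized nonlinear term on a 1D grid.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 200)
S = np.column_stack([np.exp(-((x - c) / 0.1) ** 2)
                     for c in rng.uniform(0.2, 0.8, 50)])
U = np.linalg.svd(S, full_matrices=False)[0][:, :10]
p = deim_points(U)

g = S[:, 0]                                  # a nonlinear-term sample
g_hat = U @ np.linalg.solve(U[p, :], g[p])   # interpolate from 10 points
print("interpolation error:", np.linalg.norm(g - g_hat))
\end{verbatim}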
To obtain the highest confidence in the correctness of numerical simulation programs for the resolution of Partial Differential Equations (PDEs), one has to formalize the mathematical notions and results that allow one to establish the soundness of the approach. The finite element method is one of the most popular tools for the numerical resolution of a wide range of PDEs. The purpose of this document is to provide the formal proof community with very detailed pen-and-paper proofs of the construction of the Lagrange finite elements of any degree on simplices in positive dimension.
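For reference, the central object here is standard: in Ciarlet's sense, a finite element is a triple $(K, P, \Sigma)$, and the Lagrange element of degree $k$ on a $d$-simplex $K$ with vertices $v_0,\dots,v_d$ takes $P = \mathbb{P}_k(K)$, the polynomials of total degree at most $k$, and for $\Sigma$ the point evaluations at the principal lattice nodes
\[
a_\alpha = \sum_{i=0}^{d} \frac{\alpha_i}{k}\, v_i,
\qquad \alpha \in \mathbb{N}^{d+1},\ |\alpha| = k .
\]
Unisolvence, the main property to be established, amounts to the fact that the number of nodes equals $\dim \mathbb{P}_k(K) = \binom{d+k}{d}$ and that a polynomial in $\mathbb{P}_k(K)$ vanishing at all nodes is identically zero.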
We study high-dimensional, ridge-regularized logistic regression in a setting where the covariates may be missing or corrupted by additive noise. When both the covariates and the additive corruptions are independent and normally distributed, we provide exact characterizations of both the prediction error and the estimation error. Moreover, we show that these characterizations are universal: as long as the entries of the data matrix satisfy a set of independence and moment conditions, our guarantees continue to hold. Universality, in turn, enables the detailed study of several imputation-based strategies when the covariates are missing completely at random. We ground our study by comparing the performance of these strategies with the conjectured performance -- stemming from replica theory in statistical physics -- of the Bayes optimal procedure. Our analysis yields several insights, including: (i) a distinction between single imputation and a simple variant of multiple imputation, and (ii) the finding that adding a simple ridge regularization term to single-imputed logistic regression can yield an estimator whose prediction error is nearly indistinguishable from the Bayes optimal prediction error. We supplement our findings with extensive numerical experiments.
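A minimal sketch of the simplest strategy in this comparison, single (mean) imputation followed by ridge-regularized logistic regression, might look as follows; the data-generating model, the missingness rate, and the regularization level are arbitrary illustrative choices.
\begin{verbatim}
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic logistic data; dimensions and signal strength are arbitrary.
n, d = 2000, 400
theta = rng.standard_normal(d) / np.sqrt(d)
X = rng.standard_normal((n, d))
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ theta))).astype(int)

# Covariates missing completely at random.
X_obs = X.copy()
X_obs[rng.random((n, d)) < 0.3] = np.nan

# Single (mean) imputation, then ridge-regularized logistic regression;
# C is the inverse ridge strength and its value here is untuned.
model = make_pipeline(
    SimpleImputer(strategy="mean"),
    LogisticRegression(penalty="l2", C=1.0, max_iter=1000),
)
model.fit(X_obs, y)
print("training accuracy:", model.score(X_obs, y))
\end{verbatim}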
We provide a tight asymptotic characterization of the error exponent for classical-quantum channel coding assisted by activated non-signaling correlations. Namely, we find that the optimal exponent -- also called the reliability function -- is equal to the well-known sphere packing bound, which can be written as a single-letter formula optimized over Petz-R\'enyi divergences. Remarkably, there is no critical rate, and as such our characterization remains tight for arbitrarily low rates below the capacity. On the achievability side, we further extend our results to fully quantum channels. Our proofs rely on semi-definite program duality and a dual representation of the Petz-R\'enyi divergences via Young inequalities. As a result of independent interest, we find that the Petz-R\'enyi divergences of order $\alpha\in[0,2]$ are upper bounded by the sandwiched R\'enyi divergences of order $1/(2-\alpha)\in[1/2,\infty]$.
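Writing $\bar{D}_\alpha$ for the Petz-R\'enyi divergence and $\widetilde{D}_\beta$ for the sandwiched R\'enyi divergence, with the standard definitions
\[
\bar{D}_\alpha(\rho\|\sigma) = \frac{1}{\alpha-1}\log\operatorname{Tr}\bigl[\rho^{\alpha}\sigma^{1-\alpha}\bigr],
\qquad
\widetilde{D}_\beta(\rho\|\sigma) = \frac{1}{\beta-1}\log\operatorname{Tr}\Bigl[\bigl(\sigma^{\frac{1-\beta}{2\beta}}\rho\,\sigma^{\frac{1-\beta}{2\beta}}\bigr)^{\beta}\Bigr],
\]
the last statement reads
\[
\bar{D}_\alpha(\rho\|\sigma) \le \widetilde{D}_{1/(2-\alpha)}(\rho\|\sigma),
\qquad \alpha\in[0,2].
\]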
We introduce GPTreeO, a flexible R package for scalable Gaussian process (GP) regression, particularly tailored to continual learning problems. GPTreeO builds upon the Dividing Local Gaussian Processes (DLGP) algorithm, in which a binary tree of local GP regressors is dynamically constructed using a continual stream of input data. In GPTreeO we extend the original DLGP algorithm by allowing continual optimisation of the GP hyperparameters, incorporating uncertainty calibration, and introducing new strategies for how the local partitions are created. Moreover, the modular code structure allows users to interface their favourite GP library to perform the local GP regressions in GPTreeO. The flexibility of GPTreeO gives the user fine-grained control over the balance between computational speed, accuracy, stability and smoothness. We conduct a sensitivity analysis to show how GPTreeO's configurable features impact the regression performance in a continual learning setting.
The Gibbs sampler (a.k.a. Glauber dynamics and heat-bath algorithm) is a popular Markov chain Monte Carlo algorithm that iteratively samples from the conditional distributions of a probability measure $\pi$ of interest. Under the assumption that $\pi$ is strongly log-concave, we show that the random scan Gibbs sampler contracts in relative entropy and provide a sharp characterization of the associated contraction rate. Assuming that evaluating conditionals is cheap compared to evaluating the joint density, our results imply that the number of full evaluations of $\pi$ needed for the Gibbs sampler to mix grows linearly with the condition number and is independent of the dimension. If $\pi$ is log-concave but not strongly so, the convergence rate in entropy degrades from exponential to polynomial. Our techniques are versatile and extend to Metropolis-within-Gibbs schemes and the Hit-and-Run algorithm. A comparison with gradient-based schemes and the connection with the optimization literature are also discussed.
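A minimal sketch of the random scan Gibbs sampler, on a Gaussian (hence strongly log-concave) target whose conditionals are available in closed form; the target, burn-in, and sample sizes are illustrative choices.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

# Strongly log-concave target: a zero-mean Gaussian with precision P.
d = 10
B = rng.standard_normal((d, d)) / np.sqrt(d)
P = B @ B.T + np.eye(d)  # symmetric positive-definite precision matrix

def random_scan_gibbs(x, n_steps):
    """Each step resamples one uniformly chosen coordinate from its
    exact conditional, which is Gaussian for a Gaussian target."""
    for _ in range(n_steps):
        i = rng.integers(d)
        mean = -(P[i] @ x - P[i, i] * x[i]) / P[i, i]
        x[i] = mean + rng.standard_normal() / np.sqrt(P[i, i])
    return x

x = 5.0 * rng.standard_normal(d)       # start far from the target
x = random_scan_gibbs(x, 50 * d)       # burn-in
samples = np.array([random_scan_gibbs(x, d).copy() for _ in range(5000)])
print("covariance error:",
      np.linalg.norm(np.cov(samples.T) - np.linalg.inv(P)))
\end{verbatim}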
We develop a numerical method for the simulation of incompressible viscous flows by integrating the random vortex method with the core idea of Large Eddy Simulation (LES). Specifically, we utilize the filtering method of LES, interpreted as spatial averaging, along with the integral representation theorem for parabolic equations, to obtain a closure scheme that may be used for calculating solutions of the Navier-Stokes equations. This approach circumvents the challenge of handling the non-locally integrable three-dimensional integral kernel in the random vortex method and facilitates the computation of numerical solutions of flow systems via a Monte Carlo method. Numerical simulations are carried out for both laminar and turbulent flows, demonstrating the validity and effectiveness of the method.
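The closure scheme itself rests on the integral representation theorem and is not reproduced here; the following sketch only illustrates the first ingredient, an LES filter interpreted as spatial averaging and evaluated by Monte Carlo, for an assumed Gaussian kernel and a toy velocity field.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

def filtered(u, x, delta, n_samples=200_000):
    """Gaussian spatial filter (G_delta * u)(x) = E[u(x + delta * Z)],
    Z ~ N(0, I), estimated by plain Monte Carlo averaging."""
    z = rng.standard_normal((n_samples, x.size))
    return np.mean(u(x + delta * z))

# Toy velocity component with a small-scale oscillation that the
# filter should smooth out.
u = lambda x: np.sin(x[..., 0]) + 0.1 * np.sin(50.0 * x[..., 0])

x = np.array([0.3, 0.0, 0.0])
print("u(x)       =", u(x))
print("(G * u)(x) =", filtered(u, x, delta=0.2))
\end{verbatim}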
In molecular dynamics, transport coefficients measure the sensitivity of the invariant probability measure of the stochastic dynamics at hand with respect to some perturbation. They are typically computed using either the linear response of nonequilibrium dynamics or the Green--Kubo formula. The estimators for both approaches have large variances, which motivates the study of variance reduction techniques for computing transport coefficients. We present an alternative approach, called the \emph{transient subtraction technique} (inspired by early work of Ciccotti and Jacucci in 1975), which amounts to simulating a transient dynamics from which we subtract a sensibly coupled equilibrium trajectory, resulting in an estimator with smaller variance. We present the mathematical formulation of the transient subtraction technique, give error estimates on the bias and variance of the associated estimator, and demonstrate the relevance of the method through numerical illustrations for various systems.
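A one-dimensional caricature of the transient subtraction idea, assuming overdamped Langevin dynamics, a constant forcing, and common-random-numbers coupling as the "sensible coupling"; the paper's precise estimator and error analysis are not reproduced here.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

# Overdamped Langevin in 1D: dX = -V'(X) dt + eta*F dt + sqrt(2/beta) dW.
beta, eta, F = 1.0, 0.1, 1.0
dt, n_steps, n_rep = 1e-3, 2000, 5000

def grad_V(x):
    return x + 0.25 * np.sin(2.0 * np.pi * x)  # illustrative potential

x = rng.standard_normal(n_rep)  # roughly equilibrated initial conditions
y = x.copy()                    # equilibrium copy, same initial conditions
response = np.zeros(n_steps)
for k in range(n_steps):
    dW = np.sqrt(dt) * rng.standard_normal(n_rep)  # SAME noise for both
    x += (-grad_V(x) + eta * F) * dt + np.sqrt(2.0 / beta) * dW  # transient
    y += -grad_V(y) * dt + np.sqrt(2.0 / beta) * dW              # equilibrium
    # Subtracting the coupled equilibrium trajectory keeps the variance
    # of the estimator small, since x and y remain close.
    response[k] = np.mean(x - y) / eta

print("steady-state linear response estimate:", response[-500:].mean())
\end{verbatim}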