We investigate a class of models for non-parametric estimation of probability density fields based on scattered samples of heterogeneous sizes. The considered SLGP models are spatial extensions of logistic Gaussian process models and inherit some of their theoretical properties, as well as their computational challenges. We introduce SLGPs from the perspective of random measures and their densities, and investigate links between properties of SLGPs and those of the underlying processes. Our inquiries are motivated by the ability of SLGPs to deliver probabilistic predictions of conditional distributions at candidate points, to allow (approximate) conditional simulations of probability densities, and to jointly predict multiple functionals of target distributions. We show that SLGP models induced by continuous GPs can be characterized by the joint Gaussianity of their log-increments, and we leverage this characterization to establish theoretical results pertaining to spatial regularity. We extend the notion of mean-square continuity to random measure fields and establish sufficient conditions on the covariance kernels underlying SLGPs for the associated models to enjoy such regularity properties. On the practical side, we propose an implementation relying on Random Fourier Features and demonstrate its applicability on synthetic examples and on temperature distributions at meteorological stations, including probabilistic predictions of densities at left-out stations.
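As an illustration of the kind of finite-dimensional construction such an implementation can rest on, the following minimal sketch draws an approximate sample path of a stationary GP via Random Fourier Features and turns it into a logistic-Gaussian-type density on [0, 1]; all names and parameter values are illustrative assumptions, not the authors' code.

```python
# Minimal sketch (not the paper's implementation): approximate a draw from a
# stationary GP with a squared-exponential kernel via Random Fourier Features,
# then normalise exp(.) of the draw into a density on [0, 1].
import numpy as np

rng = np.random.default_rng(0)
lengthscale, variance, n_features = 0.2, 1.0, 256   # illustrative values

# Spectral sampling for the squared-exponential kernel in 1D.
omega = rng.normal(0.0, 1.0 / lengthscale, size=n_features)   # frequencies
phase = rng.uniform(0.0, 2 * np.pi, size=n_features)          # phases
weights = rng.normal(0.0, 1.0, size=n_features)                # Gaussian weights

def gp_sample(x):
    """Approximate GP draw evaluated at points x (shape (n,))."""
    feats = np.sqrt(2.0 * variance / n_features) * np.cos(np.outer(x, omega) + phase)
    return feats @ weights

x = np.linspace(0.0, 1.0, 200)
z = gp_sample(x)
dx = x[1] - x[0]
density = np.exp(z) / (np.exp(z).sum() * dx)   # logistic-Gaussian-type density
```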
Identification of nonlinear dynamical systems has been popularized by sparse identification of nonlinear dynamics (SINDy) via the sequentially thresholded least squares (STLS) algorithm. Many extensions of SINDy have emerged in the literature to deal with experimental data, which are finite in length and noisy. Recently, the computationally intensive method of ensembling bootstrapped SINDy models (E-SINDy) was proposed for model identification from finite, highly noisy data. While the extensions of SINDy are numerous, their sparsity-promoting estimators occasionally provide only sparse approximations of the dynamics rather than exact recovery. Furthermore, these estimators suffer under multicollinearity, e.g. through the irrepresentable condition for the Lasso. In this paper, we demonstrate that the Trimmed Lasso for robust identification of models (TRIM) can provide exact recovery under more severe noise, finite data, and multicollinearity than E-SINDy. Additionally, the computational cost of TRIM is asymptotically equal to that of STLS, since the sparsity parameter of TRIM can be solved for efficiently by convex solvers. We compare these methodologies on challenging nonlinear systems, specifically the Lorenz 63 system, the Bouc-Wen oscillator from the nonlinear dynamics benchmark of No\"el and Schoukens, 2016, and a time-delay system describing tool cutting dynamics. This study emphasizes the comparisons between STLS, reweighted $\ell_1$ minimization, and the Trimmed Lasso in identification with respect to problems faced by practitioners: finite and noisy data, the performance of the sparse regression when the library grows in dimension (multicollinearity), and automatic methods for choosing regularization parameters.
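For reference, the STLS step at the heart of SINDy can be written in a few lines: alternate a least-squares fit over a candidate library with hard thresholding of small coefficients. In the sketch below the library matrix, derivative data, and threshold are illustrative placeholders, not the paper's setup.

```python
# Minimal sketch of sequentially thresholded least squares (STLS) for SINDy.
import numpy as np

def stls(Theta, dXdt, threshold=0.1, n_iter=10):
    """Theta: (m, p) candidate library; dXdt: (m, d) estimated time derivatives."""
    Xi, *_ = np.linalg.lstsq(Theta, dXdt, rcond=None)   # initial dense fit
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold                    # coefficients to prune
        Xi[small] = 0.0
        for k in range(dXdt.shape[1]):                    # refit each state separately
            big = ~small[:, k]
            if big.any():
                Xi[big, k], *_ = np.linalg.lstsq(Theta[:, big], dXdt[:, k], rcond=None)
    return Xi
```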
We discuss applications of exact structures and relative homological algebra to the study of invariants of multiparameter persistence modules. This paper is mostly expository, but does contain a pair of novel results. Over finite posets, classical arguments about the relative projective modules of an exact structure make use of Auslander-Reiten theory. One of our results establishes a new adjunction which allows us to "lift" these arguments to certain infinite posets over which Auslander-Reiten theory is not available. We give several examples of this lifting, in particular highlighting the non-existence and existence of resolutions by upsets when working with finitely presentable representations of the plane and of the closure of the positive quadrant, respectively. We then restrict our attention to finite posets. In this setting, we discuss the relationship between the global dimension of an exact structure and the representation dimension of the incidence algebra of the poset. We conclude with our second novel contribution. This is an explicit description of the irreducible morphisms between relative projective modules for several exact structures which have appeared previously in the literature.
We propose an automated nonlinear model reduction and mesh adaptation framework for rapid and reliable solution of parameterized advection-dominated problems, with emphasis on compressible flows. The key features of our approach are threefold: (i) a metric-based mesh adaptation technique to generate an accurate mesh for a range of parameters, (ii) a general (i.e., independent of the underlying equations) registration procedure for the computation of a mapping $\Phi$ that tracks moving features of the solution field, and (iii) a hyper-reduced least-squares Petrov-Galerkin reduced-order model for the rapid and reliable estimation of the mapped solution. We discuss a general paradigm -- which mimics the refinement loop considered in mesh adaptation -- to simultaneously construct the high-fidelity and the reduced-order approximations, and we present actionable strategies to accelerate the offline phase. We present extensive numerical investigations for a quasi-1D nozzle problem and for a two-dimensional inviscid flow past a Gaussian bump to display the many features of the methodology and to assess its performance for problems with discontinuous solutions.
The aim of this work is to present a model reduction technique in the framework of optimal control problems for partial differential equations. We combine two approaches used to reduce the computational cost of numerical models: domain-decomposition (DD) methods and reduced-order modelling (ROM). In particular, we consider an optimisation-based domain-decomposition algorithm for the parameter-dependent stationary incompressible Navier-Stokes equations. First, the problem is described on the subdomains coupled at the interface and solved through an optimal control problem, which leads to the complete separation of the subdomain problems in the DD method. A reduced model for the obtained optimal control problem is then built; the procedure is based on the Proper Orthogonal Decomposition technique and a further Galerkin projection. The presented methodology is tested on two fluid dynamics benchmarks: the stationary backward-facing step and the lid-driven cavity flow. The numerical tests show a significant reduction of the computational costs in terms of both the problem dimensions and the number of optimisation iterations in the domain-decomposition algorithm.
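A minimal sketch of the reduction step named above, Proper Orthogonal Decomposition of a snapshot matrix followed by a Galerkin projection of a discrete operator, is given below; the snapshot matrix, operator, and energy threshold are illustrative placeholders rather than the actual Navier-Stokes setup.

```python
# Minimal sketch of POD + Galerkin projection for a generic linear problem A u = f.
import numpy as np

def pod_basis(S, energy=0.999):
    """Columns of S are high-fidelity snapshots; keep modes capturing `energy`."""
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(cumulative, energy)) + 1
    return U[:, :r]

def galerkin_rom(A, f, V):
    """Project A u = f onto the POD space spanned by the columns of V."""
    A_r = V.T @ A @ V
    f_r = V.T @ f
    u_r = np.linalg.solve(A_r, f_r)
    return V @ u_r        # reconstructed full-order approximation
```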
Linear regression and classification models with repeated functional data are considered. For each statistical unit in the sample, a real-valued parameter is observed over time under different conditions. Two regression models based on fusion penalties are presented. The first one is a generalization of the variable fusion model based on the 1-nearest neighbor. The second one, called group fusion lasso, assumes some grouping structure of the conditions and allows for homogeneity among the regression coefficient functions within groups. A finite-sample simulation study and an application to EEG data are presented.
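To make the fusion idea concrete, the sketch below implements a scalar-covariate analogue: successive coefficient differences are penalised by rewriting the coefficients through a lower-triangular reparameterisation and running an ordinary lasso. The data and penalty level are placeholders, and this is not the functional estimator studied above.

```python
# Minimal sketch of a variable-fusion penalty via reparameterisation to a lasso.
import numpy as np
from sklearn.linear_model import Lasso

def variable_fusion(X, y, alpha=0.1):
    n, p = X.shape
    L = np.tril(np.ones((p, p)))           # beta_j = theta_1 + ... + theta_j
    # Lasso on theta penalises |beta_j - beta_{j-1}| for j >= 2
    # (for simplicity the first level theta_1 is penalised as well).
    theta = Lasso(alpha=alpha, fit_intercept=True).fit(X @ L, y).coef_
    return L @ theta                        # fused coefficient vector
```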
We propose a new discrete choice model, called the generalized stochastic preference (GSP) model, that incorporates non-rationality into the stochastic preference (SP) choice model, also known as the rank-based choice model. Our model can explain several choice phenomena that cannot be represented by any SP model, such as the compromise and attraction effects, but still subsumes the SP model class. The GSP model is defined as a distribution over consumer types, where each type extends the choice behavior of rational types in the SP model. We build on existing methods for estimating the SP model and propose an iterative estimation algorithm for the GSP model that finds new types by solving an integer linear program in each iteration. We further show that our proposed notion of non-rationality can be incorporated into other choice models, such as the random utility maximization (RUM) model class as well as any of its subclasses. As a concrete example, we introduce the non-rational extension of the classical MNL model, which we term the generalized MNL (GMNL) model, and present an efficient expectation-maximization (EM) algorithm for estimating the GMNL model. Numerical evaluation on real choice data shows that the GMNL and GSP models can outperform their rational counterparts in out-of-sample prediction accuracy.
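For orientation, the rational building block that the GMNL extends is the multinomial logit choice probability over an offered assortment; a minimal sketch follows, with the utilities and the assortment as illustrative placeholders.

```python
# Minimal sketch of MNL choice probabilities restricted to an offered assortment.
import numpy as np

def mnl_choice_probs(utilities, assortment):
    """utilities: (n_products,) mean utilities; assortment: indices on offer."""
    u = np.asarray(utilities, dtype=float)[list(assortment)]
    w = np.exp(u - u.max())               # stabilised softmax over offered items
    return w / w.sum()

# Example: choice probabilities when products {0, 2, 3} are offered together.
probs = mnl_choice_probs([1.0, 0.3, -0.2, 0.5], assortment=[0, 2, 3])
```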
We provide a framework for the numerical approximation of distributed optimal control problems, based on least-squares finite element methods. Our proposed method simultaneously solves the state and adjoint equations and is $\inf$--$\sup$ stable for any choice of conforming discretization spaces. A reliable and efficient a posteriori error estimator is derived for problems where box constraints are imposed on the control. It can be localized and therefore used to steer an adaptive algorithm. For unconstrained optimal control problems, i.e., the set of controls being a Hilbert space, we obtain a coercive least-squares method and, in particular, quasi-optimality for any choice of discrete approximation space. For constrained problems we derive and analyze a variational inequality where the PDE part is tackled by least-squares finite element methods. We show that the abstract framework can be applied to a wide range of problems, including scalar second-order PDEs, the Stokes problem, and parabolic problems on space-time domains. Numerical examples for some selected problems are presented.
We consider two-phase fluid deformable surfaces as model systems for biomembranes. Such surfaces are modeled by incompressible surface Navier-Stokes-Cahn-Hilliard-like equations with bending forces. We derive this model using the Lagrange-D'Alembert principle considering various dissipation mechanisms. The highly nonlinear model is solved numerically to explore the tight interplay between surface evolution, surface phase composition, surface curvature and surface hydrodynamics. It is demonstrated that hydrodynamics can enhance bulging and furrow formation, which both can further develop to pinch-offs. The numerical approach builds on a Taylor-Hood element for the surface Navier-Stokes part, a semi-implicit approach for the Cahn-Hilliard part, higher order surface parametrizations, appropriate approximations of the geometric quantities, and mesh redistribution. We demonstrate convergence properties that are known to be optimal for simplified sub-problems.
Developing an efficient computational scheme for high-dimensional Bayesian variable selection in generalised linear models and survival models has always been a challenging problem, due to the absence of closed-form solutions for the marginal likelihood. The RJMCMC approach can be employed to sample the model and coefficients jointly, but effective design of the transdimensional jumps of RJMCMC can be challenging, making it hard to implement. Alternatively, the marginal likelihood can be derived using a data-augmentation scheme (e.g. Polya-gamma data augmentation for logistic regression) or through other estimation methods. However, suitable data-augmentation schemes are not available for every generalised linear or survival model, and using estimates such as the Laplace approximation or the correlated pseudo-marginal method to derive the marginal likelihood within a locally informed proposal can be computationally expensive in "large n, large p" settings. In this paper, three main contributions are presented. Firstly, we present an extended Point-wise implementation of the Adaptive Random Neighbourhood Informed proposal (PARNI) to efficiently sample models directly from the marginal posterior distribution in both generalised linear models and survival models. Secondly, in light of the approximate Laplace approximation, we describe an efficient and accurate estimation method for the marginal likelihood that involves adaptive parameters. Additionally, we describe a new method to adapt the algorithmic tuning parameters of the PARNI proposal by replacing the Rao-Blackwellised estimates with the combination of a warm-start estimate and an ergodic average. We present numerous numerical results from simulated data and eight high-dimensional gene fine-mapping data-sets to showcase the efficiency of the novel PARNI proposal compared to the baseline add-delete-swap proposal.
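As a pointer to the kind of per-model quantity such samplers rely on, the sketch below computes a Laplace approximation of the log marginal likelihood for a Bayesian logistic regression with an isotropic Gaussian prior; it is an illustration, not the paper's adaptive estimator, and the data and prior scale are placeholders.

```python
# Minimal sketch: Laplace approximation of the log marginal likelihood for
# Bayesian logistic regression with a N(0, tau^2 I) prior on the coefficients.
import numpy as np
from scipy.optimize import minimize

def log_marginal_laplace(X, y, tau=1.0):
    n, p = X.shape

    def neg_log_post(beta):
        eta = X @ beta
        loglik = y @ eta - np.logaddexp(0.0, eta).sum()    # Bernoulli log-likelihood
        logprior = -0.5 * beta @ beta / tau**2 - 0.5 * p * np.log(2 * np.pi * tau**2)
        return -(loglik + logprior)

    res = minimize(neg_log_post, np.zeros(p), method="BFGS")   # posterior mode
    beta_hat = res.x
    mu = 1.0 / (1.0 + np.exp(-(X @ beta_hat)))
    H = X.T @ (X * (mu * (1 - mu))[:, None]) + np.eye(p) / tau**2   # neg. Hessian at mode
    _, logdet = np.linalg.slogdet(H)
    # log p(y) ~ log p(y, beta_hat) + (p/2) log(2*pi) - (1/2) log|H|
    return -res.fun + 0.5 * p * np.log(2 * np.pi) - 0.5 * logdet
```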
We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.