Among randomized numerical linear algebra strategies, so-called sketching procedures are emerging as effective reduction means to accelerate the computation of Krylov subspace methods for, e.g., the solution of linear systems, eigenvalue computations, and the approximation of matrix functions. While there is plenty of experimental evidence showing that sketched Krylov solvers may dramatically improve performance over standard Krylov methods, many features of these schemes are still unexplored. We derive new theoretical results that allow us to significantly improve our understanding of sketched Krylov methods, and to identify, among several possible equivalent formulations, the most suitable sketched approximations according to their numerical stability properties. These results are also employed to analyze the error of sketched Krylov methods in the approximation of the action of matrix functions, significantly contributing to the theory available in the current literature.
Sequential algorithms such as sequential importance sampling (SIS) and sequential Monte Carlo (SMC) have proven fundamental in Bayesian inference for models not admitting a readily available likelihood function. For approximate Bayesian computation (ABC), SMC-ABC is the state-of-art sampler. However, since the ABC paradigm is intrinsically wasteful, sequential ABC schemes can benefit from well-targeted proposal samplers that efficiently avoid improbable parameter regions. We contribute to the ABC modeller's toolbox with novel proposal samplers that are conditional to summary statistics of the data. In a sense, the proposed parameters are "guided" to rapidly reach regions of the posterior surface that are compatible with the observed data. This speeds up the convergence of these sequential samplers, thus reducing the computational effort, while preserving the accuracy in the inference. We provide a variety of guided Gaussian and copula-based samplers for both SIS-ABC and SMC-ABC easing inference for challenging case-studies, including multimodal posteriors, highly correlated posteriors, hierarchical models with high-dimensional summary statistics (180 summaries used to infer 21 parameters) and a simulation study of cell movements (using more than 400 summaries).
The aim of this work is to present a model reduction technique in the framework of optimal control problems for partial differential equations. We combine two approaches used for reducing the computational cost of the mathematical numerical models: domain-decomposition (DD) methods and reduced-order modelling (ROM). In particular, we consider an optimisation-based domain-decomposition algorithm for the parameter-dependent stationary incompressible Navier-Stokes equations. Firstly, the problem is described on the subdomains coupled at the interface and solved through an optimal control problem, which leads to the complete separation of the subdomain problems in the DD method. On top of that, a reduced model for the obtained optimal-control problem is built; the procedure is based on the Proper Orthogonal Decomposition technique and a further Galerkin projection. The presented methodology is tested on two fluid dynamics benchmarks: the stationary backward-facing step and lid-driven cavity flow. The numerical tests show a significant reduction of the computational costs in terms of both the problem dimensions and the number of optimisation iterations in the domain-decomposition algorithm.
Mesh-based simulations play a key role when modeling complex physical systems that, in many disciplines across science and engineering, require the solution of parametrized time-dependent nonlinear partial differential equations (PDEs). In this context, full order models (FOMs), such as those relying on the finite element method, can reach high levels of accuracy, however often yielding intensive simulations to run. For this reason, surrogate models are developed to replace computationally expensive solvers with more efficient ones, which can strike favorable trade-offs between accuracy and efficiency. This work explores the potential usage of graph neural networks (GNNs) for the simulation of time-dependent PDEs in the presence of geometrical variability. In particular, we propose a systematic strategy to build surrogate models based on a data-driven time-stepping scheme where a GNN architecture is used to efficiently evolve the system. With respect to the majority of surrogate models, the proposed approach stands out for its ability of tackling problems with parameter dependent spatial domains, while simultaneously generalizing to different geometries and mesh resolutions. We assess the effectiveness of the proposed approach through a series of numerical experiments, involving both two- and three-dimensional problems, showing that GNNs can provide a valid alternative to traditional surrogate models in terms of computational efficiency and generalization to new scenarios. We also assess, from a numerical standpoint, the importance of using GNNs, rather than classical dense deep neural networks, for the proposed framework.
The EM algorithm is a popular tool for maximum likelihood estimation but has not been used much for high-dimensional regularization problems in linear mixed-effects models. In this paper, we introduce the EMLMLasso algorithm, which combines the EM algorithm and the popular and efficient R package glmnet for Lasso variable selection of fixed effects in linear mixed-effects models. We compare the performance of our proposed EMLMLasso algorithm with the one implemented in the well-known R package glmmLasso through the analyses of both simulated and real-world applications. The simulations and applications demonstrated good properties, such as consistency, and the effectiveness of the proposed variable selection procedure, for both $p < n$ and $p > n$. Moreover, in all evaluated scenarios, the EMLMLasso algorithm outperformed glmmLasso. The proposed method is quite general and can be easily extended for ridge and elastic net penalties in linear mixed-effects models.
We propose a new discrete choice model, called the generalized stochastic preference (GSP) model, that incorporates non-rationality into the stochastic preference (SP) choice model, also known as the rank- based choice model. Our model can explain several choice phenomena that cannot be represented by any SP model such as the compromise and attraction effects, but still subsumes the SP model class. The GSP model is defined as a distribution over consumer types, where each type extends the choice behavior of rational types in the SP model. We build on existing methods for estimating the SP model and propose an iterative estimation algorithm for the GSP model that finds new types by solving a integer linear program in each iteration. We further show that our proposed notion of non-rationality can be incorporated into other choice models, like the random utility maximization (RUM) model class as well as any of its subclasses. As a concrete example, we introduce the non-rational extension of the classical MNL model, which we term the generalized MNL (GMNL) model and present an efficient expectation-maximization (EM) algorithm for estimating the GMNL model. Numerical evaluation on real choice data shows that the GMNL and GSP models can outperform their rational counterparts in out-of-sample prediction accuracy.
We provide a framework for the numerical approximation of distributed optimal control problems, based on least-squares finite element methods. Our proposed method simultaneously solves the state and adjoint equations and is $\inf$--$\sup$ stable for any choice of conforming discretization spaces. A reliable and efficient a posteriori error estimator is derived for problems where box constraints are imposed on the control. It can be localized and therefore used to steer an adaptive algorithm. For unconstrained optimal control problems, i.e., the set of controls being a Hilbert space, we obtain a coercive least-squares method and, in particular, quasi-optimality for any choice of discrete approximation space. For constrained problems we derive and analyze a variational inequality where the PDE part is tackled by least-squares finite element methods. We show that the abstract framework can be applied to a wide range of problems, including scalar second-order PDEs, the Stokes problem, and parabolic problems on space-time domains. Numerical examples for some selected problems are presented.
We propose a novel stochastic algorithm that randomly samples entire rows and columns of the matrix as a way to approximate an arbitrary matrix function. This contrasts with the "classical" Monte Carlo method which only works with one entry at a time, resulting in a significant better convergence rate than the "classical" approach. To assess the applicability of our method, we compute the subgraph centrality and total communicability of several large networks. In all benchmarks analyzed so far, the performance of our method was significantly superior to the competition, being able to scale up to 64 CPU cores with a remarkable efficiency.
High-dimensional Partial Differential Equations (PDEs) are a popular mathematical modelling tool, with applications ranging from finance to computational chemistry. However, standard numerical techniques for solving these PDEs are typically affected by the curse of dimensionality. In this work, we tackle this challenge while focusing on stationary diffusion equations defined over a high-dimensional domain with periodic boundary conditions. Inspired by recent progress in sparse function approximation in high dimensions, we propose a new method called compressive Fourier collocation. Combining ideas from compressive sensing and spectral collocation, our method replaces the use of structured collocation grids with Monte Carlo sampling and employs sparse recovery techniques, such as orthogonal matching pursuit and $\ell^1$ minimization, to approximate the Fourier coefficients of the PDE solution. We conduct a rigorous theoretical analysis showing that the approximation error of the proposed method is comparable with the best $s$-term approximation (with respect to the Fourier basis) to the solution. Using the recently introduced framework of random sampling in bounded Riesz systems, our analysis shows that the compressive Fourier collocation method mitigates the curse of dimensionality with respect to the number of collocation points under sufficient conditions on the regularity of the diffusion coefficient. We also present numerical experiments that illustrate the accuracy and stability of the method for the approximation of sparse and compressible solutions.
We develop a numerical method for computing with orthogonal polynomials that are orthogonal on multiple, disjoint intervals for which analytical formulae are currently unknown. Our approach exploits the Fokas--Its--Kitaev Riemann--Hilbert representation of the orthogonal polynomials to produce an $\text{O}(N)$ method to compute the first $N$ recurrence coefficients. The method can also be used for pointwise evaluation of the polynomials and their Cauchy transforms throughout the complex plane. The method encodes the singularity behavior of weight functions using weighted Cauchy integrals of Chebyshev polynomials. This greatly improves the efficiency of the method, outperforming other available techniques. We demonstrate the fast convergence of our method and present applications to integrable systems and approximation theory.
Developing an efficient computational scheme for high-dimensional Bayesian variable selection in generalised linear models and survival models has always been a challenging problem due to the absence of closed-form solutions for the marginal likelihood. The RJMCMC approach can be employed to samples model and coefficients jointly, but effective design of the transdimensional jumps of RJMCMC can be challenge, making it hard to implement. Alternatively, the marginal likelihood can be derived using data-augmentation scheme e.g. Polya-gamma data argumentation for logistic regression) or through other estimation methods. However, suitable data-augmentation schemes are not available for every generalised linear and survival models, and using estimations such as Laplace approximation or correlated pseudo-marginal to derive marginal likelihood within a locally informed proposal can be computationally expensive in the "large n, large p" settings. In this paper, three main contributions are presented. Firstly, we present an extended Point-wise implementation of Adaptive Random Neighbourhood Informed proposal (PARNI) to efficiently sample models directly from the marginal posterior distribution in both generalised linear models and survival models. Secondly, in the light of the approximate Laplace approximation, we also describe an efficient and accurate estimation method for the marginal likelihood which involves adaptive parameters. Additionally, we describe a new method to adapt the algorithmic tuning parameters of the PARNI proposal by replacing the Rao-Blackwellised estimates with the combination of a warm-start estimate and an ergodic average. We present numerous numerical results from simulated data and 8 high-dimensional gene fine mapping data-sets to showcase the efficiency of the novel PARNI proposal compared to the baseline add-delete-swap proposal.