Selective inference methods are developed for group lasso estimators for use with a wide class of distributions and loss functions. The method includes the use of exponential family distributions, as well as quasi-likelihood modeling for overdispersed count data, for example, and allows for categorical or grouped covariates as well as continuous covariates. A randomized group-regularized optimization problem is studied. The added randomization allows us to construct a post-selection likelihood which we show to be adequate for selective inference when conditioning on the event of the selection of the grouped covariates. This likelihood also provides a selective point estimator, accounting for the selection by the group lasso. Confidence regions for the regression parameters in the selected model take the form of Wald-type regions and are shown to have bounded volume. The selective inference method for grouped lasso is illustrated on data from the national health and nutrition examination survey while simulations showcase its behaviour and favorable comparison with other methods.
The stability of nonlinear waves on spatially extended domains is commonly probed by computing the spectrum of the linearization of the underlying PDE about the wave profile. It is known that convective transport, whether driven by the nonlinear pattern itself or an underlying fluid flow, can cause exponential growth of the resolvent of the linearization as a function of the domain length. In particular, sparse eigenvalue algorithms may result in inaccurate and spurious spectra in the convective regime. In this work, we focus on spiral waves, which arise in many natural processes and which exhibit convective transport. We prove that exponential weights can serve as effective, inexpensive preconditioners that result in resolvents that are uniformly bounded in the domain size and that stabilize numerical spectral computations. We also show that the optimal exponential rates can be computed reliably from a simpler asymptotic problem posed in one space dimension.
This paper presents a learnable solver tailored to iteratively solve sparse linear systems from discretized partial differential equations (PDEs). Unlike traditional approaches relying on specialized expertise, our solver streamlines the algorithm design process for a class of PDEs through training, which requires only training data of coefficient distributions. The proposed method is anchored by three core principles: (1) a multilevel hierarchy to promote rapid convergence, (2) adherence to linearity concerning the right-hand-side of equations, and (3) weights sharing across different levels to facilitate adaptability to various problem sizes. Built on these foundational principles and considering the similar computation pattern of the convolutional neural network (CNN) as multigrid components, we introduce a network adept at solving linear systems from PDEs with heterogeneous coefficients, discretized on structured grids. Notably, our proposed solver possesses the ability to generalize over right-hand-side terms, PDE coefficients, and grid sizes, thereby ensuring its training is purely offline. To evaluate its effectiveness, we train the solver on convection-diffusion equations featuring heterogeneous diffusion coefficients. The solver exhibits swift convergence to high accuracy over a range of grid sizes, extending from $31 \times 31$ to $4095 \times 4095$. Remarkably, our method outperforms the classical Geometric Multigrid (GMG) solver, demonstrating a speedup of approximately 3 to 8 times. Furthermore, our numerical investigation into the solver's capacity to generalize to untrained coefficient distributions reveals promising outcomes.
We develop further the theory of monoidal bicategories by introducing and studying bicate- gorical counterparts of the notions of a linear explonential comonad, as considered in the study of linear logic, and of a codereliction transformation, introduced to study differential linear logic via differential categories. As an application, we extend the differential calculus of Joyal's analytic functors to analytic functors between presheaf categories, just as ordinary calculus extends from a single variable to many variables.
We propose a novel score-based particle method for solving the Landau equation in plasmas, that seamlessly integrates learning with structure-preserving particle methods [arXiv:1910.03080]. Building upon the Lagrangian viewpoint of the Landau equation, a central challenge stems from the nonlinear dependence of the velocity field on the density. Our primary innovation lies in recognizing that this nonlinearity is in the form of the score function, which can be approximated dynamically via techniques from score-matching. The resulting method inherits the conservation properties of the deterministic particle method while sidestepping the necessity for kernel density estimation in [arXiv:1910.03080]. This streamlines computation and enhances scalability with dimensionality. Furthermore, we provide a theoretical estimate by demonstrating that the KL divergence between our approximation and the true solution can be effectively controlled by the score-matching loss. Additionally, by adopting the flow map viewpoint, we derive an update formula for exact density computation. Extensive examples have been provided to show the efficiency of the method, including a physically relevant case of Coulomb interaction.
In smoothed particle hydrodynamics (SPH) method, the particle-based approximations are implemented via kernel functions, and the evaluation of performance involves two key criteria: numerical accuracy and computational efficiency. In the SPH community, the Wendland kernel reigns as the prevailing choice due to its commendable accuracy and reasonable computational efficiency. Nevertheless, there exists an urgent need to enhance the computational efficiency of numerical methods while upholding accuracy. In this paper, we employ a truncation approach to limit the compact support of the Wendland kernel to 1.6h. This decision is based on the observation that particles within the range of 1.6h to 2h make negligible contributions, practically approaching zero, to the SPH approximation. To address integration errors stemming from the truncation, we incorporate the Laguerre-Gauss kernel for particle relaxation due to the fact that this kernel has been demonstrated to enable the attainment of particle distributions with reduced residue and integration errors \cite{wang2023fourth}. Furthermore, we introduce the kernel gradient correction to rectify numerical errors from the SPH approximation of kernel gradient and the truncated compact support. A comprehensive set of numerical examples including fluid dynamics in Eulerian formulation and solid dynamics in total Lagrangian formulation are tested and have demonstrated that truncated and standard Wendland kernels enable achieve the same level accuracy but the former significantly increase the computational efficiency.
This paper presents an analysis of properties of two hybrid discretization methods for Gaussian derivatives, based on convolutions with either the normalized sampled Gaussian kernel or the integrated Gaussian kernel followed by central differences. The motivation for studying these discretization methods is that in situations when multiple spatial derivatives of different order are needed at the same scale level, they can be computed significantly more efficiently compared to more direct derivative approximations based on explicit convolutions with either sampled Gaussian kernels or integrated Gaussian kernels. While these computational benefits do also hold for the genuinely discrete approach for computing discrete analogues of Gaussian derivatives, based on convolution with the discrete analogue of the Gaussian kernel followed by central differences, the underlying mathematical primitives for the discrete analogue of the Gaussian kernel, in terms of modified Bessel functions of integer order, may not be available in certain frameworks for image processing, such as when performing deep learning based on scale-parameterized filters in terms of Gaussian derivatives, with learning of the scale levels. In this paper, we present a characterization of the properties of these hybrid discretization methods, in terms of quantitative performance measures concerning the amount of spatial smoothing that they imply, as well as the relative consistency of scale estimates obtained from scale-invariant feature detectors with automatic scale selection, with an emphasis on the behaviour for very small values of the scale parameter, which may differ significantly from corresponding results obtained from the fully continuous scale-space theory, as well as between different types of discretization methods.
Adaptive finite elements combined with geometric multigrid solvers are one of the most efficient numerical methods for problems such as the instationary Navier-Stokes equations. Yet despite their efficiency, computations remain expensive and the simulation of, for example, complex flow problems can take many hours or days. GPUs provide an interesting avenue to speed up the calculations due to their very large theoretical peak performance. However, the large degree of parallelism and non-standard API make the use of GPUs in scientific computing challenging. In this work, we develop a GPU acceleration for the adaptive finite element library Gascoigne and study its effectiveness for different systems of partial differential equations. Through the systematic formulation of all computations as linear algebra operations, we can employ GPU-accelerated linear algebra libraries, which simplifies the implementation and ensures the maintainability of the code while achieving very efficient GPU utilizations. Our results for a transport-diffusion equation, linear elasticity, and the instationary Navier-Stokes equations show substantial speedups of up to 20X compared to multi-core CPU implementations.
Randomized iterative methods, such as the Kaczmarz method and its variants, have gained growing attention due to their simplicity and efficiency in solving large-scale linear systems. Meanwhile, absolute value equations (AVE) have attracted increasing interest due to their connection with the linear complementarity problem. In this paper, we investigate the application of randomized iterative methods to generalized AVE (GAVE). Our approach differs from most existing works in that we tackle GAVE with non-square coefficient matrices. We establish more comprehensive sufficient and necessary conditions for characterizing the solvability of GAVE and propose precise error bound conditions. Furthermore, we introduce a flexible and efficient randomized iterative algorithmic framework for solving GAVE, which employs sampling matrices drawn from user-specified distributions. This framework is capable of encompassing many well-known methods, including the Picard iteration method and the randomized Kaczmarz method. Leveraging our findings on solvability and error bounds, we establish both almost sure convergence and linear convergence rates for this versatile algorithmic framework. Finally, we present numerical examples to illustrate the advantages of the new algorithms.
Permutation invariance is among the most common symmetry that can be exploited to simplify complex problems in machine learning (ML). There has been a tremendous surge of research activities in building permutation invariant ML architectures. However, less attention is given to: (1) how to statistically test for permutation invariance of coordinates in a random vector where the dimension is allowed to grow with the sample size; (2) how to leverage permutation invariance in estimation problems and how does it help reduce dimensions. In this paper, we take a step back and examine these questions in several fundamental problems: (i) testing the assumption of permutation invariance of multivariate distributions; (ii) estimating permutation invariant densities; (iii) analyzing the metric entropy of permutation invariant function classes and compare them with their counterparts without imposing permutation invariance; (iv) deriving an embedding of permutation invariant reproducing kernel Hilbert spaces for efficient computation. In particular, our methods for (i) and (iv) are based on a sorting trick and (ii) is based on an averaging trick. These tricks substantially simplify the exploitation of permutation invariance.
This manuscript derives locally weighted ensemble Kalman methods from the point of view of ensemble-based function approximation. This is done by using pointwise evaluations to build up a local linear or quadratic approximation of a function, tapering off the effect of distant particles via local weighting. This introduces a candidate method (the locally weighted Ensemble Kalman method for inversion) with the motivation of combining some of the strengths of the particle filter (ability to cope with nonlinear maps and non-Gaussian distributions) and the Ensemble Kalman filter (no filter degeneracy).