We study some properties of a multi-species degenerate Ginzburg-Landau energy and its relation to a cross-diffusion Cahn-Hilliard system. The model is motivated by multicomponent mixtures in which cross-diffusion effects between the different species are taken into account, and in which only one species separates from the others. Using a comparison argument, we obtain strict bounds on the minimizers, from which we derive first-order optimality conditions, revealing a link with the single-species energy and providing enough regularity to qualify the minimizers as stationary solutions of the evolution system. We also discuss convexity properties of the energy as well as the long-time asymptotics of the time-dependent problem. Lastly, we introduce a structure-preserving finite volume scheme for the time-dependent problem and present several numerical experiments in one and two spatial dimensions.
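For reference, the classical single-species Ginzburg-Landau energy to which the multi-species functional is linked has the standard form (the paper's specific potential, degeneracy, and mobility may differ):
\[
\mathcal{E}(u) \;=\; \int_\Omega \frac{\varepsilon^2}{2}\,\lvert\nabla u\rvert^2 + W(u)\,\mathrm{d}x,
\qquad W(u) = \frac{1}{4}\bigl(1-u^2\bigr)^2,
\]
where $u$ is the order parameter of the separating species, $\varepsilon>0$ sets the interface width, and the double-well potential $W$ attains its minima at the pure phases.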
We introduce a new stochastic algorithm for solving entropic optimal transport (EOT) between two absolutely continuous probability measures $\mu$ and $\nu$. Our work is motivated by the specific setting of Monge-Kantorovich quantiles, where the source measure $\mu$ is either the uniform distribution on the unit hypercube or the spherical uniform distribution. Exploiting the knowledge of the source measure, we propose to parametrize a Kantorovich dual potential by its Fourier coefficients. In this way, each iteration of our stochastic algorithm reduces to two Fourier transforms, which enables us to make use of the Fast Fourier Transform (FFT) to implement a fast numerical method for solving EOT. We study the almost sure convergence of our stochastic algorithm, which takes its values in an infinite-dimensional Banach space. Then, using numerical experiments, we illustrate the performance of our approach on the computation of regularized Monge-Kantorovich quantiles. In particular, we investigate the potential benefits of entropic regularization for the smooth estimation of multivariate quantiles using data sampled from the target measure $\nu$.
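To make the scheme concrete, here is a heavily simplified one-dimensional numpy sketch of a stochastic semi-dual ascent of this type: the dual potential is stored through its Fourier coefficients, and each iteration draws one sample from the target measure and performs two FFTs (coefficients to grid values, then the adjoint transform of the gradient). The grid size, quadratic cost, step size, and Gaussian stand-in for $\nu$ are illustrative assumptions, not the paper's actual choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: mu = Uniform([0,1)) discretized on a grid, nu sampled online.
n   = 256                            # grid size for the source domain [0, 1)
eps = 0.05                           # entropic regularization strength
x   = np.arange(n) / n               # grid points carrying uniform mu-weights 1/n
c   = lambda y: 0.5 * (x - y) ** 2   # quadratic ground cost c(x, y)

u_hat = np.zeros(n, dtype=complex)   # Fourier coefficients of the dual potential

for t in range(1, 5001):
    u = np.fft.ifft(u_hat).real      # FFT #1: coefficients -> grid values
    y = rng.normal(0.5, 0.1)         # one sample from the target measure nu

    # Softmax weights of the smoothed c-transform at y (log-sum-exp for stability)
    z = (u - c(y)) / eps
    p = np.exp(z - z.max())
    p /= p.sum()

    g = 1.0 / n - p                  # stochastic semi-dual gradient on the grid
    u_hat += (1.0 / np.sqrt(t)) * np.fft.fft(g)   # FFT #2: adjoint step on coefficients
```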
The macro-element variant of the hybridized discontinuous Galerkin (HDG) method combines advantages of continuous and discontinuous finite element discretizations. In this paper, we investigate the performance of the macro-element HDG method for the analysis of compressible flow problems at moderate Reynolds numbers. To efficiently handle the resulting large systems of equations, we explore several strategies at the solver level. On the one hand, we devise a second-layer static condensation approach that reduces the size of the local system matrix in each macro-element and hence the factorization time of the local solver. On the other hand, we employ a multi-level preconditioner based on the FGMRES solver for the global system that integrates well with a matrix-free implementation. In addition, we adopt a standard diagonally implicit Runge-Kutta scheme for time integration. We test the matrix-free macro-element HDG method on compressible flow benchmarks, including Couette flow, flow past a sphere, and the Taylor-Green vortex. Our results show that, unlike standard HDG, the macro-element HDG method can operate efficiently at moderate polynomial degrees, since the local computational load can be flexibly increased via mesh refinement within a macro-element. Our results also show that, owing to the balance of local and global operations, the reduction in degrees of freedom, and the smaller global problem size and iteration counts, the macro-element HDG method can be a competitive option for the analysis of compressible flow problems.
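The idea behind static condensation, of which the second-layer variant is a macro-element refinement, can be sketched in a few lines of generic linear algebra. The dense numpy illustration below is a toy under stated assumptions, not the paper's implementation: real HDG codes exploit the block-sparse element structure and factorize once per macro-element.

```python
import numpy as np

def static_condensation(K, f, interior, boundary):
    """Eliminate interior unknowns from K @ u = f via a Schur complement.

    Generic dense illustration of static condensation: solve a smaller
    system for the boundary (trace) unknowns, then recover the interior
    unknowns by purely local back-substitution.
    """
    Kii = K[np.ix_(interior, interior)]
    Kib = K[np.ix_(interior, boundary)]
    Kbi = K[np.ix_(boundary, interior)]
    Kbb = K[np.ix_(boundary, boundary)]

    Kii_inv_Kib = np.linalg.solve(Kii, Kib)
    Kii_inv_fi  = np.linalg.solve(Kii, f[interior])

    S = Kbb - Kbi @ Kii_inv_Kib            # condensed (Schur complement) matrix
    g = f[boundary] - Kbi @ Kii_inv_fi     # condensed right-hand side

    ub = np.linalg.solve(S, g)             # solve the smaller boundary system
    ui = Kii_inv_fi - Kii_inv_Kib @ ub     # recover interior unknowns locally
    return ui, ub
```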
In this article, we aim to obtain the Fisher-Riemann geodesics for nonparametric families of probability densities as a weak limit of the parametric case as the number of parameters increases.
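For reference, the underlying Riemannian structure is the Fisher information metric, which on a parametric family $\{p_\theta\}$ reads
\[
g_{ij}(\theta) \;=\; \mathbb{E}_{x\sim p_\theta}\!\left[\frac{\partial \log p_\theta(x)}{\partial\theta^i}\,\frac{\partial \log p_\theta(x)}{\partial\theta^j}\right],
\]
and the geodesics in question are the length-minimizing curves of this metric; in the nonparametric limit, the finite parameter vector $\theta$ is replaced by the density itself.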
Continual learning algorithms strive to acquire new knowledge while preserving prior information. Often, these algorithms emphasise stability and restrict network updates upon learning new tasks. In many cases, such restrictions come at a cost to the model's plasticity, i.e. the model's ability to adapt to the requirements of a new task. But is all change detrimental? Here, we approach this question by proposing that activation spaces in neural networks can be decomposed into two subspaces: a readout range, in which change affects prior tasks, and a null space, in which change does not alter prior performance. Based on experiments with this novel technique, we show that, indeed, not all activation change is associated with forgetting. Instead, only change in the subspace visible to the readout of a task can lead to decreased stability, while restricting change outside of this subspace is associated only with a loss of plasticity. Analysing various commonly used algorithms, we show that regularisation-based techniques do not fully disentangle the two spaces and, as a result, restrict plasticity more than necessary. We expand our results by investigating a linear model in which we can manipulate learning in the two subspaces directly and thus causally link activation changes to stability and plasticity. For hierarchical, nonlinear cases, we present an approximation that enables us to estimate functionally relevant subspaces at every layer of a deep nonlinear network, corroborating our previous insights. Together, this work provides novel means to derive insights into the mechanisms behind stability and plasticity in continual learning and may serve as a diagnostic tool to guide developments of future continual learning algorithms that stabilise inference while allowing maximal space for learning.
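In the linear case, the proposed decomposition can be sketched directly: the readout range is the row space of a prior task's readout matrix, and the null space is its orthogonal complement. The following numpy sketch (with hypothetical names, not the authors' code) illustrates the split and checks that null-space change leaves the prior readout untouched.

```python
import numpy as np

def split_activation_change(W_out, h_old, h_new):
    """Split an activation change into readout-range and null-space parts.

    W_out: readout weight matrix of a prior task, shape (n_outputs, n_hidden).
    h_old, h_new: hidden activations before/after learning, shape (n_hidden,).
    The paper's estimator for deep nonlinear networks approximates this idea
    layer by layer; here we show only the exact linear version.
    """
    delta = h_new - h_old
    # Orthonormal basis of the readout's row space via the SVD.
    _, s, Vt = np.linalg.svd(W_out, full_matrices=False)
    V = Vt[s > 1e-12 * s.max()]          # directions the readout can see
    visible = V.T @ (V @ delta)          # component that moves prior outputs
    hidden  = delta - visible            # null-space component: outputs unchanged
    return visible, hidden

# Sanity check: the null-space part leaves the prior readout's output untouched.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 10))
h0, h1 = rng.normal(size=10), rng.normal(size=10)
vis, nul = split_activation_change(W, h0, h1)
assert np.allclose(W @ nul, 0.0, atol=1e-10)
```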
The current study investigates the asymptotic spectral properties of a finite difference approximation of nonlocal Helmholtz equations with a Caputo fractional Laplacian and a variable-coefficient wave number $\mu$, as occurs when considering wave propagation in complex media characterized by nonlocal interactions and spatially varying wave speeds. More specifically, using tools from Toeplitz and generalized locally Toeplitz theory, we carry out a spectral analysis of the nonpreconditioned and preconditioned matrix-sequences. We report numerical evidence supporting the theoretical findings. Finally, open problems and potential extensions in various directions are presented and briefly discussed.
We introduce QUICK, a suite of novel optimized CUDA kernels for the efficient inference of quantized Large Language Models (LLMs). QUICK addresses the shared-memory bank-conflict problem of state-of-the-art mixed-precision matrix multiplication kernels. Our method interleaves the quantized weight matrices of LLMs offline so as to skip the shared-memory write-back after dequantization. We demonstrate up to 1.91x speedup over the existing kernels of AutoAWQ on larger batches and up to 1.94x throughput gain on representative LLMs across various NVIDIA GPU devices.
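As a rough illustration of offline interleaving, the toy numpy function below permutes a quantized weight matrix so that elements consumed together by the GEMM inner loop sit contiguously; the actual pattern used by QUICK is dictated by the tensor-core (mma) fragment layout and is not reproduced here.

```python
import numpy as np

def interleave_offline(W_q, tile=8):
    """Toy illustration of offline weight interleaving.

    Reorders a quantized weight matrix ahead of time so the kernel can
    dequantize directly into registers in the order the fragments are
    consumed, avoiding a shared-memory write-back. The permutation below
    is generic and invertible; QUICK's real layout differs.
    """
    rows, cols = W_q.shape
    assert cols % tile == 0
    return (W_q.reshape(rows, cols // tile, tile)
               .transpose(0, 2, 1)
               .reshape(rows, cols))
```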
We introduce Optimistix: a nonlinear optimisation library built in JAX and Equinox. Optimistix introduces a novel, modular approach for its minimisers and least-squares solvers. This modularity relies on new practical abstractions for optimisation, which we call search and descent, and which generalise classical notions of line search, trust-region, and learning-rate algorithms. It provides high-level APIs and solvers for minimisation, nonlinear least-squares, root-finding, and fixed-point iteration. Optimistix is available at https://github.com/patrick-kidger/optimistix.
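A minimal usage sketch of the high-level API, based on the library's documented interface (exact signatures may vary between versions):

```python
import jax.numpy as jnp
import optimistix as optx

# Minimise a simple convex objective; objective functions take (y, args).
def fn(y, args):
    return jnp.sum((y - jnp.array([1.0, 2.0])) ** 2)

solver = optx.BFGS(rtol=1e-8, atol=1e-8)
sol = optx.minimise(fn, solver, jnp.array([0.0, 0.0]))
print(sol.value)  # -> approximately [1. 2.]
```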
The broad class of multivariate unified skew-normal (SUN) distributions has recently been shown to possess fundamental conjugacy properties. When used as priors for the vector of parameters in general probit, tobit, and multinomial probit models, these distributions yield posteriors that still belong to the SUN family. Although this core result has led to important advancements in Bayesian inference and computation, its applicability beyond likelihoods associated with fully-observed, discretized, or censored realizations from multivariate Gaussian models remains unexplored. This article fills this important gap by proving that the wider family of multivariate unified skew-elliptical (SUE) distributions, which extends SUNs to more general perturbations of elliptical densities, guarantees conjugacy for broader classes of models beyond those relying on fully-observed, discretized, or censored Gaussians. This result leverages the closure of the SUE family under linear combinations, conditioning, and marginalization to prove that this family is conjugate to the likelihood induced by general multivariate regression models for fully-observed, censored, or dichotomized realizations from skew-elliptical distributions. This advancement substantially enlarges the set of models that enable conjugate Bayesian inference to general formulations arising from elliptical and skew-elliptical families, including the multivariate Student's t and skew-t, among others.
Sequential Bayesian filtering aims to estimate the current state distribution of a hidden Markov model given the past observations. The problem is well known to be intractable for most application domains, except in notable cases such as the tabular setting or linear dynamical systems with Gaussian noise. In this work, we propose a new class of filters based on Gaussian PSD models, which offer several advantages in terms of density approximation and computational efficiency. We show that filtering can be efficiently performed in closed form when the transitions and observations are Gaussian PSD models. When the transitions and observations are approximated by Gaussian PSD models, we show that our proposed estimator enjoys strong theoretical guarantees, with an estimation error that depends on the quality of the approximation and is adaptive to the regularity of the transition probabilities. In particular, we identify regimes in which our proposed filter attains an $\epsilon$-error in total variation with memory and computational complexity of $O(\epsilon^{-1})$ and $O(\epsilon^{-3/2})$, respectively, including the offline learning step, in contrast to the $O(\epsilon^{-2})$ complexity of sampling methods such as particle filtering.
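For context, the exact recursion that these models make tractable is the standard two-step Bayes filter, alternating prediction through the transition kernel and correction by the observation likelihood:
\[
p(x_t \mid y_{1:t-1}) = \int p(x_t \mid x_{t-1})\,p(x_{t-1} \mid y_{1:t-1})\,\mathrm{d}x_{t-1},
\qquad
p(x_t \mid y_{1:t}) \propto p(y_t \mid x_t)\,p(x_t \mid y_{1:t-1}).
\]
When the transition and observation densities are Gaussian PSD models, both steps admit the closed-form evaluation referred to above.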
We give a fully polynomial-time randomized approximation scheme (FPRAS) for two-terminal reliability in directed acyclic graphs (DAGs). In contrast, we also show that the complementary problem of approximating two-terminal unreliability in DAGs is #BIS-hard.
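Here, for a graph $G=(V,E)$ with source $s$ and target $t$, in which each edge fails independently with a given probability, two-terminal reliability is the quantity
\[
\mathrm{Rel}_{s,t}(G) \;=\; \Pr\bigl[\text{the surviving edges contain an $s$-$t$ path}\bigr],
\]
and two-terminal unreliability is $1-\mathrm{Rel}_{s,t}(G)$.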