This work introduces a reduced order modeling (ROM) framework for the solution of parameterized second-order linear elliptic partial differential equations formulated on unfitted geometries. The goal is to construct efficient projection-based ROMs, which rely on techniques such as the reduced basis method and discrete empirical interpolation. The presence of geometrical parameters in unfitted domain discretizations entails challenges for the application of standard ROMs. Therefore, in this work we propose a methodology based on i) extension of snapshots onto the background mesh and ii) localization strategies to decrease the number of reduced basis functions. The resulting method is computationally efficient and accurate, and it is agnostic with respect to the underlying discretization choice. We test the applicability of the proposed framework with numerical experiments on two model problems, namely the Poisson and linear elasticity problems. In particular, we study several benchmarks formulated on two-dimensional, trimmed domains discretized with splines and observe a significant reduction of the online computational cost compared to standard ROMs for the same level of accuracy. Moreover, we show the applicability of our methodology to a linear elasticity problem posed on a three-dimensional geometry.
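As a schematic illustration of the projection step underlying such ROMs (a generic reduced basis sketch, not the specific unfitted formulation developed here): given a basis matrix $V \in \mathbb{R}^{N_h \times n}$, $n \ll N_h$, assembled from (extended) snapshots, the full-order system $A_h(\boldsymbol{\mu})\, u_h(\boldsymbol{\mu}) = f_h(\boldsymbol{\mu})$ is replaced by the $n$-dimensional problem
\[
V^\top A_h(\boldsymbol{\mu})\, V\, u_n(\boldsymbol{\mu}) \;=\; V^\top f_h(\boldsymbol{\mu}),
\qquad u_h(\boldsymbol{\mu}) \approx V\, u_n(\boldsymbol{\mu}),
\]
with (discrete) empirical interpolation used so that the parameter-dependent reduced operators can be assembled at a cost independent of $N_h$.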
We propose derivative-informed neural operators (DINOs), a general family of neural networks to approximate operators as infinite-dimensional mappings from input function spaces to output function spaces or quantities of interest. After discretization, both inputs and outputs are high-dimensional. We aim to approximate not only the operators with improved accuracy but also their derivatives (Jacobians) with respect to the input function-valued parameter, in order to empower derivative-based algorithms in many applications, e.g., Bayesian inverse problems, optimization under parameter uncertainty, and optimal experimental design. The major difficulties include the computational cost of generating derivative training data and the high dimensionality of the problem, which leads to large training costs. To address these challenges, we exploit the intrinsic low dimensionality of the derivatives and develop algorithms for compressing derivative information and efficiently imposing it in neural operator training, yielding derivative-informed neural operators. We demonstrate that these advances can significantly reduce the costs of both data generation and training for large classes of problems (e.g., nonlinear steady-state parametric PDE maps), making the costs marginal or comparable to the costs without using derivatives, and in particular independent of the discretization dimension of the input and output functions. Moreover, we show that the proposed DINO achieves significantly higher accuracy than neural operators trained without derivative information, for both function approximation and derivative approximation (e.g., Gauss-Newton Hessian), especially when the training data are limited.
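A schematic form of a derivative-informed training objective (notation is illustrative and not necessarily the authors' exact formulation): for a neural operator $u_\theta$ approximating a parameter-to-output map $m \mapsto u(m)$, the standard regression loss is augmented with a Jacobian misfit,
\[
\min_\theta \; \frac{1}{N}\sum_{i=1}^{N} \Big( \big\| u_\theta(m_i) - u(m_i) \big\|^2 \;+\; \lambda\, \big\| \nabla_m u_\theta(m_i) - \nabla_m u(m_i) \big\|_F^2 \Big),
\]
with the Jacobians compressed (e.g., projected onto low-dimensional input/output bases) so that the added data-generation and training cost is independent of the discretization dimension.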
An implicit Euler finite-volume scheme for a nonlocal cross-diffusion system on the one-dimensional torus, arising in population dynamics, is proposed and analyzed. The kernels are assumed to be in detailed balance and satisfy a weak cross-diffusion condition. The latter condition allows for negative off-diagonal coefficients and for kernels defined by an indicator function. The scheme preserves the nonnegativity of the densities, conservation of mass, and production of the Boltzmann and Rao entropies. The key idea is to ``translate'' the entropy calculations for the continuous equations to the finite-volume scheme, in particular to design discretizations of the mobilities, which guarantee a discrete chain rule even in the presence of nonlocal terms. Based on this idea, the existence of finite-volume solutions and the convergence of the scheme are proven. As a by-product, we deduce the existence of weak solutions to the continuous cross-diffusion system. Finally, we present some numerical experiments illustrating the behavior of the solutions to the nonlocal and associated local models.
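The design principle of translating the continuous entropy structure to the discrete level can be sketched, for a single species and in schematic notation (not the authors' exact scheme), as an implicit Euler finite-volume update
\[
\frac{u_i^{k+1} - u_i^{k}}{\Delta t} \;+\; \frac{\mathcal{F}_{i+1/2}^{k+1} - \mathcal{F}_{i-1/2}^{k+1}}{\Delta x} \;=\; 0,
\qquad
\mathcal{F}_{i+1/2} \;=\; -\, M_{i+1/2}\, \frac{w_{i+1} - w_i}{\Delta x},
\]
where $w$ denotes the discrete entropy variable and the mobility average $M_{i+1/2}$ is chosen so that multiplying the scheme by $w_i^{k+1}$ and summing over cells reproduces a discrete analogue of the continuous entropy (chain-rule) estimate, also in the presence of nonlocal terms.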
Hyperspectral imaging has the potential to improve intraoperative decision making if tissue characterisation is performed in real time and at high resolution. Hyperspectral snapshot mosaic sensors offer a promising approach due to their fast acquisition speed and compact size. However, a demosaicking algorithm is required to fully recover the spatial and spectral information of the snapshot images. Most state-of-the-art demosaicking algorithms require ground-truth training data with paired snapshot and high-resolution hyperspectral images, but such imagery pairs with the exact same scene are physically impossible to acquire in intraoperative settings. In this work, we present a fully unsupervised hyperspectral image demosaicking algorithm which only requires exemplar snapshot images for training purposes. We regard hyperspectral demosaicking as an ill-posed linear inverse problem which we solve using a deep neural network. We take advantage of the spectral correlation occurring in natural scenes to design a novel inter-spectral-band regularisation term based on spatial gradient consistency. By combining our proposed term with standard regularisation techniques and exploiting a standard data fidelity term, we obtain an unsupervised loss function for training deep neural networks, which allows us to achieve real-time hyperspectral image demosaicking. Quantitative results on hyperspectral image datasets show that our unsupervised demosaicking approach can achieve similar performance to its supervised counterpart, and significantly outperform linear demosaicking. A qualitative user study on real snapshot hyperspectral surgical images confirms the results from the quantitative analysis. Our results suggest that the proposed unsupervised algorithm can achieve promising hyperspectral demosaicking in real time, thus advancing the suitability of the modality for intraoperative use.
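One possible instantiation of such an unsupervised objective is sketched below in PyTorch (the tensor names, the use of the band-wise mean gradient as the consistency reference, and the weighting factor lam are assumptions for illustration; the exact regulariser and data fidelity term of the paper may differ):
\begin{verbatim}
import torch

def gradient_consistency(cube):
    # cube: (B, C, H, W) reconstructed hyperspectral cube
    gx = cube[..., :, 1:] - cube[..., :, :-1]   # horizontal finite differences
    gy = cube[..., 1:, :] - cube[..., :-1, :]   # vertical finite differences
    # penalise deviation of each band's spatial gradients from the band-wise mean
    gx_ref = gx.mean(dim=1, keepdim=True)
    gy_ref = gy.mean(dim=1, keepdim=True)
    return ((gx - gx_ref) ** 2).mean() + ((gy - gy_ref) ** 2).mean()

def unsupervised_loss(net, snapshot, mosaic_op, lam=0.1):
    # mosaic_op: known linear operator mapping a full cube to the snapshot mosaic
    recon = net(snapshot)
    fidelity = ((mosaic_op(recon) - snapshot) ** 2).mean()
    return fidelity + lam * gradient_consistency(recon)
\end{verbatim}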
This paper introduces a new extragradient-type algorithm for a class of nonconvex-nonconcave minimax problems. It is well known that finding a local solution for general minimax problems is computationally intractable. This observation has recently motivated the study of structures sufficient for convergence of first-order methods in the more general setting of variational inequalities when the so-called weak Minty variational inequality (MVI) holds. This problem class captures non-trivial structures, as we demonstrate with examples on which a large family of existing algorithms provably converges to limit cycles. Our results require a less restrictive parameter range in the weak MVI than previously known, thus extending the applicability of our scheme. The proposed algorithm is applicable to constrained and regularized problems, and involves an adaptive stepsize allowing for potentially larger stepsizes. Our scheme also converges globally even in settings where the underlying operator exhibits limit cycles.
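For context (a sketch of the classical template, of which the proposed method is a constrained, adaptively stepped variant), the extragradient iteration for an operator $F$ reads
\[
\bar z_k \;=\; z_k - \gamma\, F(z_k),
\qquad
z_{k+1} \;=\; z_k - \alpha_k\, F(\bar z_k),
\]
while the weak MVI assumption requires the existence of a point $z_\star$ with $\langle F(z),\, z - z_\star \rangle \ge -\rho\, \| F(z) \|^2$ for all $z$ and some $\rho > 0$ (constant conventions vary across the literature).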
Differential machine learning (DML) is a recently proposed technique that uses samplewise state derivatives to regularize least-squares fits that learn conditional expectations of functionals of stochastic processes as functions of state variables. Exploiting the derivative information allows the same level of precision to be reached with fewer samples than a vanilla ML approach. This paper extends the methodology to parametric problems where the processes and functionals also depend on model and contract parameters, respectively. In addition, we propose adaptive parameter sampling to improve relative accuracy when the functionals have different magnitudes for different parameter sets. For calibration, we construct pricing surrogates for calibration instruments and optimize over them globally. We discuss strategies for robust calibration. We demonstrate the usefulness of our methodology on one-factor Cheyette models with a benchmark-rate volatility specification augmented by an extra stochastic volatility factor, applied to (two-curve) caplet prices at different strikes and maturities, first for parametric pricing, and then by calibrating to a given caplet volatility surface. To allow convenient and efficient simulation of processes and functionals, and in particular the corresponding computation of samplewise derivatives, we propose to specify the processes and functionals in a low-code way close to mathematical notation, which is then used to generate efficient computation of the functionals and their derivatives in TensorFlow.
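Schematically (illustrative notation, not the paper's exact objective), the differential regularisation augments the pathwise least-squares fit with a penalty on the samplewise derivatives:
\[
\min_\theta \; \sum_{i} \Big( \big( f_\theta(x_i, p_i) - y_i \big)^2 \;+\; \lambda\, \big\| \partial_x f_\theta(x_i, p_i) - \partial_x y_i \big\|^2 \Big),
\]
where $x_i$ are simulated states, $p_i$ model and contract parameters, $y_i$ pathwise payoffs, and $\partial_x y_i$ their samplewise derivatives with respect to the state.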
Despite being highly over-parametrized and having the ability to fully interpolate the training data, deep networks are known to generalize well to unseen data. It is now understood that part of the reason for this is that the training algorithms used have certain implicit regularization properties that ensure interpolating solutions with "good" properties are found. This is best understood in linear over-parametrized models, where it has been shown that the celebrated stochastic gradient descent (SGD) algorithm finds an interpolating solution that is closest in Euclidean distance to the initial weight vector. Different regularizers, replacing Euclidean distance with Bregman divergence, can be obtained if we replace SGD with stochastic mirror descent (SMD). Empirical observations have shown that in the deep network setting, SMD achieves a generalization performance that is different from that of SGD (and which depends on the choice of SMD's potential function). In an attempt to begin to understand this behavior, we obtain the generalization error of SMD for over-parametrized linear models for a binary classification problem where the two classes are drawn from a Gaussian mixture model. We present simulation results that validate the theory and, in particular, introduce two data models, one for which SMD with an $\ell_2$ regularizer (i.e., SGD) outperforms SMD with an $\ell_1$ regularizer, and one for which the reverse happens.
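For reference (standard notation, not specific to this paper), the stochastic mirror descent update with potential $\psi$ reads
\[
\nabla \psi(w_{t+1}) \;=\; \nabla \psi(w_t) \;-\; \eta\, \nabla L_{i_t}(w_t),
\]
which reduces to SGD for $\psi(w) = \tfrac{1}{2}\|w\|_2^2$; in the over-parametrized linear setting, the interpolating solution found by SMD is the one closest to the initialization in the Bregman divergence of $\psi$, which is how different potentials act as different implicit regularizers.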
In this paper we develop finite difference schemes for elliptic problems with piecewise continuous coefficients that have (possibly huge) jumps across fixed internal interfaces. In contrast with problems involving a single smooth non-intersecting interface, which have been extensively studied, there are very few papers addressing elliptic interface problems with intersecting interfaces of coefficient jumps. It is well known that if the values of the permeability in the four subregions around a point of intersection of two such internal interfaces are all different, the solution has a point singularity that significantly affects the accuracy of the approximation in the vicinity of the intersection point. In the present paper we propose a fourth-order 9-point finite difference scheme on uniform Cartesian meshes for an elliptic problem whose coefficient is piecewise constant in four rectangular subdomains of the overall two-dimensional rectangular domain. Moreover, for the special case when the intersection point of the two lines of coefficient jumps is a grid point, such a compact scheme, involving relatively simple formulas for the computation of the stencil coefficients, can even reach sixth order of accuracy. Furthermore, we show that the resulting linear system for the special case has an M-matrix, and prove the theoretical sixth-order convergence rate using the discrete maximum principle. Our numerical experiments demonstrate the fourth (for the general case) and sixth (for the special case) accuracy orders of the proposed schemes. In the general case, we derive a compact third-order finite difference scheme, also yielding a linear system with an M-matrix. In addition, using the discrete maximum principle, we prove the third-order convergence rate of the scheme for the general elliptic cross-interface problem.
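For orientation (the generic template only; the actual stencil coefficients depend on the local permeability configuration and are derived in the paper), a compact 9-point scheme at an interior grid point $(x_i, y_j)$ with mesh size $h$ has the form
\[
\sum_{k=-1}^{1} \sum_{l=-1}^{1} c_{k,l}\, u_h(x_i + k h,\, y_j + l h)
\;=\;
\sum_{k=-1}^{1} \sum_{l=-1}^{1} d_{k,l}\, f(x_i + k h,\, y_j + l h),
\]
with the coefficients $c_{k,l}$ and $d_{k,l}$ chosen, subregion by subregion, to attain the stated consistency orders while preserving the M-matrix structure.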
Forward gradient learning computes a noisy directional gradient and is a biologically plausible alternative to backprop for learning deep neural networks. However, the standard forward gradient algorithm, when applied naively, suffers from high variance when the number of parameters to be learned is large. In this paper, we propose a series of architectural and algorithmic modifications that together make forward gradient learning practical for standard deep learning benchmark tasks. We show that it is possible to substantially reduce the variance of the forward gradient estimator by applying perturbations to activations rather than weights. We further improve the scalability of forward gradient by introducing a large number of local greedy loss functions, each of which involves only a small number of learnable parameters, and a new MLPMixer-inspired architecture, LocalMixer, that is more suitable for local learning. Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
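A minimal sketch of the basic forward-gradient estimator in JAX is given below (function and variable names are assumptions for illustration; it shows the weight-space estimator only and does not reproduce the paper's activation-perturbed, locally supervised variant):
\begin{verbatim}
import jax
import jax.numpy as jnp

def forward_gradient(loss_fn, params, batch, key):
    """Unbiased forward-gradient estimate of d(loss)/d(params).

    Samples a random tangent v, computes the directional derivative
    <grad, v> with a single forward-mode JVP, and returns <grad, v> * v.
    """
    leaves, treedef = jax.tree_util.tree_flatten(params)
    keys = jax.random.split(key, len(leaves))
    v = jax.tree_util.tree_unflatten(
        treedef, [jax.random.normal(k, p.shape) for k, p in zip(keys, leaves)])
    loss, dir_deriv = jax.jvp(lambda p: loss_fn(p, batch), (params,), (v,))
    grad_estimate = jax.tree_util.tree_map(lambda vi: dir_deriv * vi, v)
    return loss, grad_estimate
\end{verbatim}
Perturbing activations instead of weights, together with many local losses, shrinks the dimension of each perturbed space, which is the intuition behind the variance reduction described above.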
A common approach to approximating Gaussian log-likelihoods at scale exploits the fact that precision matrices can be well-approximated by sparse matrices in some circumstances. This strategy is motivated by the \emph{screening effect}, which refers to the phenomenon in which the linear prediction of a process $Z$ at a point $\mathbf{x}_0$ depends primarily on measurements nearest to $\mathbf{x}_0$. But simple perturbations, such as i.i.d. measurement noise, can significantly reduce the degree to which this exploitable phenomenon occurs. While strategies to cope with this issue already exist and are certainly improvements over ignoring the problem, in this work we present a new one based on the EM algorithm that offers several advantages. Although we focus here on the application to Vecchia's approximation (1988), a particularly popular and powerful framework in which we can demonstrate true second-order optimization in the M step, the method can also be applied using entirely matrix-vector products, making it applicable to a very wide class of precision matrix-based approximation methods.
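For context (the standard form, not the specific noise-aware variant developed here), Vecchia's approximation replaces the joint Gaussian log-likelihood with a product of low-dimensional conditionals,
\[
\log p(\mathbf{z}) \;\approx\; \sum_{i=1}^{n} \log p\big(z_i \mid \mathbf{z}_{c(i)}\big),
\]
where $c(i)$ indexes a small conditioning set (e.g., nearest previously ordered neighbours), inducing a sparse Cholesky factor of the implied precision matrix; when the data are $z_i = y_i + \varepsilon_i$ with i.i.d. noise $\varepsilon_i$, an EM approach can treat the noise-free field $\mathbf{y}$ as the missing data and apply the sparse approximation to $\mathbf{y}$ rather than to $\mathbf{z}$.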
Knowledge graph embedding (KGE) is an increasingly popular technique that aims to represent entities and relations of knowledge graphs in low-dimensional semantic spaces for a wide spectrum of applications such as link prediction, knowledge reasoning and knowledge completion. In this paper, we provide a systematic review of existing KGE techniques based on representation spaces. In particular, we build a fine-grained classification that categorises the models according to three mathematical perspectives on the representation spaces: (1) the algebraic perspective, (2) the geometric perspective, and (3) the analytical perspective. We introduce rigorous definitions of the fundamental mathematical spaces before diving into KGE models and their mathematical properties. We further discuss different KGE methods across the three categories, and summarise how spatial advantages serve different embedding needs. By collating experimental results from downstream tasks, we also explore the advantages of different mathematical spaces in different scenarios and the reasons behind them. Finally, we outline some promising research directions from a representation-space perspective, with which we hope to inspire researchers to design KGE models and related applications with greater consideration of the properties of the underlying mathematical spaces.
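As a simple illustration of how the choice of representation space shapes a model (a well-known example, not one introduced in this survey), the translational model TransE embeds entities and relations in Euclidean space and scores a triple $(h, r, t)$ by
\[
s(h, r, t) \;=\; -\,\big\| \mathbf{h} + \mathbf{r} - \mathbf{t} \big\|,
\]
so that relations act as translations; models in complex, hyperbolic or other spaces replace this additive structure with, for instance, rotations or operations better adapted to hierarchical data.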