We consider the Virtual Element method (VEM) introduced by Beirão da Veiga, Lovadina and Vacca in 2016 for the numerical solution of the steady, incompressible Navier-Stokes equations; the method has arbitrary order $k \geq 2$ and guarantees divergence-free velocities. For this discretization, we develop a residual-based a posteriori error estimator that combines standard terms from VEM analysis (residual terms, data oscillation, and VEM stabilization) with additional terms originating from the VEM discretization of the nonlinear convective term. We show that a linear combination of the velocity and pressure errors is bounded above by a multiple of the estimator (reliability). We also establish efficiency results in the form of lower bounds on the error. Numerical tests illustrate the behavior of the estimator and of its components under uniform mesh refinement, yielding the expected decay rates. Finally, we apply an adaptive mesh refinement strategy to the computation of the low-Reynolds-number flow around a square cylinder inside a channel.
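For orientation, residual-based estimators in VEM analysis typically have the schematic structure

\[
\eta^2 = \sum_{E \in \mathcal{T}_h} \left( \eta_{R,E}^2 + \eta_{J,E}^2 + \eta_{S,E}^2 + \operatorname{osc}_E^2 \right),
\]

where, on each element $E$ of the mesh $\mathcal{T}_h$, the four contributions collect the internal residual, the inter-element flux jumps, the VEM stabilization term, and the data oscillation; the estimator studied here additionally carries contributions from the discrete convective term. This is a generic template for such estimators, not the paper's exact definition.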
For agents in multi-agent systems (MAS) to be safe, they need to take into account the risks posed by the actions of other agents. However, the dominant paradigm in game theory (GT) assumes that agents are not affected by the risk from other agents and strive only to maximise their expected utility. For example, in hybrid human-AI driving systems it is necessary to limit large deviations in reward resulting from car crashes. Although there are equilibrium concepts in game theory that take risk aversion into account, they either assume that agents are risk-neutral with respect to the uncertainty caused by the actions of other agents, or they are not guaranteed to exist. We introduce a new GT-based Risk-Averse Equilibrium (RAE) that always yields a solution minimising the potential variance in reward, accounting for the strategies of the other agents. Theoretically and empirically, we show that RAE shares many properties with a Nash Equilibrium (NE), establishing convergence properties and generalising to risk-dominant NE in certain cases. To tackle large-scale problems, we extend RAE to the PSRO multi-agent reinforcement learning (MARL) framework. We empirically demonstrate the minimum-reward-variance benefits of RAE in matrix games with high-risk outcomes. Results of MARL experiments show that RAE generalises to a risk-dominant NE in a trust-dilemma game and that it reduces instances of crashing by 7x in an autonomous driving setting compared with the best-performing baseline.
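As a minimal illustration (not the paper's RAE algorithm) of why payoff variance matters alongside expectation, the following Python sketch compares two actions in a matrix game against a fixed opponent mixed strategy; all payoffs and names are hypothetical.

```python
import numpy as np

# Illustrative sketch: given a payoff matrix and a fixed opponent mixed
# strategy, compare row actions by the variance of their payoff rather
# than by expected payoff alone.
payoffs = np.array([[4.0, 4.0],     # "safe" action: same payoff either way
                    [10.0, -1.0]])  # "risky" action: higher mean, high variance
opponent = np.array([0.5, 0.5])     # opponent's mixed strategy

mean = payoffs @ opponent
var = (payoffs**2 @ opponent) - mean**2

print("expected payoff per action:", mean)  # [4.0, 4.5]
print("payoff variance per action:", var)   # [0.0, 30.25]
# A risk-averse agent prefers action 0 despite its lower mean; a purely
# expectation-maximising agent would compare only `mean`.
```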
In recent years, empirical Bayes (EB) inference has become an attractive approach to estimation in parametric models arising in a variety of real-life problems, especially in complex and high-dimensional scientific applications. However, compared with the relative abundance of general methods for computing point estimators in the EB framework, the construction of confidence sets and hypothesis tests with good theoretical properties remains difficult and problem-specific. Motivated by the universal inference framework of Wasserman et al. (2020), we propose a general and universal method, based on holdout likelihood ratios and utilizing the hierarchical structure of the specified Bayesian model, for constructing confidence sets and hypothesis tests that are finite-sample valid. We illustrate our method through a range of numerical studies and real data applications, which demonstrate that the approach generates useful and meaningful inferential statements in the relevant contexts.
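For context, the split likelihood-ratio construction of Wasserman et al. (2020) on which the method builds proceeds as follows: split the data into $D_0$ and $D_1$, compute any estimator $\hat{\theta}_1$ from $D_1$ alone, and form

\[
T(\theta) = \frac{\mathcal{L}_{D_0}(\hat{\theta}_1)}{\mathcal{L}_{D_0}(\theta)}, \qquad C_\alpha = \left\{ \theta : T(\theta) \le 1/\alpha \right\},
\]

where $\mathcal{L}_{D_0}$ denotes the likelihood evaluated on $D_0$. Since $\mathbb{E}[T(\theta^*)] \le 1$ at the true $\theta^*$, Markov's inequality gives $\mathbb{P}(\theta^* \notin C_\alpha) \le \alpha$ in finite samples, without regularity conditions.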
We adopt an information-theoretic framework to analyze the generalization behavior of the class of iterative, noisy learning algorithms. This class is particularly suitable for study under information-theoretic metrics, as the algorithms are inherently randomized, and it includes commonly used algorithms such as Stochastic Gradient Langevin Dynamics (SGLD). Herein, we use the maximal leakage (equivalently, the Sibson mutual information of order infinity) metric, as it is simple to analyze and it implies bounds both on the probability of a large generalization error and on its expected value. We show that, if the update function (e.g., the gradient) is bounded in $L_2$-norm, then adding isotropic Gaussian noise leads to optimal generalization bounds: indeed, the input and output of the learning algorithm are then asymptotically statistically independent. Furthermore, we demonstrate how the assumptions on the update function affect the optimal (in the sense of minimizing the induced maximal leakage) choice of the noise. Finally, we compute explicit tight upper bounds on the induced maximal leakage for several scenarios of interest.
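A minimal sketch of one update from this algorithm class, with all names and parameter values hypothetical: the update direction is clipped to satisfy the $L_2$-norm bound, and isotropic Gaussian noise is added, in the spirit of SGLD-style schemes.

```python
import numpy as np

def noisy_update(w, grad_fn, batch, lr=0.01, clip=1.0, sigma=0.1, rng=np.random):
    """One step of an iterative, noisy learner (a hedged sketch of the
    algorithm class analysed here): bounded update plus isotropic
    Gaussian noise."""
    g = grad_fn(w, batch)
    norm = np.linalg.norm(g)
    if norm > clip:                               # enforce the L2-norm bound
        g = g * (clip / norm)
    noise = sigma * rng.standard_normal(w.shape)  # isotropic Gaussian noise
    return w - lr * g + noise
```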
The power prior is a popular class of informative priors for incorporating information from historical data. It involves raising the likelihood of the historical data to a power, which acts as a discounting parameter. When the discounting parameter is modeled as random, the normalized power prior is recommended. In this work, we prove that for generalized linear models the marginal posterior of the discounting parameter converges to a point mass at zero if there is any discrepancy between the historical and current data, and that it does not converge to a point mass at one when they are fully compatible. In addition, we explore the construction of optimal priors for the discounting parameter in a normalized power prior. In particular, we are interested in achieving the dual objectives of encouraging borrowing when the historical and current data are compatible and limiting borrowing when they are in conflict. We propose intuitive procedures for eliciting the shape parameters of a beta prior for the discounting parameter based on two minimization criteria: the Kullback-Leibler divergence and the mean squared error. The optimal priors derived from these criteria are often quite different from commonly used priors such as the uniform prior.
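For reference, the (conditional) power prior raises the historical-data likelihood $L(\theta \mid D_0)$ to a power $a_0 \in [0, 1]$,

\[
\pi(\theta \mid D_0, a_0) \propto L(\theta \mid D_0)^{a_0} \, \pi_0(\theta),
\]

while the normalized power prior, used when $a_0$ is random, divides by the normalizing constant before placing a prior on $a_0$:

\[
\pi(\theta, a_0 \mid D_0) = \frac{L(\theta \mid D_0)^{a_0} \, \pi_0(\theta)}{\int L(\theta \mid D_0)^{a_0} \, \pi_0(\theta) \, d\theta} \; \pi(a_0).
\]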
In topology optimization of fluid-dependent problems, one needs to interpolate continuously between fluid and solid within the design domain. In density-based methods, the concept of inverse permeability, in the form of a volumetric force, is used to enforce zero fluid velocity in non-fluid regions. This volumetric force consists of a scalar term multiplied by the fluid velocity, where the scalar takes a value between two limits as determined by a convex interpolation function. The maximum inverse permeability limit is typically chosen by a trial-and-error analysis of the initial form of the optimization problem, such that the resolved fields resemble those obtained from an analysis of a pure fluid domain with a body-fitted mesh. In this work, we investigate the dependence of the maximum inverse permeability limit on the mesh size and the flow conditions by analyzing the Navier-Stokes equations in both their strong and discretized finite element forms. We use numerical experiments to verify and characterize these dependencies.
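As a concrete example of such an interpolation (one widely used convex choice, of Borrvall-Petersson type, not necessarily the exact function used here), the Brinkman-style volumetric force reads

\[
\mathbf{f} = -\alpha(\gamma)\, \mathbf{u}, \qquad \alpha(\gamma) = \alpha_{\max} + \left( \alpha_{\min} - \alpha_{\max} \right) \gamma \, \frac{1+q}{\gamma+q},
\]

where $\gamma \in [0,1]$ is the design variable ($\gamma = 1$ fluid, $\gamma = 0$ solid), $q > 0$ controls the convexity of the interpolation, and $\alpha_{\max}$ is precisely the maximum inverse permeability limit whose mesh and flow dependence is studied here.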
This paper provides a mathematical analysis of an elementary fully discrete finite difference method applied to the inhomogeneous (non-constant density and viscosity) incompressible Navier-Stokes system on a bounded domain. The proposed method combines a version of the Lax-Friedrichs explicit scheme for the transport equation with a version of Ladyzhenskaya's implicit scheme for the Navier-Stokes equations. Under the condition that the initial density profile is strictly bounded away from $0$, the scheme is proven to converge strongly (up to a subsequence) to a weak solution on an arbitrary time interval, which also yields a proof of existence of weak solutions to the system. The analysis includes a new Aubin-Lions-Simon-type compactness result with an interpolation inequality between strong norms of the velocity and a weak norm of the product of the density and the velocity.
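For illustration (a hedged one-dimensional sketch with periodic boundaries, not the paper's full scheme), one Lax-Friedrichs step for the transport equation $\rho_t + (u\rho)_x = 0$ can be written as:

```python
import numpy as np

def lax_friedrichs_step(rho, u, dx, dt):
    """One Lax-Friedrichs step for rho_t + (u * rho)_x = 0 on a periodic
    1D grid: average the neighbours, then apply the centred flux
    difference. Stability requires dt <= dx / max|u| (CFL)."""
    f = u * rho                                    # flux u * rho
    rho_new = 0.5 * (np.roll(rho, 1) + np.roll(rho, -1)) \
        - dt / (2.0 * dx) * (np.roll(f, -1) - np.roll(f, 1))
    return rho_new
```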
Kinetic equations model the position-velocity distribution of particles subject to transport and collision effects. Under a diffusive scaling, these combined effects converge to a diffusion equation for the position density in the limit of an infinite collision rate. Despite this well-defined limit, numerical simulation is expensive when the collision rate is high but finite, as small time steps are then required. In this work, we present an asymptotic-preserving multilevel Monte Carlo particle scheme that exploits this diffusive limit to accelerate computations. We first sample the limiting diffusive model, using large time steps, to compute a biased initial estimate of a quantity of interest. We then perform a limited number of finer simulations with transport and collision dynamics to correct the bias. The efficiency of the multilevel method hinges on the ability to perform correlated simulations of particles on a hierarchy of discretization levels. We present a method for correlating particle trajectories and support it with both analysis and numerical experiments. We demonstrate that our approach significantly reduces the cost of particle simulations in high-collisional regimes compared with prior work, indicating significant potential for adopting these schemes in various areas of active research.
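The scheme rests on the standard multilevel Monte Carlo identity

\[
\mathbb{E}[F_L] = \mathbb{E}[F_0] + \sum_{\ell=1}^{L} \mathbb{E}[F_\ell - F_{\ell-1}],
\]

where, in this setting, level $0$ is the cheap diffusive-limit simulation with large time steps and the finer levels resolve the transport-collision dynamics. Each correction term is estimated from correlated coarse/fine particle pairs, so that its variance, and hence the number of expensive fine-level samples required, stays small.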
When the gradient descent method is implemented in low precision, the use of stochastic rounding schemes helps to prevent stagnation of convergence caused by the vanishing gradient effect. Unbiased stochastic rounding achieves zero bias by preserving small updates with probabilities proportional to their relative magnitudes. This study provides a theoretical explanation for the stagnation of the gradient descent method in low-precision computation. Additionally, we propose two new stochastic rounding schemes that trade the zero-bias property for a larger probability of preserving small gradients. Our methods yield a constant rounding bias that, on average, lies in a descent direction. For convex problems, we prove that the proposed rounding methods typically have a beneficial effect on the convergence rate of gradient descent. We validate our theoretical analysis by comparing the performance of various rounding schemes when optimizing a multinomial logistic regression model and when training a simple neural network in an 8-bit floating-point format.
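A minimal fixed-point sketch of the unbiased scheme (the floating-point case applies the same idea within each exponent range); the function name and bit width are hypothetical:

```python
import numpy as np

def stochastic_round(x, num_bits=3, rng=np.random):
    """Unbiased stochastic rounding of x to a grid with spacing
    2**-num_bits: round up with probability equal to the relative
    distance from the lower grid point, so that E[round(x)] = x and
    small updates survive with nonzero probability."""
    scale = 2.0 ** num_bits
    y = x * scale
    lo = np.floor(y)
    p_up = y - lo                           # in [0, 1): distance to lower point
    up = rng.random(np.shape(x)) < p_up     # round up with probability p_up
    return (lo + up) / scale
```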
The time-dependent Maxwell's equations govern electromagnetic phenomena. Under certain conditions, these equations can be rewritten as a second-order partial differential equation, in this case the vectorial wave equation. For the vectorial wave equation, we investigate the numerical treatment and the challenges of its implementation. For this purpose, we consider a space-time variational setting, i.e. time is treated as just another spatial dimension. More specifically, we apply integration by parts in time as well as in space, leading to a space-time variational formulation with different trial and test spaces. Conforming discretizations of tensor-product type result in a Galerkin--Petrov finite element method that requires a CFL condition for stability. For this Galerkin--Petrov variational formulation, we study the CFL condition and its sharpness. To overcome the CFL condition, we use a Hilbert-type transformation that leads to a variational formulation with equal trial and test spaces. Conforming space-time discretizations then result in a new Galerkin--Bubnov finite element method that is unconditionally stable. In numerical examples, we demonstrate the effectiveness of this Galerkin--Bubnov finite element method. Furthermore, we investigate different projections of the right-hand side and their influence on the convergence rates. This paper is a first step towards a more stable computation and a better understanding of vectorial wave equations in a conforming space-time approach.
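For context, the rewriting alluded to is standard: eliminating the magnetic field from $\varepsilon\, \partial_t \mathbf{E} = \nabla \times \mathbf{H} - \mathbf{j}$ and $\mu\, \partial_t \mathbf{H} = -\nabla \times \mathbf{E}$ yields the vectorial wave equation

\[
\varepsilon\, \partial_t^2 \mathbf{E} + \nabla \times \left( \mu^{-1} \nabla \times \mathbf{E} \right) = -\partial_t \mathbf{j}.
\]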
The conjoining of dynamical systems and deep learning has become a topic of great interest. In particular, neural differential equations (NDEs) demonstrate that neural networks and differential equations are two sides of the same coin. Traditional parameterised differential equations are a special case, and many popular neural network architectures, such as residual networks and recurrent networks, are discretisations of differential equations. NDEs are suitable for tackling generative problems, dynamical systems, and time series (particularly in physics, finance, ...) and are thus of interest to both modern machine learning and traditional mathematical modelling. NDEs offer high-capacity function approximation, strong priors on model space, the ability to handle irregular data, memory efficiency, and a wealth of available theory on both sides. This doctoral thesis provides an in-depth survey of the field. Topics include: neural ordinary differential equations (e.g. for hybrid neural/mechanistic modelling of physical systems); neural controlled differential equations (e.g. for learning functions of irregular time series); and neural stochastic differential equations (e.g. to produce generative models capable of representing complex stochastic dynamics, or sampling from complex high-dimensional distributions). Further topics include: numerical methods for NDEs (e.g. reversible differential equation solvers, backpropagation through differential equations, Brownian reconstruction); symbolic regression for dynamical systems (e.g. via regularised evolution); and deep implicit models (e.g. deep equilibrium models, differentiable optimisation). We anticipate this thesis will be of interest to anyone interested in the marriage of deep learning with dynamical systems, and we hope it will provide a useful reference for the current state of the art.
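A minimal sketch of the discretisation claim: a residual block computes $x_{n+1} = x_n + h\, f_\theta(x_n)$, which is exactly one explicit Euler step for the ODE $\dot{x} = f_\theta(x)$. Here `f` is a toy stand-in for a learned vector field:

```python
import numpy as np

def f(x):
    return np.tanh(x)          # placeholder for a neural network layer

def residual_block(x, h=0.1):
    return x + h * f(x)        # ResNet update == one explicit Euler step

x = np.array([1.0, -2.0])
for _ in range(10):            # 10 stacked residual blocks ~ integrating
    x = residual_block(x)      # the ODE from t=0 to t=1 with step 0.1
print(x)
```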