This paper examines functional equivariance, recently introduced by McLachlan and Stern [Found. Comput. Math. (2022)], from the perspective of backward error analysis. We characterize the evolution of certain classes of observables (especially affine and quadratic observables) under structure-preserving numerical integrators in terms of the integrators' modified vector fields. Several results on invariant preservation and symplecticity of modified vector fields are thereby generalized to describe the numerical evolution of non-invariant observables.
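For context, the backward-error-analysis setting referred to above can be summarized as follows (a standard reminder with illustrative notation, not the paper's): a one-step integrator with step size $h$ applied to $\dot{y} = f(y)$ is interpreted as the exact flow of a modified vector field
\[
  \dot{y} = \tilde{f}_h(y), \qquad \tilde{f}_h(y) = f(y) + h f_2(y) + h^2 f_3(y) + \cdots,
\]
so that structural statements about the integrator (symplecticity, invariant preservation) become statements about $\tilde{f}_h$, and the evolution of an observable $g$ along the numerical solution is approximated by $\frac{d}{dt} g = \nabla g \cdot \tilde{f}_h$.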
We describe an efficient method for the approximation of functions using radial basis functions (RBFs), and extend this to a solver for boundary value problems on irregular domains. The method is based on RBFs with centers on a regular grid defined on a bounding box, with some of the centers lying outside the computational domain. The equation is discretized using collocation with oversampling, with collocation points inside the domain only, resulting in a rectangular linear system to be solved in a least squares sense. The goal of this paper is the efficient solution of that rectangular system. We show that the least squares problem splits into a regular part, which can be expedited with the FFT, and a low-rank perturbation, which is treated separately with a direct solver. The rank of the perturbation is influenced by the irregular shape of the domain and by the weak enforcement of boundary conditions at points along the boundary. The solver extends the AZ algorithm, which was previously proposed for function approximation involving frames and other overcomplete sets. The solver has near-optimal log-linear complexity for univariate problems; it loses optimality for higher-dimensional problems but remains faster than a direct solver.
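For reference, the three generic steps of the AZ algorithm can be sketched as follows (a minimal dense numpy illustration under the usual assumption that $(I - AZ^*)A$ has low rank; the paper's efficiency comes from applying $A$ and $Z^*$ with the FFT and from a low-rank solver in step 1, neither of which this sketch includes):

import numpy as np

# Generic AZ algorithm sketch (Coppe & Huybrechs), not the paper's FFT-accelerated solver:
# given A and a matrix Z such that (I - A Z^*) A has low rank, solve A x ~ b in the
# least squares sense in three steps.
def az_solve(A, Z, b):
    AZ = A @ Z.conj().T
    # Step 1: solve the low-rank least squares problem (I - A Z^*) A x1 = (I - A Z^*) b
    A1 = A - AZ @ A
    b1 = b - AZ @ b
    x1, *_ = np.linalg.lstsq(A1, b1, rcond=None)
    # Step 2: correct with the known approximate (generalized) inverse Z^*
    x2 = Z.conj().T @ (b - A @ x1)
    # Step 3: combine the two parts of the solution
    return x1 + x2

Here np.linalg.lstsq merely stands in for a solver of the low-rank system in step 1; in the setting above it is the rank of that system which is governed by the irregular domain boundary and the boundary conditions.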
We develop a new continuous-time stochastic gradient descent method for optimizing over the stationary distribution of stochastic differential equation (SDE) models. The algorithm continuously updates the SDE model's parameters using an estimate of the gradient of the stationary distribution. The gradient estimate is simultaneously updated by forward propagation of the SDE state derivatives, asymptotically converging to the direction of steepest descent. We rigorously prove convergence of the online forward propagation algorithm for linear SDE models (i.e., the multi-dimensional Ornstein-Uhlenbeck process) and present numerical results for nonlinear examples. The proof requires analysis of the fluctuations of the parameter evolution around the direction of steepest descent. Bounds on the fluctuations are challenging to obtain due to the online nature of the algorithm (e.g., the stationary distribution changes continuously as the parameters change). We prove bounds for the solutions of a new class of Poisson partial differential equations (PDEs), which are then used to analyze the parameter fluctuations in the algorithm. Our algorithm is applicable to a range of mathematical finance applications involving statistical calibration of SDE models and stochastic optimal control over long time horizons, where ergodicity of the data and of the stochastic process is a suitable modeling framework. Numerical examples explore these potential applications, including learning a neural network control for high-dimensional optimal control of SDEs and training stochastic point process models of limit order book events.
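To make the forward-propagation idea concrete, here is a minimal one-dimensional sketch under illustrative assumptions (an Ornstein-Uhlenbeck model $dX = (\theta - X)\,dt + \sigma\,dW$, a quadratic stationary loss, and ad hoc step sizes); it is not the paper's algorithm for general SDEs:

import numpy as np

# Minimal 1-D sketch of online forward propagation (illustrative only).
# Model: dX = (theta - X) dt + sigma dW, whose stationary law is N(theta, sigma^2/2).
# Goal: minimize the stationary loss E[(X - x_target)^2] over theta.
# The tangent process x_tilde = dX/dtheta is simulated alongside X, and theta is
# updated online with the instantaneous pathwise gradient 2*(X - x_target)*x_tilde.
rng = np.random.default_rng(0)
dt, sigma, x_target = 1e-3, 1.0, 2.0   # illustrative choices
theta, lr = 0.0, 0.05                  # initial parameter and learning rate
x, x_tilde = 0.0, 0.0                  # state and its sensitivity dX/dtheta

for _ in range(1_000_000):
    dW = rng.normal(scale=np.sqrt(dt))
    x_tilde += (1.0 - x_tilde) * dt            # forward sensitivity equation
    x += (theta - x) * dt + sigma * dW         # Euler-Maruyama step for the SDE
    grad_est = 2.0 * (x - x_target) * x_tilde  # instantaneous gradient estimate
    theta -= lr * dt * grad_est                # continuous-time SGD update

print(theta)  # approaches x_target = 2.0 as the stationary mean is calibrated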
We propose a novel Hadamard integrator for the self-adjoint time-dependent wave equation in an inhomogeneous medium. First, we create a new asymptotic series based on the Gelfand-Shilov function, dubbed Hadamard's ansatz, to approximate the Green's function of the time-dependent wave equation. Second, incorporating the leading term of Hadamard's ansatz into the Kirchhoff-Huygens representation, we develop an original Hadamard integrator for the Cauchy problem of the time-dependent wave equation and derive the corresponding Lagrangian formulation in geodesic polar coordinates. Third, to construct the Hadamard integrator efficiently in the Lagrangian formulation, we use a short-time ray tracing method to obtain wavefront locations accurately, and we further develop fast algorithms to compute Chebyshev-polynomial-based low-rank representations of both the wavefront locations and variants of the Hadamard coefficients. Fourth, equipped with these low-rank representations, we apply the Hadamard integrator to efficiently solve time-dependent wave equations with highly oscillatory initial conditions, where the time step size is independent of the initial conditions. By judiciously choosing the medium-dependent time step, the new Hadamard integrator can propagate wave fields beyond caustics implicitly and advance spatially overturning waves naturally in time. Moreover, since the integrator is independent of the initial conditions, it can be applied to many different initial conditions once it is constructed. Both two-dimensional and three-dimensional numerical examples illustrate the accuracy and performance of the proposed method.
Machine learning techniques, in particular the so-called normalizing flows, are becoming increasingly popular in the context of Monte Carlo simulations, as they can effectively approximate target probability distributions. In the case of lattice field theories (LFT), the target distribution is given by the exponential of the action. The common gradient estimator for the loss function, based on the "reparametrization trick", requires the calculation of the derivative of the action with respect to the fields. This can present a significant computational cost for complicated, non-local actions, such as the fermionic action in QCD. In this contribution, we propose a gradient estimator for normalizing flows based on the REINFORCE algorithm that avoids this issue. We apply it to the two-dimensional Schwinger model with Wilson fermions at criticality and show that it is up to ten times faster in terms of wall-clock time and requires up to $30\%$ less memory than the reparametrization-trick estimator. It is also more numerically stable, allowing for single-precision calculations and the use of half-float tensor cores. We present an in-depth analysis of the origins of those improvements. We believe that these benefits will also appear outside the realm of LFT, in any case where the target probability distribution is computationally intensive.
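A toy sketch of the two estimators for the reverse Kullback-Leibler loss used to train such flows, with a Gaussian model standing in for the normalizing flow and a quadratic stand-in for the action (both illustrative assumptions, not the Schwinger-model setup above; in an actual flow, log q of a sample is obtained from the flow's change-of-variables formula):

import torch

# Toy contrast of the two gradient estimators for the reverse KL loss
# E_{x~q}[log q(x) + S(x)], with target p(x) proportional to exp(-S(x)).
mu = torch.zeros(1, requires_grad=True)
log_sigma = torch.zeros(1, requires_grad=True)

def action(x):                                   # toy "action"; REINFORCE only needs its values
    return 0.5 * ((x - 1.0) ** 2).sum(dim=-1)

def log_q(x):                                    # model log-density (change of variables in a real flow)
    return torch.distributions.Normal(mu, log_sigma.exp()).log_prob(x).sum(dim=-1)

n = 1024
# (a) reparametrization-trick estimator: gradients flow through x, so dS/dx is required
x_rt = mu + log_sigma.exp() * torch.randn(n, 1)
loss_rt = (log_q(x_rt) + action(x_rt)).mean()
loss_rt.backward()

# (b) REINFORCE (score-function) estimator: the sample is detached and S enters only as a
#     scalar weight on the score, so the action is never differentiated
mu.grad = None; log_sigma.grad = None
with torch.no_grad():
    x_rf = mu + log_sigma.exp() * torch.randn(n, 1)
lq = log_q(x_rf)
with torch.no_grad():
    signal = lq + action(x_rf)
    signal = signal - signal.mean()              # simple baseline for variance reduction
loss_rf = (signal * lq).mean()
loss_rf.backward()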
This paper proposes a hierarchy of numerical fluxes for the compressible flow equations which preserve kinetic energy and pressure equilibrium and are asymptotically entropy-conservative, i.e., they are able to arbitrarily reduce the numerical error in entropy production due to the spatial discretization. The fluxes are based on the use of the harmonic mean for the internal energy and only use algebraic operations, making them less computationally expensive than the entropy-conserving fluxes based on the logarithmic mean. The use of the geometric mean is also explored and found to be well-suited to reducing errors in the entropy evolution. Results of numerical tests confirm the theoretical predictions, and the entropy-conserving capabilities of a selection of schemes are compared.
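For reference, the three averages mentioned above can be written as small numpy helpers (illustrative only; the stabilized series for the logarithmic mean follows the standard Ismail-Roe expansion, and it is exactly this extra branching and logarithm that the purely algebraic harmonic and geometric means avoid):

import numpy as np

# Averages of two positive states a, b (e.g., densities or internal energies).
def logarithmic_mean(a, b, eps=1e-4):
    f = (a - b) / (a + b)
    u = f * f
    series = 1.0 + u / 3.0 + u * u / 5.0 + u ** 3 / 7.0        # truncated series near a == b
    log_ratio = np.where(u < eps, 1.0, np.log(a / b))          # dummy value where the series branch is used
    return np.where(u < eps, (a + b) / (2.0 * series), (a - b) / log_ratio)

def harmonic_mean(a, b):
    return 2.0 * a * b / (a + b)   # algebraic: no logarithm, no branching

def geometric_mean(a, b):
    return np.sqrt(a * b)          # algebraic alternative explored for the entropy evolution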
We review common situations in Bayesian latent variable models where the prior distribution that a researcher specifies differs from the prior distribution used during estimation. These situations can arise from the positive-definiteness requirement on correlation matrices, from sign indeterminacy of factor loadings, and from order constraints on threshold parameters. The issue is especially problematic for reproducibility and for model checks that involve prior distributions, including prior predictive assessment and Bayes factors. In these cases, one might be assessing the wrong model, casting doubt on the relevance of the results. The most straightforward solution to the issue sometimes involves the use of informative prior distributions. We explore other solutions and make recommendations for practice.
Threshold tolerance graphs and their complement graphs (known as co-TT graphs) were introduced by Monma, Reed and Trotter [24]. Introducing the concept of negative intervals, Hell et al. [19] defined signed-interval bigraphs/digraphs and showed that they are equivalent to several seemingly different classes of bigraphs/digraphs. They also showed that co-TT graphs are equivalent to symmetric signed-interval digraphs. In this paper we characterize signed-interval bigraphs and signed-interval graphs in terms of their biadjacency matrices and adjacency matrices, respectively. Finally, based on the geometric representation of signed-interval graphs, we settle the open problem of the forbidden induced subgraph characterization of co-TT graphs posed by Monma, Reed and Trotter in the same paper.
We study the power of randomness in the Number-on-Forehead (NOF) model in communication complexity. We construct an explicit 3-player function $f:[N]^3 \to \{0,1\}$ such that: (i) there exists a randomized NOF protocol computing it that sends a constant number of bits; but (ii) any deterministic or nondeterministic NOF protocol computing it requires sending about $(\log N)^{1/3}$ bits. This exponentially improves upon the previously best-known such separation. At the core of our proof is an extension of a recent result of the first and third authors on sets of integers without 3-term arithmetic progressions to a non-arithmetic setting.
The goal of explainable Artificial Intelligence (XAI) is to generate human-interpretable explanations, but there are no computationally precise theories of how humans interpret AI-generated explanations. The lack of theory means that validation of XAI must be done empirically, on a case-by-case basis, which prevents systematic theory-building in XAI. We propose a psychological theory of how humans draw conclusions from saliency maps, the most common form of XAI explanation, which for the first time allows for precise prediction of explainee inference conditioned on explanation. Our theory posits that, absent an explanation, humans expect the AI to make decisions similar to their own, and that they interpret an explanation by comparing it to the explanations they themselves would give. Comparison is formalized via Shepard's universal law of generalization in a similarity space, a classic theory from cognitive science. A pre-registered user study on AI image classifications with saliency map explanations demonstrates that our theory quantitatively matches participants' predictions of the AI.
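A minimal sketch of the comparison mechanism, under illustrative assumptions (saliency maps flattened to vectors, a Euclidean distance, and a free decay parameter k; the actual similarity space and fitted parameters in the study may differ):

import numpy as np

# Shepard's universal law of generalization: perceived similarity decays exponentially
# with distance in a psychological similarity space. Here the distance is taken between
# the explainee's own saliency map and the AI's saliency map (an assumption of this sketch).
def shepard_similarity(human_saliency, ai_saliency, k=1.0):
    d = np.linalg.norm(np.ravel(human_saliency) - np.ravel(ai_saliency))
    return np.exp(-k * d)   # in (0, 1]; higher values mean the explainee expects the AI to decide as they would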
We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.