In this work, we establish the Freidlin--Wentzell large deviations principle (LDP) of the stochastic Cahn--Hilliard equation with small noise, which implies the one-point LDP. Further, we give the one-point LDP of the spatial finite difference method (FDM) for the stochastic Cahn--Hilliard equation. Our main result is the convergence of the one-point large deviations rate function (LDRF) of the spatial FDM, which concerns the asymptotic limit of a parametric variational problem. The main idea for proving the convergence of the LDRF of the spatial FDM is to use the $\Gamma$-convergence of objective functions, which relies on the qualitative analysis of the skeleton equations of the original equation and of the numerical method. To overcome the difficulty that the drift coefficient is not one-sided Lipschitz, we use an equivalent characterization of the skeleton equation of the spatial FDM and a discrete interpolation inequality to obtain the uniform boundedness of the solution to the underlying skeleton equation. This plays an important role in deriving the $\Gamma$-convergence of the objective functions.
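Schematically, in generic Freidlin--Wentzell notation (not taken verbatim from the paper), the objects involved are
\[
I(u) \;=\; \tfrac{1}{2}\,\inf\bigl\{\, \|h\|_{L^2}^2 \;:\; u = X^h \,\bigr\},
\qquad
I_T(y) \;=\; \inf\bigl\{\, I(u) \;:\; u(T) = y \,\bigr\},
\]
where $X^h$ solves the skeleton equation driven by a control $h$ and $I_T$ is the one-point rate function. Convergence of the one-point LDRF of the spatial FDM then amounts to showing that the discrete analogues of these constrained minimization problems $\Gamma$-converge, so that, together with equi-coercivity, their minimum values converge to those of the continuous problem.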
Two combined numerical methods for solving time-varying semilinear differential-algebraic equations (DAEs) are obtained, and their convergence and correctness are proved. The methods are constructed using time-varying spectral projectors that can be computed numerically, which makes it possible to solve the DAE numerically in its original form, without additional analytical transformations. To improve the accuracy of the second method, a recalculation step is used. The developed methods are applicable to DAEs whose continuous nonlinear part may be non-differentiable in time, and the presented theorems on global solvability of the DAE and on convergence of the methods do not impose restrictions of global Lipschitz type; this extends the scope of the methods. Fulfillment of the conditions of the global solvability theorem ensures the existence of a unique exact solution on any given time interval, so an approximate solution can likewise be sought on any time interval. Numerical examples illustrating the capabilities of the methods and their effectiveness in various situations are provided; in particular, mathematical models of the dynamics of electrical circuits are considered, and the results of the theoretical and numerical analyses of these models are shown to be consistent.
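For orientation, such a time-varying semilinear DAE can be written generically (notation mine) as
\[
A(t)\,\dot x(t) + B(t)\,x(t) = f(t, x(t)), \qquad t \ge t_0,
\]
with a singular matrix $A(t)$. The time-varying spectral projectors mentioned above, associated with the pencil $\lambda A(t) + B(t)$, split $x$ into a differential component that is propagated by time stepping and an algebraic component that is recovered from the constraints at each step.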
Empirical neural tangent kernels (eNTKs) can provide a good understanding of a given network's representation: they are often far less expensive to compute, and applicable more broadly, than infinite-width NTKs. For networks with $O$ output units (e.g., an $O$-class classifier), however, the eNTK on $N$ inputs is of size $NO \times NO$, taking $O((NO)^2)$ memory and up to $O((NO)^3)$ computation. Most existing applications have therefore used one of a handful of approximations yielding $N \times N$ kernel matrices, saving orders of magnitude of computation, but with limited to no justification. We prove that one such approximation, which we call "sum of logits", converges to the true eNTK at initialization for any network with a wide final "readout" layer. Our experiments demonstrate the quality of this approximation across a range of settings and uses.
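To make the object sizes concrete, here is a minimal JAX sketch of the full eNTK versus an $N \times N$ "sum of logits" style surrogate; the toy network, the names, and the $1/O$ normalization are illustrative assumptions, not the paper's exact definitions.

```python
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

# Tiny MLP with O outputs; stands in for any classifier network.
def net(params, x):
    W1, b1, W2, b2 = params
    h = jnp.tanh(W1 @ x + b1)
    return W2 @ h + b2  # shape (O,)

key = jax.random.PRNGKey(0)
k1, k2, kx = jax.random.split(key, 3)
D, H, O, N = 5, 64, 10, 20
params = (jax.random.normal(k1, (H, D)) / jnp.sqrt(D), jnp.zeros(H),
          jax.random.normal(k2, (O, H)) / jnp.sqrt(H), jnp.zeros(O))
X = jax.random.normal(kx, (N, D))

theta0, unravel = ravel_pytree(params)
f = lambda theta, x: net(unravel(theta), x)

# Full eNTK: (NO x NO) Gram matrix of per-logit parameter Jacobians.
J = jax.vmap(lambda x: jax.jacobian(f)(theta0, x))(X)      # (N, O, P)
full_entk = J.reshape(N * O, -1) @ J.reshape(N * O, -1).T  # (NO, NO)

# "Sum of logits" surrogate: one gradient per input -> (N x N) kernel.
g = jax.vmap(lambda x: jax.grad(lambda t: f(t, x).sum())(theta0))(X)  # (N, P)
sum_logits_k = g @ g.T / O  # the 1/O normalization is my assumption
```

In this toy setting one can compare `sum_logits_k`, suitably scaled, against the diagonal blocks of `full_entk` to see numerically the kind of agreement the wide-readout theorem describes.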
Many real-world tasks involve some form of parameter estimation, i.e., the determination of a parameter encoded in a probability distribution. Often, such probability distributions arise from stochastic processes. For a stationary stochastic process with temporal correlations, the random variables that constitute it are identically distributed but not independent; this is the case, for instance, in quantum continuous measurements. In this paper we prove two fundamental results concerning the estimation of parameters encoded in a memoryful stochastic process. First, we show that for processes with finite Markov order, the Fisher information is always asymptotically linear in the number of outcomes and is determined by the conditional distribution associated with the process's Markov order. Second, we show by suitable examples that correlations do not necessarily enhance metrological precision: unlike for entropic information quantities, in the presence of correlations nothing can be said in general about the sub- or super-additivity of the joint Fisher information. We discuss how the type of correlations in the process affects the scaling, and we then apply these results to thermometry on a spin chain.
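Schematically, the first result says that for a stationary process with finite Markov order $m$, the Fisher information $F_n(\theta)$ of $n$ consecutive outcomes about a parameter $\theta$ grows as
\[
F_n(\theta) \;\sim\; n\, F_{\mathrm{cond}}(\theta),
\qquad
F_{\mathrm{cond}}(\theta)
= \mathbb{E}\!\left[\bigl(\partial_\theta \log p_\theta(X_{m+1}\mid X_1,\dots,X_m)\bigr)^{2}\right],
\]
i.e., the asymptotic rate is fixed by the conditional distribution of one outcome given the previous $m$ (this is a generic paraphrase; the precise regularity conditions are those stated in the paper).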
We consider the problem of continuous-time policy evaluation, which consists of learning, from observations, the value function associated with an uncontrolled continuous-time stochastic dynamics and a reward function. We propose two original variants of the well-known TD(0) method using vanishing time steps: one model-free and one model-based. For both methods, we prove theoretical convergence rates, which we then verify through numerical simulations. Alternatively, these methods can be interpreted as novel reinforcement learning approaches for approximating solutions of linear partial differential equations (PDEs) or linear backward stochastic differential equations (BSDEs).
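As a concrete illustration of the ingredients of a model-free variant, here is a minimal sketch assuming Ornstein--Uhlenbeck dynamics, a quadratic reward, and a linear function basis (all my choices, not the paper's examples).

```python
import jax
import jax.numpy as jnp

beta, sigma, dt, lr = 1.0, 0.5, 1e-2, 5e-2
phi = lambda x: jnp.array([1.0, x, x * x])  # linear function basis
w = jnp.zeros(3)                            # V(x) approximated by w @ phi(x)

key = jax.random.PRNGKey(0)
x = jnp.array(0.0)
for _ in range(20_000):
    key, sub = jax.random.split(key)
    # Euler--Maruyama step of the OU dynamics dX = -X dt + sigma dW.
    x_next = x - x * dt + sigma * jnp.sqrt(dt) * jax.random.normal(sub)
    # TD(0) error: reward r(x) = x^2 integrated over dt, discount e^{-beta dt}.
    delta = x * x * dt + jnp.exp(-beta * dt) * (w @ phi(x_next)) - w @ phi(x)
    w = w + lr * delta * phi(x)
    x = x_next

# For this OU process the value function is known in closed form,
# V(x) = x^2/(beta+2) + (sigma^2/2)(1/beta - 1/(beta+2)),
# so the estimate w @ phi(x) can be checked directly as dt -> 0.
```

Step-size conventions for continuous-time TD vary; the point of the sketch is only the structure of the update with a vanishing time step.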
We introduce a new class of spatially stochastic, physics- and data-informed deep latent models for parametric partial differential equations (PDEs), which operate through scalable variational neural processes. We achieve this by assigning probability measures to the spatial domain, which allows us to treat collocation grids probabilistically, as random variables to be marginalised out. Adopting this spatial-statistics view, we solve forward and inverse problems for parametric PDEs in a way that leads to Gaussian process models of the solution fields. The implementation of these random grids poses a unique set of challenges for inverse physics-informed deep learning frameworks, and we propose a new architecture, Grid Invariant Convolutional Networks (GICNets), to overcome them. We further show how to incorporate noisy data in a principled manner into our physics-informed model, improving predictions for problems where data are available but their measurement locations do not coincide with any fixed mesh or grid. The proposed method is tested on a nonlinear Poisson problem, the Burgers equation, and the Navier--Stokes equations, and we provide extensive numerical comparisons. We demonstrate significant computational advantages over current physics-informed neural learning methods for parametric PDEs, while improving the predictive capabilities and flexibility of these models.
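A minimal sketch of the random-grid mechanic described above, with a placeholder surrogate and a 1D Poisson residual standing in for the actual GICNet architecture (all names and the specific PDE here are illustrative assumptions).

```python
import jax
import jax.numpy as jnp

# Stand-in for a neural solution field u(params, x).
def u(params, x):
    w, b = params
    return jnp.tanh(w * x + b).sum()

# Residual of a 1D Poisson problem u'' = -sin(pi x), as a placeholder physics term.
residual = lambda p, x: jax.grad(jax.grad(u, 1), 1)(p, x) + jnp.sin(jnp.pi * x)

def loss(params, key, n_pts=64):
    xs = jax.random.uniform(key, (n_pts,))  # one random collocation grid draw
    return jnp.mean(jax.vmap(lambda x: residual(params, x) ** 2)(xs))

key = jax.random.PRNGKey(0)
params = (jnp.ones(8), jnp.zeros(8))
# Marginalising over grids = averaging the physics loss over fresh grid draws.
keys = jax.random.split(key, 16)
avg_loss = jnp.mean(jnp.array([loss(params, k) for k in keys]))
```

The point of the sketch is only that the collocation grid itself is sampled from a measure on the domain and averaged out, rather than fixed once.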
The numerical evaluation of statistics plays a crucial role in statistical physics and its applied fields. Statistics for a stochastic differential equation with Gaussian white noise can be evaluated via the corresponding backward Kolmogorov equation. Importantly, there is no need to obtain the solution of the backward Kolmogorov equation on the whole domain; it suffices to evaluate the solution at a single point, the one corresponding to the initial coordinate of the stochastic differential equation. For this purpose, an algorithm based on combinatorics has recently been developed. In this paper, we discuss higher-order approximations of the resolvent and propose an algorithm based on a second-order approximation, which indeed exhibits second-order convergence. Furthermore, the convergence behavior of the naive algorithms naturally leads to extrapolation methods, which yield more accurate values at lower computational cost. The proposed method is demonstrated on the Ornstein--Uhlenbeck process and the noisy van der Pol system.
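Two standard identities behind such constructions (stated here in generic form, not as the paper's specific scheme) are the resolvent-type approximations of the semigroup $e^{hL}$ generated by the Kolmogorov operator $L$, and the Richardson-type extrapolation step:
\[
e^{hL} = (I - hL)^{-1} + O(h^2), \qquad
e^{hL} = \bigl(I - \tfrac{h}{2}L\bigr)^{-1}\bigl(I + \tfrac{h}{2}L\bigr) + O(h^3),
\]
and if an algorithm returns $A(h) = A + c_1 h + c_2 h^2 + \cdots$, then $2A(h/2) - A(h) = A + O(h^2)$: two runs of a first-order method produce a second-order estimate at little extra cost.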
Modern SAT solvers are designed to handle problems expressed in Conjunctive Normal Form (CNF), so non-CNF problems must be CNF-ized upfront, typically using variants of either the Tseitin or the Plaisted--Greenbaum transformation. When passing from solving to enumeration, however, the capability of producing partial satisfying assignments that are as small as possible becomes crucial, which raises the question of whether such CNF encodings are also effective for enumeration. In this paper, we investigate, both theoretically and empirically, the effectiveness of CNF conversions for SAT enumeration. On the negative side, we show that: (i) the Tseitin transformation prevents the solver from producing short partial assignments, thus seriously affecting the effectiveness of enumeration; (ii) the Plaisted--Greenbaum transformation overcomes this problem only in part. On the positive side, we show that combining the Plaisted--Greenbaum transformation with NNF preprocessing upfront -- which is typically not used in solving -- fully overcomes the problem, and can drastically reduce both the number of partial assignments and the execution time.
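A small example (mine, not from the paper) illustrates the gap. For $\varphi = (a \wedge b) \vee c$, introduce definition variables $x_1 \leftrightarrow (a \wedge b)$ and $x_2 \leftrightarrow (x_1 \vee c)$. Tseitin keeps both directions of each definition:
\[
(\lnot x_1 \vee a)\,(\lnot x_1 \vee b)\,(x_1 \vee \lnot a \vee \lnot b)\,
(\lnot x_2 \vee x_1 \vee c)\,(x_2 \vee \lnot x_1)\,(x_2 \vee \lnot c)\,(x_2),
\]
whereas Plaisted--Greenbaum, for this all-positive formula, keeps only the left-to-right implications $(\lnot x_1 \vee a)(\lnot x_1 \vee b)(\lnot x_2 \vee x_1 \vee c)(x_2)$. Under the latter, the partial assignment $\{x_2, c, \lnot x_1\}$ already satisfies every clause, covering all four models with $c$ true in one shot; under Tseitin, the clause $(x_1 \vee \lnot a \vee \lnot b)$ additionally forces $a$ or $b$ to be assigned, fragmenting the same set of models into smaller pieces.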
In this paper, we consider the enumeration of geodesics on a polyhedron, where a geodesic is a locally shortest path between two points. In particular, we consider the following preprocessing problem: given a point $s$ on a polyhedral surface and a positive real number $r$, build a data structure that enables, for any query point $t$ on the surface, the enumeration of all geodesics from $s$ to $t$ of length less than $r$. First, we present a naive algorithm obtained by removing the trimming process from the MMP algorithm (1987). Next, we present an improved algorithm that is, in practice, more efficient on non-convex polyhedra in terms of preprocessing time and memory consumption. Moreover, we introduce a single-pair geodesic graph that succinctly encodes the result of a geodesic query. Lastly, we compare the naive and improved algorithms through computer experiments.
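The basic geometric fact exploited by such algorithms (standard in this area, stated in my notation) is that unfolding characterizes geodesics:
\[
\gamma \text{ is a geodesic} \iff U_{e_1,\dots,e_k}(\gamma) \text{ is a straight segment in } \mathbb{R}^2,
\]
where $\gamma$ crosses the edge sequence $e_1, \dots, e_k$ and $U_{e_1,\dots,e_k}$ isometrically unfolds the traversed faces into the plane. Enumerating geodesics from $s$ to $t$ of length less than $r$ thus reduces to enumerating edge sequences whose unfoldings admit such a segment of length less than $r$, which is, roughly, the information a preprocessing structure for geodesic queries must encode.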
Stochastic approximation with multiple coupled sequences (MSA) has found broad applications in machine learning, as it encompasses a rich class of problems including bilevel optimization (BLO), multi-level compositional optimization (MCO), and reinforcement learning (specifically, actor-critic methods). However, designing provably efficient federated algorithms for MSA has remained elusive, even for the special case of double-sequence approximation (DSA). Towards this goal, we develop FedMSA, the first federated algorithm for MSA, and establish its near-optimal communication complexity. As core novelties, (i) FedMSA enables the provable estimation of hypergradients in BLO and MCO via local client updates, which has been a notable bottleneck in prior theory, and (ii) our convergence guarantees are sensitive to the heterogeneity level of the problem. We also incorporate momentum and variance-reduction techniques to achieve further acceleration, leading to near-optimal rates. Finally, we provide experiments that support our theory and demonstrate the empirical benefits of FedMSA; for example, FedMSA enables order-of-magnitude savings in communication rounds compared to prior federated BLO schemes.
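For intuition, here is a single-node (non-federated) sketch of the double-sequence special case on a toy quadratic bilevel problem of my choosing; FedMSA's local client updates, momentum, and variance reduction are not shown.

```python
import jax
import jax.numpy as jnp

# Fast sequence y tracks the inner solution y*(x) = A^{-1} x;
# slow sequence x descends the outer objective evaluated at y*(x).
A = jnp.array([[2.0, 0.5], [0.5, 1.0]])
inner = lambda x, y: 0.5 * y @ A @ y - x @ y        # minimized at y = A^{-1} x
outer = lambda x, y: 0.5 * jnp.sum((y - 1.0) ** 2)  # want y*(x) = ones

key = jax.random.PRNGKey(0)
x, y = jnp.zeros(2), jnp.zeros(2)
for _ in range(3000):
    key, k1, k2 = jax.random.split(key, 3)
    # Fast sequence: noisy gradient step on the inner problem.
    y = y - 0.1 * (jax.grad(inner, 1)(x, y) + 0.01 * jax.random.normal(k1, (2,)))
    # Slow sequence: hypergradient via implicit differentiation, dy*/dx = A^{-1}.
    hyper = jnp.linalg.solve(A, jax.grad(outer, 1)(x, y)) + 0.01 * jax.random.normal(k2, (2,))
    x = x - 0.02 * hyper
# At convergence y tracks A^{-1} x and approaches ones, so x approaches A @ ones.
```

The two coupled updates run on different timescales; the federated question addressed above is how to estimate the hypergradient from local client updates without communicating every step.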
We investigate error bounds for numerical solutions of divergence-structure linear elliptic PDEs on compact manifolds without boundary. Our focus is on a class of monotone finite difference approximations, which provide a strong form of stability guaranteeing the existence of a bounded solution. In many settings, including the Dirichlet problem, it is easy to show that the resulting solution error is proportional to the formal consistency error of the scheme. We make the surprising observation that this need not be true for PDEs posed on compact manifolds without boundary. We propose a particular class of approximation schemes built around an underlying monotone scheme with consistency error $O(h^\alpha)$. By carefully constructing barrier functions, we prove that the solution error is bounded by $O(h^{\alpha/(d+1)})$ in dimension $d$. We also provide a specific example where this predicted convergence rate is observed numerically. Using these error bounds, we further design a family of provably convergent approximations to the solution gradient.
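As a minimal baseline for this setting, here is a symmetric monotone scheme for $-(a u')' = f$ on the circle, a compact 1-manifold without boundary, where the singular but consistent linear system is pinned down by fixing the additive constant. This benign example attains its formal second-order rate; the $O(h^{\alpha/(d+1)})$ degradation concerns the more general monotone schemes analyzed in the paper.

```python
import jax.numpy as jnp

def solve(n):
    h = 2 * jnp.pi / n
    x = jnp.arange(n) * h
    ap = 2.0 + jnp.cos(x + h / 2)            # a(x) at half-grid points a_{i+1/2}
    am = jnp.roll(ap, 1)                     # a_{i-1/2}
    f = 2 * jnp.sin(x) + jnp.sin(2 * x)      # manufactured so that u(x) = sin(x)
    # Monotone divergence-form stencil with periodic wrap-around.
    L = (jnp.diag(ap + am)
         - jnp.diag(ap[:-1], 1) - jnp.diag(am[1:], -1))
    L = L.at[0, -1].add(-am[0]).at[-1, 0].add(-ap[-1])
    u = jnp.linalg.lstsq(L / h**2, f)[0]     # singular but consistent system
    u = u - u.mean()                          # fix the additive constant
    return float(jnp.max(jnp.abs(u - jnp.sin(x))))

for n in (32, 64, 128):
    print(n, solve(n))                        # errors shrink roughly like O(h^2)
```

On the circle there is no boundary to anchor the solution, so the discrete operator has constants in its kernel; fixing the mean plays the role the Dirichlet data play in the boundary-value setting.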