We revisit the relation between gradient equations and Hamilton's equations in information geometry. By regarding a gradient-flow equation in information geometry as Huygens' equation in geometric optics, we relate the gradient flow to the geodesic flow induced by a geodesic Hamiltonian in Riemannian geometry. The original time parameter $t$ in the gradient equations is related to the arc-length parameter on the Riemannian manifold by the Jacobi-Maupertuis transformation. As a by-product, we find a relation between the gradient equations and the replicator equations.
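For orientation, the two objects the abstract relates can be written, in standard notation (not necessarily the authors'), as the gradient flow of a potential $\Psi$ with respect to the Fisher metric $g_{ij}(\theta)$ and the geodesic Hamiltonian on the same manifold:

$$
\frac{d\theta^i}{dt} = -\,g^{ij}(\theta)\,\frac{\partial \Psi}{\partial \theta^j},
\qquad
H(\theta,p) = \frac{1}{2}\,g^{ij}(\theta)\,p_i\,p_j .
$$

The Jacobi-Maupertuis step then reparameterizes trajectories of the first system by Riemannian arc length so that they can be compared with the flow generated by $H$.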
This paper investigates the strong convergence and stability of the truncated Euler-Maruyama (EM) method for stochastic differential delay equations (SDDEs) with variable delay whose coefficients may grow super-linearly. By constructing appropriate truncation functions to control the super-linear growth of the original coefficients, we present a truncated EM scheme for such SDDEs in which the delayed argument is approximated by the value at the nearest grid point on its left. The strong convergence result (without order) of the method is established under local Lipschitz and generalized Khasminskii-type conditions, and the optimal strong convergence order $1/2$ is obtained if global monotonicity with a $U$ function and polynomial growth conditions are added to the assumptions. Moreover, the partially truncated EM method is proved to preserve the mean-square and $H_\infty$ stabilities of the true solutions. Compared with known results on the truncated EM method for SDDEs, a better strong convergence order is obtained under more relaxed conditions on the coefficients, and more refined technical estimates are developed to overcome the challenges arising from the variable delay. Finally, numerical examples confirm the effectiveness of the theoretical results.
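A minimal sketch of one step of a truncated EM scheme of this type for a scalar SDDE; the function names (`h_bound`, `delay`, `x0_hist`) and the concrete clipping-style truncation are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

def truncated_em_sdde(f, g, delay, x0_hist, T, dt, h_bound, rng=None):
    """Truncated Euler-Maruyama for dX = f(X(t), X(t - delay(t))) dt
                                        + g(X(t), X(t - delay(t))) dW.
    Coefficients are evaluated at truncated states to tame super-linear
    growth; the delayed argument is read off at the nearest grid point
    on its left, as described in the abstract."""
    if rng is None:
        rng = np.random.default_rng()
    n = int(T / dt)
    x = np.empty(n + 1)
    x[0] = x0_hist(0.0)

    def trunc(v):                        # illustrative truncation pi_Delta
        r = h_bound(dt)                  # growing bound as dt -> 0
        return np.clip(v, -r, r)

    for k in range(n):
        t = k * dt
        s = t - delay(t)                 # delayed time, may be negative
        j = int(np.floor(s / dt))        # nearest grid point on the left
        x_del = x0_hist(s) if j < 0 else x[j]
        xt, xd = trunc(x[k]), trunc(x_del)
        dW = rng.normal(0.0, np.sqrt(dt))
        x[k + 1] = x[k] + f(xt, xd) * dt + g(xt, xd) * dW
    return x
```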
In the present work, we investigate a cut finite element method for the parameterized system of second-order equations stemming from the splitting approach to a fourth-order nonlinear geometric PDE, namely the Cahn-Hilliard system. We address the instability issues that such methods exhibit when strong nonlinearities appear, and we exploit their characteristic flexibility of a fixed background geometry and mesh, through which one can avoid, e.g., remeshing of parametrized geometries at the full-order level, as well as transformations to reference geometries at the reduced level. As a final goal, we construct an efficient reduced-order basis that is global with respect to the geometrical manifold and independent of geometrical changes. Numerical experiments verify that the POD-Galerkin approach exhibits its strength even for pseudo-random discontinuous initial data.
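The reduced-basis construction at the heart of any POD-Galerkin approach can be sketched generically as a truncated SVD of full-order snapshots; the cut-FEM specifics are beyond this fragment, and the tolerance and names are illustrative:

```python
import numpy as np

def pod_basis(snapshots, tol=1e-6):
    """Extract a reduced basis from full-order snapshots (one column per
    snapshot) by a truncated SVD. Geometry and parameter variations enter
    only through the snapshot set, which is what makes a single basis,
    global over the geometrical manifold, possible."""
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = np.cumsum(s**2) / np.sum(s**2)     # retained-energy fractions
    r = int(np.searchsorted(energy, 1.0 - tol)) + 1
    return U[:, :r]                              # reduced basis, r modes
```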
A dimension reduction method based on the "Nonlinear Level set Learning" (NLL) approach is presented for the pointwise prediction of functions which have been sparsely sampled. Leveraging geometric information provided by the Implicit Function Theorem, the proposed algorithm effectively reduces the input dimension to the theoretical lower bound with minor accuracy loss, providing a one-dimensional representation of the function which can be used for regression and sensitivity analysis. Experiments and applications are presented which compare this modified NLL with the original NLL and the Active Subspaces (AS) method. While accommodating sparse input data, the proposed algorithm is shown to train quickly and provide a much more accurate and informative reduction than either AS or the original NLL on two example functions with high-dimensional domains, as well as two state-dependent quantities depending on the solutions to parametric differential equations.
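For reference, the Active Subspaces baseline the abstract compares against reduces, in its simplest textbook form, to an eigendecomposition of the sampled gradient covariance; this is a sketch of that baseline, not of the authors' modified NLL:

```python
import numpy as np

def active_subspace(grad_samples, k=1):
    """Active Subspaces in its basic form: eigendecompose the empirical
    matrix C = E[grad f(x) grad f(x)^T]; the leading k eigenvectors span
    the directions along which f varies most."""
    C = grad_samples.T @ grad_samples / len(grad_samples)
    w, V = np.linalg.eigh(C)          # ascending eigenvalues
    return V[:, ::-1][:, :k]          # leading k eigenvectors
```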
Scientific machine learning has been successfully applied to inverse problems and PDE discovery in computational physics. One caveat of current methods, however, is the need for large amounts of (clean) data in order to recover full system responses or underlying physical models. Bayesian methods may be particularly promising for overcoming these challenges, as they are naturally less sensitive to sparse and noisy data. In this paper, we propose to use Bayesian neural networks (BNNs) in order to: 1) Recover the full system states from measurement data (e.g., temperature, velocity field, etc.). We use Hamiltonian Monte Carlo to sample the posterior distribution of a deep and dense BNN, and show that it is possible to accurately capture physics of varying complexity without overfitting. 2) Recover the parameters of the underlying partial differential equation (PDE) governing the physical system. Using the trained BNN as a surrogate of the system response, we generate datasets of derivatives potentially comprising the latent PDE of the observed system and perform a Bayesian linear regression (BLR) between the successive derivatives in space and time to recover the original PDE parameters. We take advantage of the confidence intervals on the BNN outputs and introduce the spatial derivative variance into the BLR likelihood to discard the influence of highly uncertain surrogate data points, which allows for more accurate parameter discovery. We demonstrate our approach on a handful of examples from physics and nonlinear dynamics.
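The variance-weighted regression step can be sketched as a standard heteroscedastic Bayesian linear regression; the prior precision `alpha` and the exact weighting are our assumptions, not necessarily the paper's:

```python
import numpy as np

def weighted_blr(Phi, y, var, alpha=1.0):
    """Bayesian linear regression with per-point variances, in the spirit
    of the abstract: Phi holds candidate-derivative features, y the target
    derivative, and `var` the BNN-supplied variances that down-weight
    uncertain surrogate points. Returns the posterior mean and covariance
    of the PDE coefficients under a N(0, alpha^{-1} I) prior."""
    Lam = np.diag(1.0 / var)                      # per-point precisions
    S = np.linalg.inv(alpha * np.eye(Phi.shape[1]) + Phi.T @ Lam @ Phi)
    m = S @ Phi.T @ Lam @ y
    return m, S
```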
In the context of flow visualization, a triple decomposition of the velocity gradient into irrotational straining flow, shear flow, and rigid-body rotational flow was proposed by Kolar in 2007 [V. Kolar, International Journal of Heat and Fluid Flow, 28, 638 (2007)] and has recently received renewed interest. The triple decomposition opens the door to a refined energy stability analysis of the Navier-Stokes equations, with implications for the mathematical analysis of the structure, computability, and regularity of turbulent flow. We here perform an energy stability analysis of turbulent incompressible flow, which suggests a scenario where, at macroscopic scales, any exponentially unstable irrotational straining flow structures rapidly evolve towards linearly unstable shear flow and stable rigid-body rotational flow. This scenario does not rule out irrotational straining flow close to the Kolmogorov microscales, since there viscous dissipation stabilizes the unstable flow structures. In contrast to worst-case energy stability estimates, this refined stability analysis reflects the existence of stable flow structures in turbulence over extended time.
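As background, the triple decomposition refines the classical Cauchy-Stokes split of the velocity gradient; only that classical first step is sketched here, since Kolar's further split into shear, irrotational strain, and rigid rotation additionally requires selecting a so-called basic reference frame:

```python
import numpy as np

def double_decomposition(grad_u):
    """Classical split of the velocity gradient tensor into strain-rate
    and rotation parts. Kolar's triple decomposition goes further: in the
    basic reference frame it first extracts a pure-shear tensor and then
    splits the residual into irrotational strain and rigid-body rotation."""
    S = 0.5 * (grad_u + grad_u.T)      # strain rate (symmetric part)
    W = 0.5 * (grad_u - grad_u.T)      # rotation (antisymmetric part)
    return S, W
```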
We derive boundary conditions and estimates based on the energy and entropy analysis of the nonlinear shallow water equations in two spatial dimensions. It is shown that the energy method provides more details, but is fully consistent with the entropy analysis. The details brought forward by the nonlinear energy analysis allow us to pinpoint where the differences between the linear and nonlinear analyses originate. We find that the result from the linear analysis does not necessarily hold in the nonlinear case. The nonlinear analysis leads in general to a different minimal number of boundary conditions compared with the linear analysis. In particular, and contrary to the linear case, the magnitude of the flow does not influence the number of required boundary conditions.
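For reference, one standard form of the 2D nonlinear shallow water equations and the associated energy density (the paper's notation may differ):

$$
\begin{aligned}
h_t + (hu)_x + (hv)_y &= 0,\\
(hu)_t + \bigl(hu^2 + \tfrac{g}{2}h^2\bigr)_x + (huv)_y &= 0,\\
(hv)_t + (huv)_x + \bigl(hv^2 + \tfrac{g}{2}h^2\bigr)_y &= 0,
\end{aligned}
\qquad
E = \tfrac{1}{2}\bigl(h(u^2+v^2) + gh^2\bigr),
$$

where $h$ is the water depth, $(u,v)$ the velocity, and $g$ the gravitational constant.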
We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.
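A minimal sketch of such a continuous-depth block in PyTorch; a fixed-step RK4 integrator stands in for the adaptive black-box solvers and the O(1)-memory adjoint backpropagation that the paper actually uses (see e.g. torchdiffeq):

```python
import torch
import torch.nn as nn

class ODEFunc(nn.Module):
    """f(t, h): a neural network parameterizing the hidden-state derivative."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.Tanh(),
                                 nn.Linear(64, dim))
    def forward(self, t, h):
        return self.net(h)

def odeblock(f, h0, t0=0.0, t1=1.0, steps=20):
    """Continuous-depth layer: integrate dh/dt = f(t, h) from t0 to t1
    with classical RK4. Gradients flow through the solver steps here;
    the paper's adjoint method avoids storing them."""
    h, dt = h0, (t1 - t0) / steps
    for i in range(steps):
        t = t0 + i * dt
        k1 = f(t, h)
        k2 = f(t + dt / 2, h + dt * k1 / 2)
        k3 = f(t + dt / 2, h + dt * k2 / 2)
        k4 = f(t + dt, h + dt * k3)
        h = h + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return h

# usage: h1 = odeblock(ODEFunc(4), torch.randn(8, 4))
```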
In this paper, we propose a nonlinear distance metric learning scheme based on the fusion of component linear metrics. Instead of merging displacements at each data point, our model calculates the velocities induced by the component transformations via a geodesic interpolation on a Lie transformation group. These velocities are then summed to produce a global transformation that is guaranteed to be diffeomorphic. Consequently, pairwise distances computed this way conform to a smooth and spatially varying metric, which can greatly benefit k-NN classification. Experiments on synthetic and real datasets demonstrate the effectiveness of our model.
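A toy sketch of velocity-level fusion on a matrix Lie group; spatially varying weights and the paper's specific group are not modeled, and existence of a real matrix logarithm is assumed:

```python
import numpy as np
from scipy.linalg import expm, logm

def fuse_transforms(Ms, weights):
    """Map each component linear transformation to its velocity (matrix
    logarithm), take a weighted sum, and map back with the matrix
    exponential. expm of any matrix is invertible, mirroring the
    diffeomorphism guarantee in the abstract. Assumes no M has
    eigenvalues on the closed negative real axis, so logm is real."""
    V = sum(w * logm(M) for w, M in zip(weights, Ms))  # fused velocity
    return np.real_if_close(expm(V))                   # global transform
```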
The concept of Fisher information can be useful even in cases where the probability distributions of interest are not absolutely continuous with respect to the natural reference measure on the underlying space. Practical examples where this extension is useful are provided in the context of multi-object tracking statistical models. Upon defining the Fisher information without introducing a reference measure, we provide remarkably concise proofs of the loss of Fisher information in some widely used multi-object tracking observation models.
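For contrast, the classical definition being generalized does presuppose a density $p_\theta$ with respect to a reference measure $\mu$:

$$
I(\theta) \;=\; \int \Bigl(\frac{\partial}{\partial\theta}\,\log p_\theta(x)\Bigr)^{2}\, p_\theta(x)\,\mu(\mathrm{d}x),
$$

and the abstract's point is precisely that this dependence on $\mu$ can be removed.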
The Fisher information metric is an important foundation of information geometry, as it allows us to approximate the local geometry of a probability distribution. Recurrent neural networks such as Sequence-to-Sequence (Seq2Seq) networks, which have lately been used to achieve state-of-the-art performance on speech translation and image captioning, have so far ignored the geometry of the latent embedding that they iteratively learn. We propose the information-geometric Seq2Seq (GeoSeq2Seq) network, which bridges the gap between deep recurrent neural networks and information geometry. Specifically, the latent embedding offered by a recurrent network is encoded as a Fisher kernel of a parametric Gaussian Mixture Model, a formalism common in computer vision. We utilise such a network to predict the shortest routes between two nodes of a graph by learning the adjacency matrix using the GeoSeq2Seq formalism; our results show that for this problem the probabilistic representation of the latent embedding outperforms the non-probabilistic embedding by 10-15\%.
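The Fisher-kernel encoding of an embedding with respect to a diagonal-covariance GMM can be sketched as follows; only the gradients with respect to the means are shown, and the normalization follows the common computer-vision recipe rather than the paper's exact choices:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector_means(gmm, X):
    """Encode latent vectors X (n, d) as normalized gradients of the GMM
    log-likelihood w.r.t. the component means. Assumes `gmm` was fit with
    covariance_type='diag', e.g.
    gmm = GaussianMixture(K, covariance_type='diag').fit(latents)."""
    gamma = gmm.predict_proba(X)                  # (n, K) responsibilities
    sig = np.sqrt(gmm.covariances_)               # (K, d) diag std devs
    fv = []
    for k in range(gmm.n_components):
        u = (X - gmm.means_[k]) / sig[k]          # whitened residuals
        g = gamma[:, [k]] * u                     # per-point gradients
        fv.append(g.sum(0) / (len(X) * np.sqrt(gmm.weights_[k])))
    return np.concatenate(fv)                     # (K*d,) Fisher vector
```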