We study the ability of neural networks to steer or control trajectories of continuous time non-linear dynamical systems on graphs, which we represent with neural ordinary differential equations (neural ODEs). To do so, we introduce a neural-ODE control (NODEC) framework and find that it can learn control signals that drive graph dynamical systems into desired target states. While we use loss functions that do not constrain the control energy, our results show that NODEC produces low energy control signals. Finally, we showcase the performance and versatility of NODEC by using it to control a system of more than one thousand coupled, non-linear ODEs.
Bayesian learning via Stochastic Gradient Langevin Dynamics (SGLD) has been suggested for differentially private learning. While previous research provides differential privacy bounds for SGLD when close to convergence or at the initial steps of the algorithm, the question of what differential privacy guarantees can be made in between remains unanswered. This interim region is essential, especially for Bayesian neural networks, as it is hard to guarantee convergence to the posterior. This paper will show that using SGLD might result in unbounded privacy loss for this interim region, even when sampling from the posterior is as differentially private as desired.
Physics-informed neural networks (PINNs) have become a popular choice for solving high-dimensional partial differential equations (PDEs) due to their excellent approximation power and generalization ability. Recently, Extended PINNs (XPINNs) based on domain decomposition methods have attracted considerable attention due to their effectiveness in modeling multiscale and multiphysics problems and their parallelization. However, theoretical understanding on their convergence and generalization properties remains unexplored. In this study, we take an initial step towards understanding how and when XPINNs outperform PINNs. Specifically, for general multi-layer PINNs and XPINNs, we first provide a prior generalization bound via the complexity of the target functions in the PDE problem, and a posterior generalization bound via the posterior matrix norms of the networks after optimization. Moreover, based on our bounds, we analyze the conditions under which XPINNs improve generalization. Concretely, our theory shows that the key building block of XPINN, namely the domain decomposition, introduces a tradeoff for generalization. On the one hand, XPINNs decompose the complex PDE solution into several simple parts, which decreases the complexity needed to learn each part and boosts generalization. On the other hand, decomposition leads to less training data being available in each subdomain, and hence such model is typically prone to overfitting and may become less generalizable. Empirically, we choose five PDEs to show when XPINNs perform better than, similar to, or worse than PINNs, hence demonstrating and justifying our new theory.
Some aspects of nonlocal dynamics on directed and undirected networks for an initial value problem whose Jacobian matrix is a variable-order fractional power of a Laplacian matrix are discussed here. Both directed and undirected graphs are considered. Under appropriate assumptions, the existence, uniqueness, and uniform asymptotic stability of the solutions of the underlying initial value problem are proved. Some examples giving a sample of the behavior of the dynamics are also included.
Bayesian learning via Stochastic Gradient Langevin Dynamics (SGLD) has been suggested for differentially private learning. While previous research provides differential privacy bounds for SGLD when close to convergence or at the initial steps of the algorithm, the question of what differential privacy guarantees can be made in between remains unanswered. This interim region is essential, especially for Bayesian neural networks, as it is hard to guarantee convergence to the posterior. This paper will show that using SGLD might result in unbounded privacy loss for this interim region, even when sampling from the posterior is as differentially private as desired.
In this article, I introduce the differential equation model and review their frequentist and Bayesian computation methods. A numerical example of the FitzHugh-Nagumo model is given.
We provide a control-theoretic perspective on optimal tensor algorithms for minimizing a convex function in a finite-dimensional Euclidean space. Given a function $\Phi: \mathbb{R}^d \rightarrow \mathbb{R}$ that is convex and twice continuously differentiable, we study a closed-loop control system that is governed by the operators $\nabla \Phi$ and $\nabla^2 \Phi$ together with a feedback control law $\lambda(\cdot)$ satisfying the algebraic equation $(\lambda(t))^p\|\nabla\Phi(x(t))\|^{p-1} = \theta$ for some $\theta \in (0, 1)$. Our first contribution is to prove the existence and uniqueness of a local solution to this system via the Banach fixed-point theorem. We present a simple yet nontrivial Lyapunov function that allows us to establish the existence and uniqueness of a global solution under certain regularity conditions and analyze the convergence properties of trajectories. The rate of convergence is $O(1/t^{(3p+1)/2})$ in terms of objective function gap and $O(1/t^{3p})$ in terms of squared gradient norm. Our second contribution is to provide two algorithmic frameworks obtained from discretization of our continuous-time system, one of which generalizes the large-step A-HPE framework and the other of which leads to a new optimal $p$-th order tensor algorithm. While our discrete-time analysis can be seen as a simplification and generalization of~\citet{Monteiro-2013-Accelerated}, it is largely motivated by the aforementioned continuous-time analysis, demonstrating the fundamental role that the feedback control plays in optimal acceleration and the clear advantage that the continuous-time perspective brings to algorithmic design. A highlight of our analysis is that we show that all of the $p$-th order optimal tensor algorithms that we discuss minimize the squared gradient norm at a rate of $O(k^{-3p})$, which complements the recent analysis.
Solving for detailed chemical kinetics remains one of the major bottlenecks for computational fluid dynamics simulations of reacting flows using a finite-rate-chemistry approach. This has motivated the use of fully connected artificial neural networks to predict stiff chemical source terms as functions of the thermochemical state of the combustion system. However, due to the nonlinearities and multi-scale nature of combustion, the predicted solution often diverges from the true solution when these deep learning models are coupled with a computational fluid dynamics solver. This is because these approaches minimize the error during training without guaranteeing successful integration with ordinary differential equation solvers. In the present work, a novel neural ordinary differential equations approach to modeling chemical kinetics, termed as ChemNODE, is developed. In this deep learning framework, the chemical source terms predicted by the neural networks are integrated during training, and by computing the required derivatives, the neural network weights are adjusted accordingly to minimize the difference between the predicted and ground-truth solution. A proof-of-concept study is performed with ChemNODE for homogeneous autoignition of hydrogen-air mixture over a range of composition and thermodynamic conditions. It is shown that ChemNODE accurately captures the correct physical behavior and reproduces the results obtained using the full chemical kinetic mechanism at a fraction of the computational cost.
Data-driven methods are becoming an essential part of computational mechanics due to their unique advantages over traditional material modeling. Deep neural networks are able to learn complex material response without the constraints of closed-form approximations. However, imposing the physics-based mathematical requirements that any material model must comply with is not straightforward for data-driven approaches. In this study, we use a novel class of neural networks, known as neural ordinary differential equations (N-ODEs), to develop data-driven material models that automatically satisfy polyconvexity of the strain energy function with respect to the deformation gradient, a condition needed for the existence of minimizers for boundary value problems in elasticity. We take advantage of the properties of ordinary differential equations to create monotonic functions that approximate the derivatives of the strain energy function with respect to the invariants of the right Cauchy-Green deformation tensor. The monotonicity of the derivatives guarantees the convexity of the energy. The N-ODE material model is able to capture synthetic data generated from closed-form material models, and it outperforms conventional models when tested against experimental data on skin, a highly nonlinear and anisotropic material. We also showcase the use of the N-ODE material model in finite element simulations. The framework is general and can be used to model a large class of materials. Here we focus on hyperelasticity, but polyconvex strain energies are a core building block for other problems in elasticity such as viscous and plastic deformations. We therefore expect our methodology to further enable data-driven methods in computational mechanics
This paper introduces a new model to learn graph neural networks equivariant to rotations, translations, reflections and permutations called E(n)-Equivariant Graph Neural Networks (EGNNs). In contrast with existing methods, our work does not require computationally expensive higher-order representations in intermediate layers while it still achieves competitive or better performance. In addition, whereas existing methods are limited to equivariance on 3 dimensional spaces, our model is easily scaled to higher-dimensional spaces. We demonstrate the effectiveness of our method on dynamical systems modelling, representation learning in graph autoencoders and predicting molecular properties.
We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.