Differential equations are indispensable to engineering and hence to innovation. In recent years, physics-informed neural networks (PINN) have emerged as a novel method for solving differential equations. PINN method has the advantage of being meshless, scalable, and can potentially be intelligent in terms of transferring the knowledge learned from solving one differential equation to the other. The exploration in this field has majorly been limited to solving linear-elasticity problems, crack propagation problems. This study uses PINNs to solve coupled thermo-mechanics problems of materials with functionally graded properties. An in-depth analysis of the PINN framework has been carried out by understanding the training datasets, model architecture, and loss functions. The efficacy of the PINN models in solving thermo-mechanics differential equations has been measured by comparing the obtained solutions either with analytical solutions or finite element method-based solutions. While R2 score of more than 99% has been achieved in predicting primary variables such as displacement and temperature fields, achieving the same for secondary variables such as stress turns out to be more challenging. This study is the first to implement the PINN framework for solving coupled thermo-mechanics problems on composite materials. This study is expected to enhance the understanding of the novel PINN framework and will be seminal for further research on PINNs.
We study the theory of neural network (NN) from the lens of classical nonparametric regression problems with a focus on NN's ability to adaptively estimate functions with heterogeneous smoothness --- a property of functions in Besov or Bounded Variation (BV) classes. Existing work on this problem requires tuning the NN architecture based on the function spaces and sample sizes. We consider a "Parallel NN" variant of deep ReLU networks and show that the standard weight decay is equivalent to promoting the $\ell_p$-sparsity ($0<p<1$) of the coefficient vector of an end-to-end learned function bases, i.e., a dictionary. Using this equivalence, we further establish that by tuning only the weight decay, such Parallel NN achieves an estimation error arbitrarily close to the minimax rates for both the Besov and BV classes. Notably, it gets exponentially closer to minimax optimal as the NN gets deeper. Our research sheds new lights on why depth matters and how NNs are more powerful than kernel methods.
We consider the Cauchy problem for the Helmholtz equation with a domain in R^d, d>2 with N cylindrical outlets to infinity with bounded inclusions in R^{d-1}. Cauchy data are prescribed on the boundary of the bounded domains and the aim is to find solution on the unbounded part of the boundary. In 1989, Kozlov and Maz'ya proposed an alternating iterative method for solving Cauchy problems associated with elliptic,self-adjoint and positive-definite operators in bounded domains. Different variants of this method for solving Cauchy problems associated with Helmholtz-type operators exists. We consider the variant proposed by Mpinganzima et al. for bounded domains and derive the necessary conditions for the convergence of the procedure in unbounded domains. For the numerical implementation, a finite difference method is used to solve the problem in a simple rectangular domain in R^2 that represent a truncated infinite strip. The numerical results shows that by appropriate truncation of the domain and with appropriate choice of the Robin parameters, the Robin-Dirichlet alternating iterative procedure is convergent.
In this paper we present a deep learning method to predict the temporal evolution of dissipative dynamic systems. We propose using both geometric and thermodynamic inductive biases to improve accuracy and generalization of the resulting integration scheme. The first is achieved with Graph Neural Networks, which induces a non-Euclidean geometrical prior with permutation invariant node and edge update functions. The second bias is forced by learning the GENERIC structure of the problem, an extension of the Hamiltonian formalism, to model more general non-conservative dynamics. Several examples are provided in both Eulerian and Lagrangian description in the context of fluid and solid mechanics respectively, achieving relative mean errors of less than 3% in all the tested examples. Two ablation studies are provided based on recent works in both physics-informed and geometric deep learning.
Molecular dynamics (MD) has long been the \emph{de facto} choice for modeling complex atomistic systems from first principles, and recently deep learning become a popular way to accelerate it. Notwithstanding, preceding approaches depend on intermediate variables such as the potential energy or force fields to update atomic positions, which requires additional computations to perform back-propagation. To waive this requirement, we propose a novel model called ScoreMD by directly estimating the gradient of the log density of molecular conformations. Moreover, we analyze that diffusion processes highly accord with the principle of enhanced sampling in MD simulations, and is therefore a perfect match to our sequential conformation generation task. That is, ScoreMD perturbs the molecular structure with a conditional noise depending on atomic accelerations and employs conformations at previous timeframes as the prior distribution for sampling. Another challenge of modeling such a conformation generation process is that the molecule is kinetic instead of static, which no prior studies strictly consider. To solve this challenge, we introduce a equivariant geometric Transformer as a score function in the diffusion process to calculate the corresponding gradient. It incorporates the directions and velocities of atomic motions via 3D spherical Fourier-Bessel representations. With multiple architectural improvements, we outperforms state-of-the-art baselines on MD17 and isomers of C7O2H10. This research provides new insights into the acceleration of new material and drug discovery.
Molecular mechanics (MM) potentials have long been a workhorse of computational chemistry. Leveraging accuracy and speed, these functional forms find use in a wide variety of applications in biomolecular modeling and drug discovery, from rapid virtual screening to detailed free energy calculations. Traditionally, MM potentials have relied on human-curated, inflexible, and poorly extensible discrete chemical perception rules or applying parameters to small molecules or biopolymers, making it difficult to optimize both types and parameters to fit quantum chemical or physical property data. Here, we propose an alternative approach that uses graph neural networks to perceive chemical environments, producing continuous atom embeddings from which valence and nonbonded parameters can be predicted using invariance-preserving layers. Since all stages are built from smooth neural functions, the entire process is modular and end-to-end differentiable with respect to model parameters, allowing new force fields to be easily constructed, extended, and applied to arbitrary molecules. We show that this approach is not only sufficiently expressive to reproduce legacy atom types, but that it can learn to accurately reproduce and extend existing molecular mechanics force fields. Trained with arbitrary loss functions, it can construct entirely new force fields self-consistently applicable to both biopolymers and small molecules directly from quantum chemical calculations, with superior fidelity than traditional atom or parameter typing schemes. When trained on the same quantum chemical small molecule dataset used to parameterize the openff-1.2.0 small molecule force field augmented with a peptide dataset, the resulting espaloma model shows superior accuracy vis-\`a-vis experiments in computing relative alchemical free energy calculations for a popular benchmark set.
Over the years, many graph problems specifically those in NP-complete are studied by a wide range of researchers. Some famous examples include graph colouring, travelling salesman problem and subgraph isomorphism. Most of these problems are typically addressed by exact algorithms, approximate algorithms and heuristics. There are however some drawback for each of these methods. Recent studies have employed learning-based frameworks such as machine learning techniques in solving these problems, given that they are useful in discovering new patterns in structured data that can be represented using graphs. This research direction has successfully attracted a considerable amount of attention. In this survey, we provide a systematic review mainly on classic graph problems in which learning-based approaches have been proposed in addressing the problems. We discuss the overview of each framework, and provide analyses based on the design and performance of the framework. Some potential research questions are also suggested. Ultimately, this survey gives a clearer insight and can be used as a stepping stone to the research community in studying problems in this field.
Multigrid is a powerful solver for large-scale linear systems arising from discretized partial differential equations. The convergence theory of multigrid methods for symmetric positive definite problems has been well developed over the past decades, while, for nonsymmetric problems, such theory is still not mature. As a foundation for multigrid analysis, two-grid convergence theory plays an important role in motivating multigrid algorithms. Regarding two-grid methods for nonsymmetric problems, most previous works focus on the spectral radius of iteration matrix or rely on convergence measures that are typically difficult to compute in practice. Moreover, the existing results are confined to two-grid methods with exact solution of the coarse-grid system. In this paper, we analyze the convergence of a two-grid method for nonsymmetric positive definite problems (e.g., linear systems arising from the discretizations of convection-diffusion equations). In the case of exact coarse solver, we establish an elegant identity for characterizing two-grid convergence factor, which is measured by a smoother-induced norm. The identity can be conveniently used to derive a class of optimal restriction operators and analyze how the convergence factor is influenced by restriction. More generally, we present some convergence estimates for an inexact variant of the two-grid method, in which both linear and nonlinear coarse solvers are considered.
We design the helicity-conservative physics-informed neural network model for the Navier-Stokes equation in the ideal case. The key is to provide an appropriate PDE model as loss function so that its neural network solutions produce helicity conservation. Physics-informed neural network model is based on the strong form of PDE. We show that the relevant helicity-conservative finite element method based on the weak formulation of PDE can be somewhat different. More precisely, we compares the PINN formulation and the finite element method based on the weak formulation for conserving helicity and argues that for the conservation, strong PDE is more natural. Our result is justified by theory as well. Furthermore, a couple of numerical calculations are demonstrated to confirm our theoretical finding.
The Monge-Amp\`ere equation is a fully nonlinear partial differential equation (PDE) of fundamental importance in analysis, geometry and in the applied sciences. In this paper we solve the Dirichlet problem associated with the Monge-Amp\`ere equation using neural networks and we show that an ansatz using deep input convex neural networks can be used to find the unique convex solution. As part of our analysis we study the effect of singularities, discontinuities and noise in the source function, we consider nontrivial domains, and we investigate how the method performs in higher dimensions. We also compare this method to an alternative approach in which standard feed-forward networks are used together with a loss function which penalizes lack of convexity.
This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio of the network controlling the deviations from the infinite-width Gaussian description. We explain how these effectively-deep networks learn nontrivial representations from training and more broadly analyze the mechanism of representation learning for nonlinear models. From a nearly-kernel-methods perspective, we find that the dependence of such models' predictions on the underlying learning algorithm can be expressed in a simple and universal way. To obtain these results, we develop the notion of representation group flow (RG flow) to characterize the propagation of signals through the network. By tuning networks to criticality, we give a practical solution to the exploding and vanishing gradient problem. We further explain how RG flow leads to near-universal behavior and lets us categorize networks built from different activation functions into universality classes. Altogether, we show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks. By using information-theoretic techniques, we estimate the optimal aspect ratio at which we expect the network to be practically most useful and show how residual connections can be used to push this scale to arbitrary depths. With these tools, we can learn in detail about the inductive bias of architectures, hyperparameters, and optimizers.