Physics-informed neural networks and operator networks have shown promise for effectively solving equations modeling physical systems. However, these networks can be difficult or impossible to train accurately for some systems of equations. We present a novel multifidelity framework for stacking physics-informed neural networks and operator networks that facilitates training. We successively build a chain of networks, where the output at one step can act as a low-fidelity input for training the next step, gradually increasing the expressivity of the learned model. The equations imposed at each step of the iterative process can be the same or different (akin to simulated annealing). The iterative (stacking) nature of the proposed method allows us to progressively learn features of a solution that are hard to learn directly. Through benchmark problems including a nonlinear pendulum, the wave equation, and the viscous Burgers equation, we show how stacking can be used to improve the accuracy and reduce the required size of physics-informed neural networks and operator networks.
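To make the stacking idea concrete, here is a minimal PyTorch sketch of how such a chain could be trained on a pendulum-like ODE. The network sizes, the toy residual u' = -sin(u), and all names are illustrative assumptions, not the authors' implementation; initial-condition losses are omitted for brevity.

    import torch
    import torch.nn as nn

    def mlp(in_dim, width=32):
        return nn.Sequential(nn.Linear(in_dim, width), nn.Tanh(),
                             nn.Linear(width, width), nn.Tanh(),
                             nn.Linear(width, 1))

    def physics_loss(net, prev, t):
        t = t.clone().requires_grad_(True)
        u = net(torch.cat([t, prev(t)], dim=1))    # previous stage = low-fidelity input
        u_t = torch.autograd.grad(u, t, torch.ones_like(u), create_graph=True)[0]
        return ((u_t + torch.sin(u)) ** 2).mean()  # toy residual for u' = -sin(u)

    prev = lambda t: torch.zeros_like(t)           # stage 0 sees no prior solution
    t_train = torch.rand(256, 1)
    for k in range(3):                             # successively stacked stages
        net = mlp(in_dim=2)
        opt = torch.optim.Adam(net.parameters(), lr=1e-3)
        for _ in range(2000):
            opt.zero_grad()
            physics_loss(net, prev, t_train).backward()
            opt.step()
        for p in net.parameters():                 # freeze the finished stage
            p.requires_grad_(False)
        prev = lambda t, f=net, p=prev: f(torch.cat([t, p(t)], dim=1))

Each stage is cheap to train because it only has to learn a correction on top of the frozen chain, which is how the expressivity of the learned model grows step by step.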
In this article, we propose an interval constraint programming method for globally solving catalog-based categorical optimization problems. It supports catalogs of arbitrary size and properties of arbitrary dimension, and does not require any modeling effort from the user. A novel catalog-based contractor (or filtering operator) guarantees consistency between the categorical properties and the existing catalog items. This results in an intuitive and generic approach that is exact, rigorous (robust to roundoff errors), and easy to implement in an off-the-shelf interval-based continuous solver that interleaves branching and constraint propagation. We demonstrate the validity of the approach on a numerical problem in which a categorical variable is described by a two-dimensional property space. A Julia prototype is available as open-source software under the MIT license at https://github.com/cvanaret/CateGOrical.jl
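The released prototype is in Julia; purely as an illustration of the contractor idea, here is a hedged Python sketch in which a property-space box is contracted to the interval hull of the catalog items it contains (the paper's contractor additionally rounds outward to remain rigorous under floating point, which this sketch omits). All names and the toy catalog are hypothetical.

    catalog = [(1.0, 2.5), (1.8, 0.9), (3.2, 1.1), (0.4, 4.0)]   # 2-D properties

    def contract(box, items):
        """box: list of (lo, hi) per dimension; returns contracted box or None."""
        inside = [p for p in items
                  if all(lo <= x <= hi for x, (lo, hi) in zip(p, box))]
        if not inside:
            return None                  # no catalog item fits: prune this branch
        return [(min(p[d] for p in inside), max(p[d] for p in inside))
                for d in range(len(box))]

    print(contract([(0.0, 2.0), (0.0, 3.0)], catalog))  # [(1.0, 1.8), (0.9, 2.5)]

An empty contraction certifies that no catalog item is compatible with the current branch, which is what lets the solver prune exactly rather than heuristically.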
By conceiving physical systems as 3D many-body point clouds, geometric graph neural networks (GNNs), such as SE(3)/E(3)-equivariant GNNs, have showcased promising performance. In particular, their effective message-passing mechanisms make them adept at modeling molecules and crystalline materials. However, current geometric GNNs only offer a mean-field approximation of the many-body system, encapsulated within two-body message passing, and thus fall short in capturing intricate relationships within these geometric graphs. To address this limitation, tensor networks, widely employed in computational physics to handle many-body systems using high-order tensors, have been introduced. Nevertheless, integrating these tensorized networks into the message-passing framework of GNNs faces scalability and symmetry-conservation (e.g., permutation and rotation) challenges. In response, we introduce an equivariant Matrix Product State (MPS)-based message-passing strategy, built on an efficient implementation of the tensor contraction operation. Our method effectively models complex many-body relationships, going beyond the mean-field approximation, and captures symmetries within geometric graphs. Importantly, it seamlessly replaces the standard message-passing and layer-aggregation modules intrinsic to geometric GNNs. We empirically validate the superior accuracy of our approach on benchmark tasks, including predicting classical Newton systems and quantum tensor Hamiltonian matrices. To our knowledge, our approach represents the first use of parameterized geometric tensor networks.
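The scalability claim rests on contracting an MPS instead of materializing one exponentially large interaction tensor. The following hedged PyTorch sketch shows that contraction pattern with illustrative shapes; it omits the equivariance handling and the surrounding message-passing machinery of the actual method.

    import torch

    def mps_contract(cores, features):
        """cores[i]: (bond_in, d, bond_out) tensor; features[i]: (d,) vector."""
        v = torch.ones(1)                         # left boundary, bond dim 1
        for G, x in zip(cores, features):
            M = torch.einsum('adb,d->ab', G, x)   # absorb one body's features
            v = v @ M                             # cost O(r^2) per body, not d^n
        return v                                  # right boundary, bond dim 1

    d, r, n = 8, 4, 5                             # feature dim, bond rank, bodies
    cores = [torch.randn(1 if i == 0 else r, d, r if i < n - 1 else 1)
             for i in range(n)]
    feats = [torch.randn(d) for _ in range(n)]
    print(mps_contract(cores, feats))             # a learned many-body coupling

Because the bond rank r is fixed, the cost grows linearly in the number of bodies rather than exponentially, which is what makes the tensorized message passing tractable.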
Starting from the Kirchhoff-Huygens representation and Duhamel's principle for time-domain wave equations, we propose novel butterfly-compressed Hadamard integrators for self-adjoint wave equations in both the time and frequency domains in an inhomogeneous medium. First, we incorporate the leading term of Hadamard's ansatz into the Kirchhoff-Huygens representation to develop a short-time valid propagator. Second, using the Fourier transform in time, we derive the corresponding Eulerian short-time propagator in the frequency domain; on top of this propagator, we further develop a time-frequency-time (TFT) method for the Cauchy problem of time-domain wave equations. Third, we propose the time-frequency-time-frequency (TFTF) method for the corresponding point-source Helmholtz equation, which provides Green's functions of the Helmholtz equation for all angular frequencies within a given frequency band. Fourth, to implement the TFT and TFTF methods efficiently, we introduce butterfly algorithms to compress oscillatory integral kernels at different frequencies. As a result, the proposed methods can construct the wave field beyond caustics implicitly and advance spatially overturning waves in time naturally, with quasi-optimal computational complexity and memory usage. Furthermore, once constructed, the Hadamard integrators can be employed to solve both time-domain wave equations with various initial conditions and frequency-domain wave equations with different point sources. Numerical examples for two-dimensional wave equations illustrate the accuracy and efficiency of the proposed methods.
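As a purely schematic illustration of the time-frequency-time structure (none of the Hadamard-ansatz or butterfly machinery is reproduced here), the TFT step can be pictured as: FFT in time, a per-frequency application of a frequency-domain propagator, and an inverse FFT. In this NumPy sketch, apply_propagator is a hypothetical stand-in for the compressed Helmholtz-type solve.

    import numpy as np

    def tft_advance(u_hist, dt, apply_propagator):
        """u_hist: (n_t, n_x) short-time wave history on a spatial grid."""
        U = np.fft.rfft(u_hist, axis=0)                    # time -> frequency
        freqs = 2 * np.pi * np.fft.rfftfreq(u_hist.shape[0], d=dt)
        for k, w in enumerate(freqs):                      # one propagator
            U[k] = apply_propagator(U[k], w)               # application per
        return np.fft.irfft(U, n=u_hist.shape[0], axis=0)  # frequency, then back

    # Identity propagator as a smoke test: the round trip returns the input.
    same = tft_advance(np.random.rand(64, 32), 1e-3, lambda Uk, w: Uk)

In the paper, the per-frequency operators are precisely the oscillatory kernels that the butterfly algorithms compress, which is why the method attains quasi-optimal complexity.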
Pseudo-Hamiltonian neural networks (PHNN) were recently introduced for learning dynamical systems that can be modelled by ordinary differential equations. In this paper, we extend the method to partial differential equations. The resulting model comprises up to three neural networks, modelling terms representing conservation, dissipation, and external forces, together with discrete convolution operators that can either be learned or be given as input. We demonstrate numerically the superior performance of PHNN compared to a baseline model that models the full dynamics by a single neural network. Moreover, since the PHNN model consists of three parts with different physical interpretations, these can be studied separately to gain insight into the system, and the learned model remains applicable even if the external forces are removed or changed.
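A minimal sketch of the three-part decomposition, with hypothetical names: the actual PHNN learns scalar integrals together with (possibly learned) discrete convolution operators, whereas this sketch collapses each term into a generic network purely to show how the parts compose and why, for instance, the forcing term can later be zeroed out or swapped.

    import torch
    import torch.nn as nn

    def small_net(in_dim, out_dim):
        return nn.Sequential(nn.Linear(in_dim, 64), nn.Tanh(), nn.Linear(64, out_dim))

    class PseudoHamiltonianRHS(nn.Module):
        def __init__(self, n):
            super().__init__()
            self.conserving = small_net(n, n)      # conservative term
            self.dissipating = small_net(n, n)     # dissipative term
            self.forcing = small_net(n + 1, n)     # external force, may depend on t

        def forward(self, u, t):
            f = self.forcing(torch.cat([u, t], dim=-1))
            return self.conserving(u) - self.dissipating(u) + f

    rhs = PseudoHamiltonianRHS(n=16)
    du_dt = rhs(torch.randn(4, 16), torch.zeros(4, 1))   # batched evaluation

Because the three terms are separate modules, each can be inspected on its own after training, which is exactly the interpretability benefit the abstract describes.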
Physics-informed neural networks (PINNs) and their variants have recently emerged as alternatives to traditional partial differential equation (PDE) solvers, but little literature has focused on devising accurate numerical integration methods for neural networks (NNs), which is essential for getting accurate solutions. In this work, we propose adaptive quadratures for the accurate integration of neural networks and apply them to loss functions appearing in low-dimensional PDE discretisations. We show that, at opposite ends of the spectrum, continuous piecewise linear (CPWL) activation functions enable one to bound the integration error, while smooth activations ease the convergence of the optimisation problem. We strike a balance by considering a CPWL approximation of a smooth activation function. The CPWL activation is used to obtain an adaptive decomposition of the domain into regions where the network is almost linear, and we derive an adaptive global quadrature from this mesh. The loss function is then obtained by evaluating the smooth network (together with other quantities, e.g., the forcing term) at the quadrature points. We propose a method to approximate a class of smooth activations by CPWL functions and show that it has a quadratic convergence rate. We then derive an upper bound for the overall integration error of our proposed adaptive quadrature. The benefits of our quadrature are evaluated on strong and weak formulations of the Poisson equation in one and two dimensions. Our numerical experiments suggest that, compared to Monte Carlo integration, our adaptive quadrature makes the convergence of NNs faster and more robust to parameter initialisation, while requiring significantly fewer integration points and keeping training times similar.
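In one dimension the construction can be pictured as follows: the knots of a CPWL surrogate of the activation induce a mesh of near-linear cells, a fixed low-order Gauss rule is applied per cell, and the smooth integrand is evaluated at the resulting nodes. This is a hedged toy sketch with a single activation standing in for the network; the paper composes breakpoints through all layers.

    import numpy as np

    breakpoints = np.linspace(-3.0, 3.0, 25)          # knots of the CPWL surrogate
    cpwl = lambda x: np.interp(x, breakpoints, np.tanh(breakpoints))

    def adaptive_quadrature(f, cells):
        """Two-point Gauss-Legendre rule on each cell of the induced mesh."""
        a, b = cells[:-1], cells[1:]
        mid, half = (a + b) / 2, (b - a) / 2
        g = 1 / np.sqrt(3)                            # nodes at mid +- half/sqrt(3)
        return np.sum(half * (f(mid - g * half) + f(mid + g * half)))

    xs = np.linspace(-3, 3, 1001)
    print(np.max(np.abs(cpwl(xs) - np.tanh(xs))))     # CPWL approximation error
    smooth_loss = lambda x: np.tanh(x) ** 2           # evaluate the smooth function
    exact = 6 - 2 * np.tanh(3.0)                      # integral of tanh^2 on [-3, 3]
    print(abs(adaptive_quadrature(smooth_loss, breakpoints) - exact))

Note that the quadrature nodes come from the CPWL mesh while the integrand itself stays smooth, mirroring the paper's split between the surrogate used for adaptivity and the network used in the loss.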
The effect that different police protest management methods have on protesters' physical and mental trauma is still not well understood and is a matter of debate. In this paper, we take a two-pronged approach to gain insight into this issue. First, we perform statistical analysis on time series data of protests provided by ACLED, spanning the period from January 1, 2020, until March 13, 2021. We observe that the use of kinetic impact projectiles is associated with more protests in subsequent days and is also a better predictor of the number of deaths in subsequent days than the number of protests is, concluding that the use of non-lethal weapons appears to have an inflammatory rather than suppressive effect on protests. Next, we provide a mathematical framework to model modern but well-established psychological and sociological research on compliance theory and crowd dynamics. Our results show that understanding the heterogeneity of the crowd is key for protests that lead to a reduction of social tension and minimization of physical and mental trauma in protesters.
Multiscale stochastic dynamical systems have been widely applied to a variety of scientific and engineering problems due to their capability of depicting complex phenomena in many real-world applications. This work is devoted to investigating the effective dynamics of slow-fast stochastic dynamical systems. Given short-term observation data satisfying some unknown slow-fast stochastic system, we propose a novel algorithm, centered on a neural network called Auto-SDE, to learn the invariant slow manifold. Our approach captures the evolutionary nature of the data through a series of time-dependent autoencoder neural networks, with the loss constructed from a discretized stochastic differential equation. Numerical experiments under various evaluation metrics validate that our algorithm is accurate, stable, and effective.
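A loose sketch of the loss construction, with hypothetical names and sizes: an autoencoder maps snapshots to slow-manifold coordinates, and the latent path is penalized against an Euler-Maruyama step of a learned SDE. The actual Auto-SDE uses a series of time-dependent autoencoders and includes the noise term omitted here for brevity.

    import torch
    import torch.nn as nn

    enc = nn.Sequential(nn.Linear(10, 64), nn.Tanh(), nn.Linear(64, 2))
    dec = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 10))
    drift = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 2))

    def auto_sde_loss(x_t, x_next, dt):
        z_t, z_next = enc(x_t), enc(x_next)
        recon = ((dec(z_t) - x_t) ** 2).mean()        # autoencoder consistency
        em_step = z_t + drift(z_t) * dt               # deterministic part of an
        sde = ((z_next - em_step) ** 2).mean()        # Euler-Maruyama step
        return recon + sde

    loss = auto_sde_loss(torch.randn(128, 10), torch.randn(128, 10), dt=0.01)

Coupling the reconstruction term with the discretized-SDE term is what forces the latent coordinates to parameterize a manifold on which the slow dynamics actually evolve.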
Detecting early warning indicators of abrupt dynamical transitions in complex systems or high-dimensional observation data is essential in many real-world applications, such as brain diseases, natural disasters, financial crises, and engineering reliability. To this end, we develop a novel approach: a directed anisotropic diffusion map that captures the latent evolutionary dynamics on a low-dimensional manifold. Three effective warning signals (the Onsager-Machlup Indicator, the Sample Entropy Indicator, and the Transition Probability Indicator) are then derived from the latent coordinates and the latent stochastic dynamical systems. To validate our framework, we apply this methodology to authentic electroencephalogram (EEG) data. We find that our early warning indicators are capable of detecting the tipping point during state transitions. This framework not only bridges the latent dynamics with real-world data but also shows potential for the automatic labeling of complex high-dimensional time series.
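As a stand-in for the directed anisotropic construction (which this sketch does not reproduce), a plain diffusion-map embedding illustrates where the latent coordinates used by the three indicators come from: a kernel matrix is normalized into a Markov transition matrix, and its leading non-trivial eigenvectors serve as latent coordinates.

    import numpy as np

    def diffusion_map(X, eps=1.0, n_coords=2):
        D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        K = np.exp(-D2 / eps)                         # Gaussian affinity kernel
        P = K / K.sum(axis=1, keepdims=True)          # row-stochastic transition
        vals, vecs = np.linalg.eig(P)
        order = np.argsort(-vals.real)
        return vecs[:, order[1:n_coords + 1]].real    # skip trivial eigenvector

    coords = diffusion_map(np.random.randn(200, 16))  # latent coordinates (200, 2)

The indicators in the paper are then computed on these latent coordinates rather than on the raw high-dimensional observations.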
We hypothesize that, due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate a model's dependence on each modality, we compute the gain in accuracy when the model has access to that modality in addition to another one. We refer to this gain as the conditional utilization rate. In our experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
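In code, the definition above amounts to a difference of two evaluations; evaluate_accuracy is a hypothetical routine standing in for whatever evaluation loop the model uses.

    def conditional_utilization(evaluate_accuracy, m1, m2, dataset):
        """u(m1 | m2): the accuracy gained by adding m1 on top of m2 alone."""
        acc_both = evaluate_accuracy(dataset, modalities=[m1, m2])
        acc_m2_only = evaluate_accuracy(dataset, modalities=[m2])
        return acc_both - acc_m2_only

    # A large gap between u(m1 | m2) and u(m2 | m1) signals greedy reliance
    # on one modality, which the proposed balancing algorithm corrects.

Since evaluating this quantity requires retraining or re-evaluating under modality ablations, it is too expensive to track during training, which is what motivates the conditional-learning-speed proxy.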
Graph representation learning for hypergraphs can be used to extract patterns among the higher-order interactions that are critically important in many real-world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic across learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention-based graph neural network called Hyper-SAGNN, applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms state-of-the-art methods on traditional tasks while also achieving strong performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.
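Because candidate hyperedges have variable size, the scoring head must be set-size-agnostic. The sketch below shows one generic way to achieve this with self-attention and pooling; it is inspired by, but not identical to, the Hyper-SAGNN architecture (which contrasts static and dynamic node embeddings rather than pooling directly).

    import torch
    import torch.nn as nn

    class HyperedgeScorer(nn.Module):
        def __init__(self, dim=64):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
            self.readout = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())

        def forward(self, node_feats):                # (batch, set_size, dim)
            h, _ = self.attn(node_feats, node_feats, node_feats)
            return self.readout(h.mean(dim=1))        # pool over the hyperedge

    scorer = HyperedgeScorer()
    p3 = scorer(torch.randn(8, 3, 64))                # size-3 candidate tuples
    p5 = scorer(torch.randn(8, 5, 64))                # size-5 tuples, same weights

Sharing one set of attention parameters across tuple sizes is what lets a single model predict hyperedges of any cardinality.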