Neural operators, as novel neural architectures for rapidly approximating the solution operators of partial differential equations (PDEs), have shown considerable promise for future scientific computing. However, mainstream training of neural operators is still data-driven, requiring an expensive ground-truth dataset from various sources (e.g., PDE samples solved with conventional solvers, or real-world experiments) in addition to the cost of the training stage itself. From a computational perspective, marrying operator learning with specific domain knowledge to solve PDEs is an essential step toward reducing dataset costs and enabling label-free learning. We propose a novel paradigm that provides a unified framework for training neural operators and solving PDEs with the variational form, which we refer to as variational operator learning (VOL). Ritz and Galerkin approaches with finite element discretization are developed for VOL to achieve matrix-free approximation of the system functional and residual; direct minimization and iterative update are then proposed as two optimization strategies for VOL. Experiments on reasonable benchmarks involving a variable heat source, Darcy flow, and variable-stiffness elasticity are conducted to demonstrate the effectiveness of VOL. With a label-free training set and a shift set of only 5 labels, VOL learns solution operators whose test errors decrease as a power law of the amount of unlabeled data. To the best of the authors' knowledge, this is the first study to integrate the perspectives of the weak form and of efficient iterative methods for solving sparse linear systems into an end-to-end operator learning task.
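To make the Ritz idea concrete, here is a minimal, hedged sketch of a variational (energy-based) training loss in PyTorch for a 1D Poisson problem -u'' = f with homogeneous Dirichlet boundaries on a uniform grid; the names (`ritz_loss`, `model`) and the finite-difference quadrature are our illustrative choices, not VOL's actual finite element implementation.

```python
import torch

def ritz_loss(u, f, h):
    """Ritz energy J(u) = 1/2 * int |u'|^2 dx - int f*u dx, assembled
    matrix-free with forward differences and a rectangle-rule quadrature.
    u, f: tensors of shape (batch, n_nodes); h: grid spacing."""
    du = (u[:, 1:] - u[:, :-1]) / h           # approximate u' on each interval
    energy = 0.5 * h * (du ** 2).sum(dim=1)   # 1/2 * int |u'|^2 dx
    source = h * (f * u).sum(dim=1)           # int f*u dx
    return (energy - source).mean()           # minimized without labels

# usage sketch: u_pred = model(f); loss = ritz_loss(u_pred, f, h); loss.backward()
```

Minimizing this functional over the network output drives it toward the weak solution, which is what allows the training set to remain label-free.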
We study the optimal sample complexity of neighbourhood selection in linear structural equation models, and compare this to best subset selection (BSS) for linear models under general design. We show by example that -- even when the structure is \emph{unknown} -- the existence of underlying structure can reduce the sample complexity of neighbourhood selection. This result is complicated by the possibility of path cancellation, which we study in detail, and show that improvements are still possible in the presence of path cancellation. Finally, we support these theoretical observations with experiments. The proof introduces a modified BSS estimator, called klBSS, and compares its performance to BSS. The analysis of klBSS may also be of independent interest since it applies to arbitrary structured models, not necessarily those induced by a structural equation model. Our results have implications for structure learning in graphical models, which often relies on neighbourhood selection as a subroutine.
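For readers unfamiliar with the baseline, the following is a small, hedged sketch of plain best subset selection (BSS) used as a neighbourhood selector for one node of a linear model; it is the brute-force estimator the paper compares against, not the modified klBSS estimator introduced in the proof.

```python
import numpy as np
from itertools import combinations

def bss_neighbourhood(X, j, k):
    """Return the size-k subset of the remaining columns that minimizes the
    residual sum of squares when regressing column j of X on that subset."""
    _, p = X.shape
    y = X[:, j]
    others = [i for i in range(p) if i != j]
    best_rss, best_S = np.inf, ()
    for S in combinations(others, k):
        A = X[:, S]
        beta = np.linalg.lstsq(A, y, rcond=None)[0]  # least-squares fit
        r = y - A @ beta
        rss = float(r @ r)
        if rss < best_rss:
            best_rss, best_S = rss, S
    return best_S
```

The exhaustive search over subsets is what makes sample-complexity (rather than computational) analysis the interesting question here.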
We present a new data-driven reduced-order modeling approach to efficiently solve parametrized partial differential equations (PDEs) for many-query problems. This work is inspired by the concept of implicit neural representation (INR), which models physics signals in a continuous manner, independent of spatial/temporal discretization. The proposed framework encodes the PDE and utilizes a parametrized neural ODE (PNODE) to learn latent dynamics characterized by multiple PDE parameters. The PNODE can be generated by a hypernetwork to mitigate the difficulty of learning it directly as a complex multilayer perceptron (MLP). The framework uses an INR to decode the latent dynamics and reconstruct accurate PDE solutions. Furthermore, a physics-informed loss is introduced to correct the predictions for unseen parameter instances; incorporating this loss also enables the model to be fine-tuned in an unsupervised manner on unseen PDE parameters. A numerical experiment is performed on the two-dimensional Burgers equation with a large variation of PDE parameters. We evaluate the proposed method at a large Reynolds number and obtain speedups of up to O(10^3) with roughly 1% relative error with respect to the ground truth.
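The following is a schematic sketch, in our own notation rather than the paper's code, of the two main pieces: a parametrized neural ODE advancing a latent state z conditioned on PDE parameters mu, and an INR decoder mapping query coordinates x and the latent state to the solution value. Forward Euler stands in for whatever ODE solver the framework actually uses.

```python
import torch
import torch.nn as nn

class PNODE(nn.Module):
    """Latent dynamics z'(t) = f_theta(z, mu), integrated with forward Euler."""
    def __init__(self, dz, dmu, width=64):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dz + dmu, width), nn.Tanh(),
                               nn.Linear(width, dz))
    def forward(self, z0, mu, dt, steps):
        z = z0
        for _ in range(steps):
            z = z + dt * self.f(torch.cat([z, mu], dim=-1))
        return z

class INRDecoder(nn.Module):
    """Continuous decoder: (query coordinates x, latent z) -> u(x)."""
    def __init__(self, dx, dz, width=64):
        super().__init__()
        self.g = nn.Sequential(nn.Linear(dx + dz, width), nn.Tanh(),
                               nn.Linear(width, 1))
    def forward(self, x, z):
        # x: (n_points, dx); z: (1, dz) broadcast to every query point
        return self.g(torch.cat([x, z.expand(x.shape[0], -1)], dim=-1))
```

Because the decoder takes continuous coordinates, predictions are not tied to the training mesh, which is the INR property the abstract highlights.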
Ordinary differential equations (ODEs) can provide mechanistic models of temporally local changes of processes, where parameters are often informed by external knowledge. While ODEs are popular in systems modeling, they are less established for statistical modeling of longitudinal cohort data, e.g., in a clinical setting. Yet, modeling of local changes could also be attractive for assessing the trajectory of an individual in a cohort in the immediate future given their current status, where ODE parameters could be informed by further characteristics of the individual. However, several hurdles have so far limited such use of ODEs compared to regression-based function-fitting approaches. The potentially higher level of noise in cohort data might be detrimental to ODEs, as the shape of the ODE solution heavily depends on the initial value. In addition, larger numbers of variables multiply such problems and might be difficult for ODEs to handle. To address this, we propose to use each observation in the course of time as the initial value to obtain multiple local ODE solutions and to build a combined estimator of the underlying dynamics. Neural networks are used to obtain a low-dimensional latent space for dynamic modeling from a potentially large number of variables, and to obtain patient-specific ODE parameters from baseline variables. Simultaneous identification of dynamic models and of a latent space is enabled by recently developed differentiable programming techniques. We illustrate the proposed approach in an application with spinal muscular atrophy patients and a corresponding simulation study. In particular, modeling of local changes in health status at any point in time is contrasted with the interpretation of functions obtained from a global regression. This more generally highlights how different application settings might demand different modeling strategies.
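As a hedged illustration of the "each observation as initial value" idea, the sketch below solves the ODE locally from every observed time point and scores the mismatch at the next observation; the fitting criterion and solver choice are our simplifications of the combined estimator described above.

```python
import numpy as np
from scipy.integrate import solve_ivp

def local_solutions_loss(rhs, params, t_obs, y_obs):
    """Sum of squared errors of local ODE solutions, each started at an
    observation (t_i, y_i) and evaluated at the next time point t_{i+1}.
    rhs(t, y, params) is the model's vector field; y_obs is (n_obs, dim)."""
    loss = 0.0
    for i in range(len(t_obs) - 1):
        sol = solve_ivp(lambda t, y: rhs(t, y, params),
                        (t_obs[i], t_obs[i + 1]), y_obs[i],
                        t_eval=[t_obs[i + 1]])
        loss += np.sum((sol.y[:, -1] - y_obs[i + 1]) ** 2)
    return loss
```

Restarting from every observation keeps each solution segment short, which is what limits the sensitivity to a single noisy initial value.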
Designing efficient, high-accuracy numerical methods for the complex dynamics of incompressible magnetohydrodynamics (MHD) equations remains a challenging problem in various analysis and design tasks. This is mainly due to the nonlinear coupling of the magnetic and velocity fields through convection and Lorentz forces, together with multiple physical constraints, which limit numerical computation. In this paper, we develop MHDnet, a physics-preserving learning approach to solving MHD problems, in which three different mathematical formulations are considered, named the $B$, $A_1$, and $A_2$ formulations. These formulations are embedded into MHDnet, which preserves the underlying physical properties and the divergence-free condition. Moreover, MHDnet combines multi-mode feature merging with a multiscale neural network architecture, which accelerates the convergence of the neural networks (NN) by alleviating the interaction of magnetic-fluid coupling across different frequency modes. Furthermore, the pressure fields of the three formulations, as hidden states, can be obtained without extra data or computational cost. Several numerical experiments are presented to demonstrate the performance of the proposed MHDnet compared with different NN architectures and numerical formulations.
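To illustrate how a potential-based formulation can enforce the divergence-free condition by construction (in the spirit of the $A$ formulations; the 2D setting and all names here are our illustrative assumptions), one can represent the magnetic field through a scalar potential and differentiate with autograd:

```python
import torch

def B_from_potential(A_net, xy):
    """Given a network A_net predicting a scalar potential A(x, y),
    return B = curl(A e_z) = (dA/dy, -dA/dx), which satisfies
    div B = d2A/dxdy - d2A/dydx = 0 analytically."""
    xy = xy.requires_grad_(True)
    A = A_net(xy)                                   # shape (n_points, 1)
    grads = torch.autograd.grad(A.sum(), xy, create_graph=True)[0]
    Bx, By = grads[:, 1], -grads[:, 0]
    return torch.stack([Bx, By], dim=-1)
```

With this parametrization no divergence penalty is needed in the loss, which is one way a network can "preserve" a physical constraint exactly rather than approximately.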
We propose a second order exponential scheme suitable for two-component coupled systems of stiff evolutionary advection--diffusion--reaction equations in two and three space dimensions. It is based on a directional splitting of the involved matrix functions, which allows for a simple yet efficient implementation through the computation of small-sized exponential-like functions and tensor-matrix products. The procedure straightforwardly extends to the case of an arbitrary number of components and to any space dimension. Several numerical examples in 2D and 3D with physically relevant (advective) Schnakenberg, FitzHugh--Nagumo, DIB, and advective Brusselator models clearly show the advantage of the approach against state-of-the-art techniques.
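The key implementation idea, small 1D exponential-like functions applied through tensor-matrix products, can be sketched as follows for a 2D Kronecker-sum operator; this toy shows the exact factorization of the exponential itself, while the scheme above applies the same directional-splitting structure to the phi-functions of the integrator (where it is an approximation).

```python
import numpy as np
from scipy.linalg import expm

def exp_action_2d(Ax, Ay, U, tau):
    """Action of exp(tau * (I (x) Ax + Ay (x) I)) on vec(U), returned in
    matrix form: since the two directional terms commute, it equals
    expm(tau*Ax) @ U @ expm(tau*Ay).T, using only small 1D exponentials
    and tensor-matrix products instead of the large 2D matrix."""
    return expm(tau * Ax) @ U @ expm(tau * Ay).T
```

The memory and cost savings come from never forming the Kronecker-sum matrix, only the two small directional factors, and the same pattern extends componentwise to three dimensions.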
For the numerical solution of the cubic nonlinear Schr\"{o}dinger equation with periodic boundary conditions, a pseudospectral method in space combined with a filtered Lie splitting scheme in time is considered. This scheme is shown to converge even for initial data with very low regularity. In particular, for data in $H^s(\mathbb T^2)$, where $s>0$, convergence of order $\mathcal O(\tau^{s/2}+N^{-s})$ is proved in $L^2$. Here $\tau$ denotes the time step size and $N$ the number of Fourier modes considered. The proof of this result is carried out in an abstract framework of discrete Bourgain spaces, the final convergence result, however, is given in $L^2$. The stated convergence behavior is illustrated by several numerical examples.
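A hedged sketch of one filtered Lie splitting step for the cubic NLS $i\partial_t u = -\Delta u + |u|^2 u$ on a 1D periodic grid is given below (the paper works on $\mathbb T^2$; 1D keeps the sketch short, and the simple sharp Fourier cutoff used here stands in for the paper's filter).

```python
import numpy as np

def lie_splitting_step(u, tau, L, N):
    """One Lie splitting step: exact nonlinear flow in physical space,
    then filtered exact linear flow in Fourier space. L: period; N: number
    of retained Fourier modes."""
    n = u.size
    k = 2 * np.pi / L * np.fft.fftfreq(n, d=1.0 / n)  # wave numbers
    u = np.exp(-1j * tau * np.abs(u) ** 2) * u        # nonlinear flow (|u| const)
    uh = np.fft.fft(u)
    uh[np.abs(k) > np.pi * N / L] = 0.0               # projection onto low modes
    return np.fft.ifft(np.exp(-1j * tau * k ** 2) * uh)  # linear flow exp(i*tau*Laplacian)
```

The filter is what tames the loss of regularity at $s>0$: high frequencies generated by the nonlinear step are discarded before the linear flow, matching the $N^{-s}$ term in the stated error bound.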
Neural operators have been explored as surrogate models for simulating physical systems to overcome the limitations of traditional partial differential equation (PDE) solvers. However, most existing operator learning methods assume that the data originate from a single physical mechanism, limiting their applicability and performance in more realistic scenarios. To this end, we propose the Physical Invariant Attention Neural Operator (PIANO) to decipher and integrate physical invariants (PI) for operator learning from PDE series with various physical mechanisms. PIANO employs self-supervised learning to extract physical knowledge and attention mechanisms to integrate it into dynamic convolutional layers. Compared to existing techniques, PIANO reduces the relative error by 13.6\%-82.2\% on PDE forecasting tasks across varying coefficients, forces, or boundary conditions. Additionally, various downstream tasks reveal that the PI embeddings deciphered by PIANO align well with the underlying invariants in the PDE systems, verifying the physical significance of PIANO. The source code will be publicly available at: //github.com/optray/PIANO.
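The following is a schematic sketch, with our own naming rather than PIANO's actual code, of a dynamic convolution whose kernel bank is mixed by attention weights computed from a physical-invariant embedding e, which is the integration mechanism the abstract describes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv(nn.Module):
    """Kernel bank mixed per sample by attention over a PI embedding."""
    def __init__(self, ch, d_embed, n_kernels=4, ksize=3):
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(n_kernels, ch, ch, ksize, ksize) * 0.02)
        self.attn = nn.Linear(d_embed, n_kernels)   # embedding -> mixing logits

    def forward(self, x, e):
        # x: (B, C, H, W); e: (B, d_embed)
        a = F.softmax(self.attn(e), dim=-1)                     # (B, n_kernels)
        w = torch.einsum('bk,koihw->boihw', a, self.weight)     # per-sample kernels
        B, C, H, W = x.shape
        out = F.conv2d(x.reshape(1, B * C, H, W),               # grouped-conv trick
                       w.reshape(-1, C, *w.shape[-2:]),
                       padding=w.shape[-1] // 2, groups=B)
        return out.reshape(B, -1, H, W)
```

Conditioning the kernels on e lets one operator adapt to samples governed by different coefficients or forcings, instead of assuming a single mechanism across the dataset.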
We introduce numerical solvers for the steady-state Boltzmann equation based on the symmetric Gauss-Seidel (SGS) method. Due to the quadratic collision operator in the Boltzmann equation, the SGS method requires solving a nonlinear system on each grid cell, for which we consider two methods, namely Newton's method and fixed-point iteration, in our numerical tests. For small Knudsen numbers, the efficiency of our method lies between that of the classical source iteration and the modern generalized synthetic iterative scheme, while its implementation complexity is closer to that of source iteration. A variety of numerical tests are carried out to demonstrate its performance, and we conclude that the proposed method is suitable for applications with moderate to large Knudsen numbers.
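A generic, hedged sketch of the outer structure is shown below: a symmetric (forward-then-backward) Gauss-Seidel sweep with a nonlinear per-cell solve by fixed-point iteration, where `local_update(f, i)` abstracts the cell problem that would come from the discretized transport and quadratic collision terms (our abstraction, not the paper's discretization).

```python
def sgs_sweep(f, local_update, inner_iters=5):
    """One symmetric Gauss-Seidel sweep over the cells of f.
    f: mutable list/array of per-cell unknowns;
    local_update(f, i): one fixed-point update of cell i using the
    most recent values of its neighbours."""
    n = len(f)
    order = list(range(n)) + list(range(n - 1, -1, -1))  # forward + backward
    for i in order:
        for _ in range(inner_iters):                     # nonlinear cell solve
            f[i] = local_update(f, i)
    return f
```

Replacing the inner fixed-point loop with a Newton solve for the cell system gives the second variant considered in the tests.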
The main reason for the query model's prominence in complexity theory and quantum computing is the presence of concrete lower-bounding techniques: the polynomial and adversary methods. There have been considerable efforts to prove lower bounds using these methods, and to compare and relate them to other measures based on the decision tree. We explore the value of these lower bounds on quantum query complexity, and their relation to other decision-tree-based complexity measures, for the class of symmetric functions, arguably one of the most natural and basic classes of Boolean functions. We give an explicit construction of the dual of the positive adversary method, and also of the square root of the private-coin certificate game complexity, for any total symmetric function; this shows that the two values cannot be separated for any symmetric function. Additionally, we show that the recently introduced measure of spectral sensitivity takes the same value as both the positive adversary bound and the approximate degree for every total symmetric Boolean function. Further, we study the quantum query complexity of Gap Majority, a partial symmetric function that has recently gained importance for understanding the composition of randomized query complexity. We characterize the quantum query complexity of Gap Majority and prove a lower bound on noisy randomized query complexity (Ben-David and Blais, FOCS 2020) in terms of quantum query complexity. Finally, we study how large certificate complexity and block sensitivity can be compared to sensitivity for symmetric functions (even up to constant factors), and show tight separations, i.e., we give upper bounds on the possible separations and construct functions achieving them.
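For reference, one standard formulation of the positive (weighted) adversary bound discussed above, in our notation: for a (possibly partial) Boolean function $f$,
\[
  \mathrm{ADV}(f) \;=\; \max_{\Gamma \neq 0}\; \frac{\|\Gamma\|}{\max_i \|\Gamma \circ D_i\|},
\]
where $\Gamma$ ranges over entrywise-nonnegative symmetric matrices indexed by inputs with $\Gamma[x,y]=0$ whenever $f(x)=f(y)$, $D_i[x,y]=1$ iff $x_i \neq y_i$, $\circ$ denotes the entrywise (Hadamard) product, the norm is the spectral norm, and the quantum query complexity satisfies $Q(f) = \Omega(\mathrm{ADV}(f))$. The dual of this maximization is the object constructed explicitly for total symmetric functions.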
Numerical methods for computing the solutions of Markov backward stochastic differential equations (BSDEs) driven by continuous-time Markov chains (CTMCs) are explored. The main contributions of this paper are as follows: (1) we observe that Euler-Maruyama temporal discretization methods for solving Markov BSDEs driven by CTMCs are equivalent to exponential integrators for solving the associated systems of ordinary differential equations (ODEs); (2) we introduce multi-stage Euler-Maruyama methods for effectively solving "stiff" Markov BSDEs driven by CTMCs; these BSDEs typically arise from the spatial discretization of Markov BSDEs driven by Brownian motion; (3) we propose a multilevel spatial discretization method on sparse grids that efficiently approximates high-dimensional Markov BSDEs driven by Brownian motion with a combination of multiple Markov BSDEs driven by CTMCs on grids with different resolutions. We also illustrate the effectiveness of the presented methods with a number of numerical experiments in which we treat nonlinear BSDEs arising from option pricing problems in finance.
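To make the equivalence in point (1) concrete, here is a generic, hedged sketch of the exponential (Euler) integrator for a semilinear ODE system $y' = Ay + g(y)$; the matrix $A$ and nonlinearity $g$ are placeholders standing in for the systems obtained from Markov BSDEs driven by CTMCs, and $A$ is assumed invertible so that $\varphi_1$ can be formed via a linear solve.

```python
import numpy as np
from scipy.linalg import expm

def exponential_euler(A, g, y0, dt, steps):
    """Exponential Euler: y_{n+1} = exp(dt*A) y_n + dt*phi_1(dt*A) g(y_n),
    where dt*phi_1(dt*A) = A^{-1} (exp(dt*A) - I)."""
    n = A.shape[0]
    E = expm(dt * A)                          # matrix exponential, fixed step
    Phi = np.linalg.solve(A, E - np.eye(n))   # A^{-1}(E - I), assumes A invertible
    y = y0
    for _ in range(steps):
        y = E @ y + Phi @ g(y)
    return y
```

Treating the stiff linear part exactly through the exponential is what lets such schemes take step sizes unconstrained by the stiffness, which motivates the multi-stage variants proposed in point (2).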