In this paper we develop a neural network for the numerical simulation of time-dependent linear transport equations with diffusive scaling and uncertainties. The goal of the network is to resolve the computational challenges posed by the curse of dimensionality and the multiple scales of the problem. We first show that a standard Physics-Informed Neural Network (PINN) fails to capture the multiscale nature of the problem, which justifies the need for Asymptotic-Preserving Neural Networks (APNNs). We show that not all classical AP formulations are suited to the neural network approach. We construct a neural network based on a micro-macro decomposition and build a mass conservation mechanism into the loss function in order to capture the dynamic and multiscale nature of the solutions. Numerical examples demonstrate the effectiveness of these APNNs.
We consider using right-preconditioned GMRES (AB-GMRES) to obtain the minimum-norm solution of inconsistent underdetermined systems of linear equations. Morikuni (Ph.D. thesis, 2013) showed that for some inconsistent and ill-conditioned problems, the iterates may diverge. This is mainly because the Hessenberg matrix in the GMRES method becomes very ill-conditioned, so that the back substitution for the resulting triangular system becomes numerically unstable. We propose a stabilized GMRES based on solving the normal equations corresponding to the above triangular system using the standard Cholesky decomposition. This has the effect of shifting upwards the tiny singular values of the Hessenberg matrix that lead to an inaccurate solution. We analyze why the method works. Numerical experiments show that the proposed method is robust and efficient, not only for applying AB-GMRES to underdetermined systems, but also for applying GMRES to severely ill-conditioned range-symmetric systems of linear equations.
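The core step described above, replacing back substitution on a triangular system by a Cholesky solve of its normal equations, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the function name and the 2-by-2 demo data are hypothetical, the surrounding GMRES iteration is omitted, and the example is well conditioned, so it only checks that the two routes agree and does not reproduce the stabilizing effect seen in finite precision on severely ill-conditioned matrices.

```python
import numpy as np

def solve_upper_triangular_via_normal_equations(R, t):
    """Solve R y = t for upper-triangular R via R^T R y = R^T t and Cholesky.

    Hypothetical helper for illustration only.
    """
    A = R.T @ R                      # normal-equations matrix (SPD when R is nonsingular)
    rhs = R.T @ t
    L = np.linalg.cholesky(A)        # A = L L^T with L lower triangular
    z = np.linalg.solve(L, rhs)      # solve L z = rhs (general solver used for brevity)
    return np.linalg.solve(L.T, z)   # solve L^T y = z

# Small well-conditioned demo: 3*y2 = 3 and 2*y1 + y2 = 3
R_demo = np.array([[2.0, 1.0],
                   [0.0, 3.0]])
t_demo = np.array([3.0, 3.0])
y_demo = solve_upper_triangular_via_normal_equations(R_demo, t_demo)
```

Note that `np.linalg.cholesky` raises `LinAlgError` when the matrix is not numerically positive definite, so a practical implementation would need to handle that case.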
In this article we develop the Constraint Energy Minimizing Generalized Multiscale Finite Element Method (CEM-GMsFEM) for elliptic partial differential equations with inhomogeneous Dirichlet, Neumann, and Robin boundary conditions, where high contrast arises in the coefficients of both the elliptic operator and the Robin boundary condition. Through a careful construction of the multiscale bases of the CEM-GMsFEM, we introduce two operators $\mathcal{D}^m$ and $\mathcal{N}^m$, which are used to handle inhomogeneous Dirichlet and Neumann boundary values and are proved to converge independently of the contrast ratio as the oversampling regions are enlarged. We provide an a priori error estimate and show that the number of oversampling layers is the key factor in controlling numerical errors. A series of experiments is conducted, and the results reflect the reliability of our methods even at high contrast ratios.
Understanding the fundamental mechanism behind the success of deep neural networks is one of the key challenges in the modern machine learning literature. Despite numerous attempts, a solid theoretical analysis is yet to be developed. In this paper, we develop a novel unified framework to reveal a hidden regularization mechanism through the lens of convex optimization. We first show that the training of multiple three-layer ReLU sub-networks with weight decay regularization can be equivalently cast as a convex optimization problem in a higher dimensional space, where sparsity is enforced via a group $\ell_1$-norm regularization. Consequently, ReLU networks can be interpreted as high dimensional feature selection methods. More importantly, we then prove that the equivalent convex problem can be globally optimized by a standard convex optimization solver with a polynomial-time complexity with respect to the number of samples and data dimension when the width of the network is fixed. Finally, we numerically validate our theoretical results via experiments involving both synthetic and real datasets.
We introduce a new numerical method for solving time-harmonic acoustic scattering problems. The main focus is on plane waves scattered by smoothly varying material inhomogeneities. The proposed method works for any frequency $\omega$, but is especially efficient for high-frequency problems. It is based on a time-domain approach and consists of three steps: \emph{i)} computation of a suitable incoming plane wavelet with compact support in the propagation direction; \emph{ii)} solution of a scattering problem in the time domain for the incoming plane wavelet; \emph{iii)} reconstruction of the time-harmonic solution from the time-domain solution via a Fourier transform in time. An essential ingredient of the new method is a front-tracking mesh adaptation algorithm for solving the problem in \emph{ii)}. By exploiting the limited support of the wave front, this makes the number of degrees of freedom required to reach a given accuracy significantly less dependent on the frequency $\omega$. We also present a new algorithm for computing the Fourier transform in \emph{iii)} that exploits the reduced number of degrees of freedom corresponding to the adapted meshes. Numerical examples demonstrate the advantages of the proposed method and show that it can also be applied with external source terms such as point sources and sound-soft scatterers. The gain in efficiency, however, is limited in the presence of trapping modes.
In this work we present a novel bulk-surface virtual element method (BSVEM) for the numerical approximation of elliptic bulk-surface partial differential equations (BSPDEs) in three space dimensions. The BSVEM is based on the discretisation of the bulk domain into polyhedral elements with arbitrarily many faces. The polyhedral approximation of the bulk induces a polygonal approximation of the surface. Firstly, we present a geometric error analysis of bulk-surface polyhedral meshes independent of the numerical method. Then, we show that BSVEM has optimal second-order convergence in space, provided the exact solution is $H^{2+3/4}$ in the bulk and $H^2$ on the surface, where the additional $\frac{3}{4}$ is due to the combined effect of surface curvature and polyhedral elements close to the boundary. We show that general polyhedra can be exploited to reduce the computational time of the matrix assembly. To demonstrate optimal convergence results, a numerical example is presented on the unit sphere.
The magnetohydrodynamics (MHD) equations are continuum models used in the study of a wide range of plasma physics systems, including the evolution of complex plasma dynamics in tokamak disruptions. However, developing efficient numerical solution methods for MHD is extremely challenging due to disparate time and length scales, strong hyperbolic phenomena, and nonlinearity. Therefore the development of scalable, implicit MHD algorithms and high-resolution adaptive mesh refinement strategies is of considerable importance. In this work, we develop a high-order stabilized finite-element algorithm for the reduced visco-resistive MHD equations based on the MFEM finite element library (mfem.org). The scheme is fully implicit and is solved with the Jacobian-free Newton-Krylov (JFNK) method with a physics-based preconditioning strategy. Our preconditioning strategy is a generalization of the physics-based preconditioning methods in [Chacon, et al, JCP 2002] to adaptive, stabilized finite elements. Algebraic multigrid methods are used to invert sub-block operators to achieve scalability. A parallel adaptive mesh refinement (AMR) scheme with dynamic load-balancing is implemented to efficiently resolve the multi-scale spatial features of the system. Our implementation uses the MFEM framework, which provides arbitrary-order polynomials and flexible adaptive conforming and non-conforming mesh capabilities. Results demonstrate the accuracy, efficiency, and scalability of the implicit scheme in the presence of large scale disparity. The potential of the AMR approach is demonstrated on an island coalescence problem in the high Lundquist-number regime ($\ge 10^7$) with the successful resolution of plasmoid instabilities and thin current sheets.
We propose a test procedure to compare $K$ copulas simultaneously, with $K \geq 2$. The $K$ observed populations can be paired. The test statistic is based on the differences between orthogonal projection coefficients associated with the copula densities, which we call {\it copula coefficients}. The procedure is data-driven, and we obtain a chi-square asymptotic distribution of the test statistic under the null. We illustrate our procedure via numerical studies and on two real datasets. Finally, a clustering algorithm is derived from the $K$-sample test, and its performance is illustrated in a simulation experiment.
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks. In realistic learning scenarios, heterogeneity across different clients' local datasets poses an optimization challenge and may severely degrade generalization performance. In this paper, we investigate and identify the limitations of several decentralized optimization algorithms for different degrees of data heterogeneity. We propose a novel momentum-based method to mitigate this decentralized training difficulty. We show in extensive empirical experiments on various CV/NLP datasets (CIFAR-10, ImageNet, and AG News) and several network topologies (Ring and Social Network) that our method is much more robust to the heterogeneity of clients' data than other existing methods, with a significant improvement in test performance ($1\% \!-\! 20\%$). Our code is publicly available.
We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can train by maximum likelihood, without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.
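The forward pass described above, where a small network defines the derivative of the hidden state and an ODE solver produces the output, can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' implementation: all names (`rk4_solve`, `make_dynamics`, `ode_block`), the tanh network, and the random weights are hypothetical, and a fixed-step RK4 routine stands in for the black-box solver. Training by backpropagating through the solver is not shown.

```python
import numpy as np

def rk4_solve(f, h0, t0, t1, steps=20):
    """Fixed-step classical RK4; a stand-in for any black-box ODE solver."""
    h = np.asarray(h0, dtype=float)
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = f(t, h)
        k2 = f(t + dt / 2, h + dt / 2 * k1)
        k3 = f(t + dt / 2, h + dt / 2 * k2)
        k4 = f(t + dt, h + dt * k3)
        h = h + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return h

def make_dynamics(W1, b1, W2, b2):
    """Return f(t, h) = W2 tanh(W1 h + b1) + b2, the learned derivative dh/dt."""
    def f(t, h):
        return W2 @ np.tanh(W1 @ h + b1) + b2
    return f

def ode_block(h0, params, solver=rk4_solve, t0=0.0, t1=1.0, **solver_kwargs):
    """Map the input state h(t0) to the output state h(t1) by integrating dh/dt."""
    return solver(make_dynamics(*params), h0, t0, t1, **solver_kwargs)

rng = np.random.default_rng(0)
dim, hidden = 3, 8
params = (rng.normal(size=(hidden, dim)), np.zeros(hidden),
          0.1 * rng.normal(size=(dim, hidden)), np.zeros(dim))
h_in = np.array([1.0, 0.0, -1.0])
h_coarse = ode_block(h_in, params, steps=4)   # cheaper, less accurate
h_fine = ode_block(h_in, params, steps=64)    # more solver work, more accurate
```

The `steps` argument is a crude analogue of the precision-for-speed trade-off mentioned above; an adaptive solver would expose error tolerances instead.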
In this paper, we tackle the problem of reducing discrepancies between multiple domains, referred to as multi-source domain adaptation, and consider it under the target shift assumption: in all domains we aim to solve a classification problem with the same output classes, but with label proportions that differ across domains. We design a method based on optimal transport, a theory that is gaining momentum for adaptation problems in machine learning due to its efficiency in aligning probability distributions. Our method performs multi-source adaptation and target shift correction simultaneously by learning the class probabilities of the unlabeled target sample and the coupling that aligns two (or more) probability distributions. Experiments on both synthetic and real-world data related to a satellite image segmentation task show the superiority of the proposed method over the state of the art.