In this paper, we propose a data-driven model reduction method to solve parabolic inverse source problems efficiently. Our method consists of offline and online stages. In the offline stage, we explore the low-dimensional structures in the solution space of the parabolic partial differential equations (PDEs) in the forward problem with a given class of source functions and construct a small number of proper orthogonal decomposition (POD) basis functions to achieve significant dimension reduction. Equipped with the POD basis functions, we can solve the forward problem extremely fast in the online stage. Thus, we develop a fast algorithm, referred to as the POD algorithm in this paper, to solve the optimization problem arising in parabolic inverse source problems. Under a weak regularity assumption on the solution of the parabolic PDEs, we prove the convergence of the POD algorithm in solving the forward parabolic PDEs. In addition, we derive an error estimate of the POD algorithm for parabolic inverse source problems. Finally, we present numerical examples to demonstrate the accuracy and efficiency of the proposed method. Our numerical results show that the POD algorithm provides considerable computational savings over the finite element method.
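For illustration, the following is a minimal numpy sketch of the two stages described above, assuming a snapshot matrix of forward solutions and semi-discrete mass and stiffness matrices $M$, $A$, a source vector, and an initial condition are already available; the truncation rule, time stepping, and all names are illustrative rather than the paper's actual implementation.

```python
import numpy as np

def pod_basis(snapshots, tol=1e-6):
    # Offline stage: extract POD modes from a snapshot matrix whose columns are
    # forward solutions sampled in time over the given class of source functions.
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(energy, 1.0 - tol)) + 1   # keep the dominant modes
    return U[:, :r]                                    # POD basis, n_dof x r

def reduced_forward_solve(M, A, f, u0, Phi, dt, n_steps):
    # Online stage: implicit Euler on the Galerkin-reduced parabolic system
    # M du/dt + A u = f restricted to span(Phi), with Phi orthonormal columns.
    Mr, Ar, fr = Phi.T @ M @ Phi, Phi.T @ A @ Phi, Phi.T @ f
    a = Phi.T @ u0                       # reduced initial condition
    lhs = Mr + dt * Ar
    for _ in range(n_steps):
        a = np.linalg.solve(lhs, Mr @ a + dt * fr)
    return Phi @ a                       # lift the reduced state back to full space
```

Because the reduced system has only $r$ unknowns, each forward solve in the online stage costs a small dense solve per time step, which is what makes repeated evaluations inside the inverse-problem optimization cheap.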
Randomization has shown catalyzing effects in linear algebra with promising perspectives for tackling computational challenges in large-scale problems. For solving a system of linear equations, we demonstrate the convergence of a broad class of algorithms that at each step pick a subset of $n$ equations at random and update the iterate with the orthogonal projection onto the affine subspace defined by those equations. We identify, in this context, a specific degree-$n$ polynomial that non-linearly transforms the singular values of the system towards equalization. This transformation of the singular values and the corresponding condition number then characterize the expected convergence rate of the iterations. As a consequence, our results specify the convergence rate of the stochastic gradient descent algorithm, in terms of the mini-batch size $n$, when used for solving systems of linear equations.
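As a minimal illustration of this class of methods, the numpy sketch below performs the random block-projection step; the uniform row sampling without replacement and the fixed iteration count are illustrative choices, not prescribed by the result above.

```python
import numpy as np

def block_projection_solve(A, b, n, n_iters, x0=None, rng=None):
    """At each step, pick n equations of A x = b at random and orthogonally
    project the iterate onto their solution set (a generic sketch of the
    class of algorithms described above)."""
    rng = np.random.default_rng() if rng is None else rng
    m, d = A.shape
    x = np.zeros(d) if x0 is None else x0.copy()
    for _ in range(n_iters):
        rows = rng.choice(m, size=n, replace=False)   # random mini-batch of equations
        A_S, b_S = A[rows], b[rows]
        # orthogonal projection onto {x : A_S x = b_S} via the pseudoinverse
        x += np.linalg.pinv(A_S) @ (b_S - A_S @ x)
    return x
```

For $n = 1$ this reduces to the classical randomized Kaczmarz iteration, where each step projects onto a single hyperplane.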
This article considers the extension of two-grid $hp$-version discontinuous Galerkin finite element methods for the numerical approximation of second-order quasilinear elliptic boundary value problems of monotone type to the case when agglomerated polygonal/polyhedral meshes are employed for the coarse mesh approximation. We recall that within the two-grid setting, while it is necessary to solve a nonlinear problem on the coarse approximation space, only a linear problem must be solved on the original fine finite element space. In this article, the coarse space is constructed by agglomerating elements from the original fine mesh. Here, we extend the existing a priori and a posteriori error analysis for the two-grid $hp$-version discontinuous Galerkin finite element method of [doi:10.1007/s10915-012-9644-1] for coarse meshes consisting of standard element shapes to include arbitrarily agglomerated coarse grids. Moreover, we develop an $hp$-adaptive two-grid algorithm to adaptively design the fine and coarse finite element spaces; we stress that this is undertaken in a fully automatic manner, and hence the method can be viewed as a blackbox solver. Numerical experiments are presented for two- and three-dimensional problems to demonstrate the computational performance of the proposed $hp$-adaptive two-grid method.
Performing exact Bayesian inference for complex models is computationally intractable. Markov chain Monte Carlo (MCMC) algorithms can provide reliable approximations of the posterior distribution but are expensive for large datasets and high-dimensional models. A standard approach to mitigate this complexity consists of using subsampling techniques or distributing the data across a cluster. However, these approaches are typically unreliable in high-dimensional scenarios. We focus here on a recent alternative class of MCMC schemes exploiting a splitting strategy akin to the one used by the celebrated alternating direction method of multipliers (ADMM) optimization algorithm. These methods appear to provide empirically state-of-the-art performance, but their theoretical behavior in high dimension is currently unknown. In this paper, we propose a detailed theoretical study of one of these algorithms known as the split Gibbs sampler. Under regularity conditions, we establish explicit convergence rates for this scheme using Ricci curvature and coupling ideas. We support our theory with numerical illustrations.
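To make the splitting idea concrete, here is a minimal sketch of a split Gibbs sampler for a Gaussian likelihood combined with a Gaussian prior, where both conditionals of the augmented target are exactly Gaussian; the parametrization (noise variance sigma2, prior precision lam, coupling parameter rho2) is an illustrative toy setting, not the general framework studied in the paper.

```python
import numpy as np

def split_gibbs_sampler(y, sigma2, lam, rho2, n_samples, rng=None):
    """Sample the augmented target
    pi(x, z) ~ exp(-||y - x||^2/(2*sigma2) - lam*||z||^2/2 - ||x - z||^2/(2*rho2))
    by alternating the two conditionals; here both are Gaussian, so each step is
    exact, whereas in general they are sampled with problem-specific MCMC kernels."""
    rng = np.random.default_rng() if rng is None else rng
    d = y.shape[0]
    x, z, xs = np.zeros(d), np.zeros(d), []
    for _ in range(n_samples):
        # x | z : combines the data term and the coupling term
        var_x = 1.0 / (1.0 / sigma2 + 1.0 / rho2)
        x = var_x * (y / sigma2 + z / rho2) + np.sqrt(var_x) * rng.standard_normal(d)
        # z | x : combines the prior term and the coupling term
        var_z = 1.0 / (lam + 1.0 / rho2)
        z = var_z * (x / rho2) + np.sqrt(var_z) * rng.standard_normal(d)
        xs.append(x.copy())
    return np.array(xs)
```

As the coupling parameter rho2 decreases, the marginal of x approaches the original posterior, which is the trade-off the theoretical analysis quantifies.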
Finite element approximation of a decoupled formulation of the quad-curl problem is studied in this paper. The decoupling greatly reduces the difficulty of constructing finite elements with the conformity required by the quad-curl problem. For convex domains, where the regularity assumption for the Stokes equations holds, the approximation to the curl of the true solution converges at second order, and at first order in the energy norm. If the solution exhibits singularities, an a posteriori error estimator is developed and an adaptive finite element procedure with separate marking is proposed, and its convergence is proved. Both the a priori and a posteriori error analyses are supported by numerical examples.
A solution manifold is the collection of points in a $d$-dimensional space satisfying a system of $s$ equations with $s<d$. Solution manifolds occur in several statistical problems, including hypothesis testing, curved exponential families, constrained mixture models, partial identification, and nonparametric set estimation. We analyze solution manifolds both theoretically and algorithmically. In terms of theory, we derive five useful results: the smoothness theorem, the stability theorem (which implies the consistency of a plug-in estimator), the convergence of a gradient flow, the local center manifold theorem, and the convergence of the gradient descent algorithm. To numerically approximate a solution manifold, we propose a Monte Carlo gradient descent algorithm. In the case of likelihood inference, we design a manifold-constrained maximization procedure to find the maximum likelihood estimator on the manifold. We also develop a method to approximate a posterior distribution defined on a solution manifold.
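As a small illustration of the gradient-descent component, the sketch below runs gradient descent on $\|F(x)\|^2$, whose zero set is the solution manifold, from randomly drawn starting points; the circle example, step size, and stopping rule are illustrative and not taken from the paper.

```python
import numpy as np

def descend_to_manifold(F, jac_F, x0, step=0.1, n_iters=500, tol=1e-12):
    """Gradient descent on q(x) = ||F(x)||^2 / 2, whose zero set is the solution
    manifold {x : F(x) = 0}.  A Monte Carlo variant repeats this descent from
    many randomly sampled initial points to cover the manifold."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        Fx = F(x)
        if np.dot(Fx, Fx) < tol:
            break
        x = x - step * jac_F(x).T @ Fx    # grad q(x) = J_F(x)^T F(x)
    return x

# Toy example: the unit circle in R^2, i.e. F(x) = x1^2 + x2^2 - 1 (s = 1, d = 2)
F = lambda x: np.array([x[0]**2 + x[1]**2 - 1.0])
jac = lambda x: np.array([[2 * x[0], 2 * x[1]]])
rng = np.random.default_rng(0)
points = np.array([descend_to_manifold(F, jac, rng.uniform(-2, 2, 2))
                   for _ in range(200)])   # Monte Carlo sample lying near the manifold
```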
Since its invention in 2014, the Adam optimizer has received tremendous attention. On one hand, it has been widely used in deep learning and many variants have been proposed, while on the other hand their theoretical convergence property remains to be a mystery. It is far from satisfactory in the sense that some studies require strong assumptions about the updates, which are not necessarily applicable in practice, while other studies still follow the original problematic convergence analysis of Adam, which was shown to be not sufficient to ensure convergence. Although rigorous convergence analysis exists for Adam, they impose specific requirements on the update of the adaptive step size, which are not generic enough to cover many other variants of Adam. To address theses issues, in this extended abstract, we present a simple and generic proof of convergence for a family of Adam-style methods (including Adam, AMSGrad, Adabound, etc.). Our analysis only requires an increasing or large "momentum" parameter for the first-order moment, which is indeed the case used in practice, and a boundness condition on the adaptive factor of the step size, which applies to all variants of Adam under mild conditions of stochastic gradients. We also establish a variance diminishing result for the used stochastic gradient estimators. Indeed, our analysis of Adam is so simple and generic that it can be leveraged to establish the convergence for solving a broader family of non-convex optimization problems, including min-max, compositional, and bilevel optimization problems. For the full (earlier) version of this extended abstract, please refer to arXiv:2104.14840.
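For reference, the update shared by these Adam-style methods can be sketched as follows; the hyperparameter values and the AMSGrad branch are illustrative, and the adaptive factor h is only assumed to satisfy a boundedness condition of the kind mentioned above.

```python
import numpy as np

def adam_style_step(x, grad, state, lr=1e-3, beta1=0.9, beta2=0.999,
                    eps=1e-8, variant="adam"):
    """One update of a generic Adam-style method.  The analysis above asks for a
    large (or increasing) beta1 on the first-order moment; the adaptive factor h
    differs across variants but is assumed bounded."""
    m, v, v_hat, t = state
    t += 1
    m = beta1 * m + (1.0 - beta1) * grad        # first-order moment (momentum)
    v = beta2 * v + (1.0 - beta2) * grad**2     # second-order moment
    if variant == "amsgrad":
        v_hat = np.maximum(v_hat, v)            # AMSGrad keeps a monotone v
        h = np.sqrt(v_hat) + eps
    else:
        h = np.sqrt(v) + eps                    # Adam-like adaptive factor
    x = x - lr * m / h
    return x, (m, v, v_hat, t)

# usage: state0 = (np.zeros_like(x), np.zeros_like(x), np.zeros_like(x), 0)
```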
We investigate data-driven forward and inverse problems for the Yajima-Oikawa (YO) system by employing two techniques that improve the performance of deep physics-informed neural networks (PINNs), namely neuron-wise locally adaptive activation functions and L2-norm parameter regularization. In particular, we not only recover three different forms of vector rogue waves (RWs) in the forward problem of the YO system, including bright-bright RWs, intermediate-bright RWs and dark-bright RWs, but also study the inverse problem of the YO system in a data-driven fashion with noise of different intensities. Compared with the PINN method using only the locally adaptive activation functions, the PINN method with both strategies shows remarkable robustness when studying the inverse problem of the YO system with noisy training data; that is, the improved PINN model we propose has excellent noise immunity. The asymptotic analysis of the wavenumber $k$ and the modulational instability (MI) analysis for the YO system with unknown parameters are derived systematically by applying linearized instability analysis to the plane wave.
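One common form of the first technique can be sketched as a small PyTorch module with a trainable per-neuron slope inside the activation, while the second enters simply as an L2 penalty (weight decay) in the optimizer; the layer sizes, scale factor n, and the input/output dimensions (here $(x,t)$ in, and the long-wave component plus the real and imaginary parts of the short-wave component out) are illustrative assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn

class LAAFLayer(nn.Module):
    """Fully connected layer with a neuron-wise locally adaptive activation:
    each neuron i applies tanh(n * a_i * z_i) with a trainable slope a_i."""
    def __init__(self, in_dim, out_dim, n=10.0):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.a = nn.Parameter(torch.full((out_dim,), 1.0 / n))  # per-neuron slope
        self.n = n

    def forward(self, x):
        return torch.tanh(self.n * self.a * self.linear(x))

net = nn.Sequential(LAAFLayer(2, 64), LAAFLayer(64, 64), nn.Linear(64, 3))
# L2-norm parameter regularization, added here as weight decay in the optimizer
opt = torch.optim.Adam(net.parameters(), lr=1e-3, weight_decay=1e-6)
```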
Linear kinetic transport equations play a critical role in optical tomography, radiative transfer and neutron transport. The fundamental difficulty hampering their efficient and accurate numerical resolution lies in the high dimensionality of the physical and velocity/angular variables and the fact that the problem is multiscale in nature. Leveraging the existence of a hidden low-rank structure hinted at by the diffusive limit, in this work we design and test an angular-space reduced-order model for the linear radiative transfer equation, the first such effort based on the celebrated reduced basis method (RBM). Our method is built upon a high-fidelity solver employing the discrete ordinates method in the angular space, an asymptotic-preserving upwind discontinuous Galerkin method for the physical space, and an efficient synthetically accelerated source iteration for the resulting linear system. Addressing the challenge that the parameter values (i.e., the angular directions) are coupled through an integration operator, the first novel ingredient of our method is an iterative procedure in which the macroscopic density is constructed from the RBM snapshots, treated explicitly so as to allow a transport sweep, and then updated afterwards. A greedy algorithm can then proceed to adaptively select representative samples in the angular space and form a surrogate solution space. The second novelty is a least-squares density reconstruction strategy that, at each of the relevant physical locations, enables robust and accurate integration over an arbitrarily unstructured set of angular samples toward the macroscopic density. Numerical experiments indicate that our method is effective for computational cost reduction in a variety of regimes.
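The second ingredient can be illustrated in the simplest slab-geometry setting, where the angular variable is $\mu \in [-1,1]$: fit the angular samples with a Legendre expansion in the least-squares sense and integrate the fit exactly. The Legendre basis, the polynomial degree, and the normalization $\rho = \tfrac12\int_{-1}^{1}\psi\,d\mu$ are assumptions made for this sketch, not the paper's actual discretization.

```python
import numpy as np
from numpy.polynomial import legendre

def ls_density(psi_samples, mu_samples, degree=4):
    """Least-squares reconstruction of the macroscopic density
    rho = (1/2) * integral_{-1}^{1} psi(mu) d(mu)
    from an arbitrary, possibly unstructured, set of angular samples."""
    # fit psi(mu) ~ sum_k c_k P_k(mu) in the least-squares sense
    V = legendre.legvander(mu_samples, degree)          # Vandermonde in Legendre basis
    coeffs, *_ = np.linalg.lstsq(V, psi_samples, rcond=None)
    # P_0 integrates to 2 over [-1, 1] and the higher modes integrate to 0,
    # so the zeroth coefficient equals (1/2) * integral of the fitted psi
    return coeffs[0]
```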
We introduce and analyze a lower envelope method (LEM) for tracking interface motion in multiphase problems. The main idea of the method is to define the phases as the regions where the lower envelope of a set of functions coincides with exactly one of the functions. We show that a variety of complex lower-dimensional interfaces naturally appear in the process. The evolution of the phases is then achieved by solving a set of transport equations. In the first part of the paper, we establish several theoretical properties, give conditions for well-posed behaviour, and show that the level set method is a particular case of the LEM. In the second part, we propose a LEM-based numerical algorithm for multiphase shape optimization problems. We apply this algorithm to an inverse conductivity problem with three phases and present several numerical results.
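A minimal sketch of the phase identification step described above: given values of the functions on a grid, a point is assigned to phase $i$ when the lower envelope is attained by that function alone, and to the interface set otherwise; the tolerance is an illustrative numerical choice.

```python
import numpy as np

def lem_phases(phis, tol=1e-12):
    """phis: array of shape (m, ...) holding the m functions on a grid.
    Returns the lower envelope and integer phase labels: a point belongs to
    phase i when the envelope equals phi_i only; label -1 marks the
    lower-dimensional interfaces where two or more functions coincide."""
    envelope = phis.min(axis=0)
    on_envelope = np.abs(phis - envelope) <= tol      # which functions attain the minimum
    count = on_envelope.sum(axis=0)
    labels = np.where(count == 1, on_envelope.argmax(axis=0), -1)
    return envelope, labels
```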
For neural networks (NNs) with rectified linear unit (ReLU) or binary activation functions, we show that their training can be accomplished in a reduced parameter space. Specifically, the weights in each neuron can be trained on the unit sphere, as opposed to the entire space, and the threshold can be trained in a bounded interval, as opposed to the real line. We show that the NNs in the reduced parameter space are mathematically equivalent to the standard NNs with parameters in the whole space. The reduced parameter space should facilitate the optimization procedure for network training, as the search space becomes (much) smaller. We demonstrate the improved training performance using numerical examples.
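For the ReLU case, the equivalence underlying the reduced parameter space can be seen from the positive homogeneity of ReLU, as in the small numpy check below; absorbing the factor $\|w\|$ into the next layer's weights is the step that keeps the two networks mathematically equivalent (the random values are only for illustration).

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# A neuron with an unconstrained weight vector and threshold ...
rng = np.random.default_rng(0)
w = rng.normal(size=5)      # unconstrained weight vector
b = rng.normal()            # unconstrained threshold
x = rng.normal(size=5)

# ... matches one whose weights lie on the unit sphere, up to a positive scale:
# ReLU(w.x + b) = ||w|| * ReLU(w_hat.x + b_hat), and ||w|| can be absorbed
# into the weights of the next layer.
norm = np.linalg.norm(w)
w_hat, b_hat = w / norm, b / norm
assert np.isclose(relu(w @ x + b), norm * relu(w_hat @ x + b_hat))
```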