We consider a general nonsymmetric second-order linear elliptic PDE in the framework of the Lax-Milgram lemma. We formulate and analyze an adaptive finite element algorithm with arbitrary polynomial degree that steers the adaptive mesh-refinement and the inexact iterative solution of the arising linear systems. More precisely, the iterative solver employs, as an outer loop, the so-called Zarantonello iteration to symmetrize the system and, as an inner loop, a uniformly contractive algebraic solver, e.g., an optimally preconditioned conjugate gradient method or an optimal geometric multigrid algorithm. We prove that the proposed inexact adaptive iteratively symmetrized finite element method (AISFEM) leads to full linear convergence and, for sufficiently small adaptivity parameters, to optimal convergence rates with respect to the overall computational cost, i.e., the total computational time. Numerical experiments underpin the theory.
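For illustration, the nested solver structure can be sketched at the algebraic level. The following minimal Python sketch is ours, not the paper's implementation: it assumes a discretized nonsymmetric system $Au = f$, uses the symmetric part of $A$ as the symmetrizing operator, and substitutes plain conjugate gradients for the optimally preconditioned CG or multigrid inner solver; `delta` plays the role of the Zarantonello damping parameter.

```python
import numpy as np
from scipy.sparse.linalg import cg

# Sketch of the iteratively symmetrized solve (illustrative, not the paper's code).
# A: nonsymmetric system matrix, f: load vector, delta: Zarantonello damping.
def zarantonello_solve(A, f, delta=1.0, n_outer=100, tol=1e-10):
    S = 0.5 * (A + A.T)                  # SPD symmetrizer (assumes a coercive problem)
    u = np.zeros_like(f)
    for _ in range(n_outer):             # outer Zarantonello loop
        r = f - A @ u                    # residual of the nonsymmetric system
        w, _ = cg(S, r, rtol=1e-8)       # inner contractive algebraic solve S w = r
        u = u + delta * w                # symmetrized update
        if np.linalg.norm(r) <= tol * np.linalg.norm(f):
            break
    return u
```

In the paper's adaptive method, both loops are stopped by computable a posteriori criteria rather than run to a fixed tolerance.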
Entropic optimal transport (EOT) presents an effective and computationally viable alternative to unregularized optimal transport (OT), offering diverse applications for large-scale data analysis. In this work, we derive novel statistical bounds for empirical plug-in estimators of the EOT cost and show that their statistical performance in the entropy regularization parameter $\epsilon$ and the sample size $n$ only depends on the simpler of the two probability measures. For instance, under sufficiently smooth costs this yields the parametric rate $n^{-1/2}$ with factor $\epsilon^{-d/2}$, where $d$ is the minimum dimension of the two population measures. This confirms that empirical EOT also adheres to the lower complexity adaptation principle, a hallmark feature only recently identified for unregularized OT. As a consequence of our theory, we show that the empirical entropic Gromov-Wasserstein distance and its unregularized version for measures on Euclidean spaces also obey this principle. Additionally, we comment on computational aspects and complement our findings with Monte Carlo simulations. Our techniques employ empirical process theory and rely on a dual formulation of EOT over a single function class. Crucial to our analysis is the observation that the entropic cost-transformation of a function class does not increase its uniform metric entropy by much.
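In practice, the plug-in estimator is computed from the empirical measures of the two samples, typically via Sinkhorn iterations. The following Python sketch is our illustration (squared-Euclidean cost, uniform weights), not the paper's code, and returns the transport part of the empirical EOT cost:

```python
import numpy as np

# Sinkhorn iterations for the empirical EOT cost (illustrative sketch).
# X: (n, d) sample from mu, Y: (m, d) sample from nu, eps: regularization.
def empirical_eot_cost(X, Y, eps=0.1, n_iter=1000):
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # squared-Euclidean cost
    K = np.exp(-C / eps)                                 # Gibbs kernel
    a = np.full(len(X), 1.0 / len(X))                    # uniform empirical weights
    b = np.full(len(Y), 1.0 / len(Y))
    u, v = np.ones(len(X)), np.ones(len(Y))
    for _ in range(n_iter):                              # alternating marginal scaling
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]                      # entropic optimal plan
    return (P * C).sum()                                 # transport part of the cost
```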
An implicit variable-step BDF2 scheme is established for solving the space-fractional Cahn-Hilliard equation, involving the fractional Laplacian, derived from a gradient flow in the negative-order Sobolev space $H^{-\alpha}$, $\alpha\in(0,1)$. The Fourier pseudo-spectral method is applied for the spatial approximation. The proposed scheme inherits the energy dissipation law, in the form of a modified discrete energy, under a sufficient restriction on the time-step ratios. Convergence of the fully discrete scheme is rigorously established using a newly proved discrete embedding-type convolution inequality for the fractional Laplacian. Mass conservation and unique solvability are also theoretically guaranteed. Numerical experiments demonstrate both the accuracy and the energy dissipation for various interface widths. In particular, the multiple-time-scale evolution of the solution is captured by an adaptive time-stepping strategy in short-to-long time simulations.
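For reference, a common form of the variable-step BDF2 operator, in our notation with step sizes $\tau_n$ and step ratios $r_n := \tau_n/\tau_{n-1}$, reads
$$
D_2 u^n \;=\; \frac{1+2r_n}{\tau_n(1+r_n)}\,\nabla_\tau u^n \;-\; \frac{r_n^2}{\tau_n(1+r_n)}\,\nabla_\tau u^{n-1}, \qquad \nabla_\tau u^k := u^k - u^{k-1},
$$
which reduces to the standard BDF2 formula $(3u^n - 4u^{n-1} + u^{n-2})/(2\tau)$ for uniform steps; the discrete energy dissipation law is then established under an upper bound on the ratios $r_n$.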
Deep neural networks have received significant attention due to their simplicity and flexibility in engineering and scientific computing. In this work, we solve a class of elliptic PDEs with multiple scales by means of Fourier-based mixed physics-informed neural networks (called FMPINN), whose solver is configured as a multi-scale DNN model. Unlike the classical PINN method, a dual (flux) variable associated with the rough coefficient of the PDE is introduced to avoid the ill-conditioning of the neural tangent kernel matrix caused by the oscillating coefficients of multi-scale PDEs. Therefore, apart from the physical conservation laws, the discrepancy between the auxiliary flux variable and the coefficient-weighted gradient of the solution is incorporated into the cost function, and the optimizer then yields a satisfactory solution of the PDE by minimizing the defined loss. Additionally, a novel trigonometric activation function is introduced for FMPINN, well suited to representing the derivatives of complex target functions. Handling the input data by a Fourier feature mapping effectively improves the capacity of deep neural networks to solve high-frequency problems. Finally, through several numerical examples of multi-scale problems in Euclidean spaces of various dimensions, we validate the efficiency and robustness of the proposed FMPINN algorithm in both low-frequency and high-frequency oscillation cases.
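The Fourier feature mapping mentioned above can be sketched as follows; the sin/cos lift and per-scale random frequency blocks are standard, while the specific scales and sizes here are our own assumptions:

```python
import numpy as np

# Fourier feature mapping: lift inputs x to [cos(2*pi*xB), sin(2*pi*xB)]
# per frequency scale before feeding the multi-scale DNN (illustrative sketch).
def fourier_features(x, n_features=64, scales=(1.0, 10.0, 50.0), seed=0):
    rng = np.random.default_rng(seed)
    d = x.shape[-1]
    blocks = []
    for s in scales:                          # one random frequency block per scale
        B = s * rng.standard_normal((d, n_features))
        z = 2.0 * np.pi * (x @ B)
        blocks += [np.cos(z), np.sin(z)]
    return np.concatenate(blocks, axis=-1)    # width 2 * len(scales) * n_features
```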
We analyze classical stochastic projected gradient methods under a general dependent data-sampling scheme for constrained smooth nonconvex optimization. We show a worst-case convergence rate of $\tilde{O}(t^{-1/4})$ and a complexity of $\tilde{O}(\varepsilon^{-4})$ for achieving an $\varepsilon$-near stationary point, measured by the norm of the gradient of the Moreau envelope and the gradient mapping. While classical convergence guarantees require i.i.d. data sampling from the target distribution, we require only a mild mixing condition on the conditional distribution, which holds for a wide class of Markov chain sampling algorithms. This improves the existing complexity for constrained smooth nonconvex optimization with dependent data from $\tilde{O}(\varepsilon^{-8})$ to $\tilde{O}(\varepsilon^{-4})$, with a significantly simpler analysis. We illustrate the generality of our approach by deriving convergence results with dependent data for stochastic proximal gradient methods, the adaptive stochastic gradient algorithm AdaGrad, and the stochastic gradient algorithm with heavy-ball momentum. As an application, we obtain the first online nonnegative matrix factorization algorithms for dependent data based on stochastic projected gradient methods with adaptive step sizes and an optimal rate of convergence.
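The analyzed method itself is simple; a minimal sketch in our notation, where `grad(x, xi)` is a stochastic gradient at `x` given sample `xi`, `proj` the Euclidean projection onto the constraint set, and `step` one transition of the Markov chain generating the data:

```python
import numpy as np

# Stochastic projected gradient with Markovian (dependent) data: sketch.
def projected_sgd_markov(x0, xi0, grad, proj, step, T, eta0=0.1):
    x, xi = x0, xi0
    for t in range(1, T + 1):
        eta = eta0 / np.sqrt(t)            # diminishing step size
        x = proj(x - eta * grad(x, xi))    # projected stochastic gradient step
        xi = step(xi)                      # next dependent sample from the chain
    return x
```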
The equilibrium configuration of a plasma in an axially symmetric reactor is described mathematically by a free boundary problem associated with the celebrated Grad--Shafranov equation. The presence of uncertainty in the model parameters introduces the need to quantify the variability in the predictions. This is often done by computing a large number of model solutions on a computational grid for an ensemble of parameter values and then obtaining estimates for the statistical properties of solutions. In this study, we explore the savings that can be obtained using multilevel Monte Carlo methods, which reduce costs by performing the bulk of the computations on a sequence of spatial grids that are coarser than the one that would typically be used for a simple Monte Carlo simulation. We examine this approach using both a set of uniformly refined grids and a set of adaptively refined grids guided by a discrete error estimator. Numerical experiments show that multilevel methods dramatically reduce the cost of simulation, with cost reductions typically by factors on the order of 60 or more, and possibly as large as 200. Adaptive gridding results in more accurate computation of geometric quantities such as x-points associated with the model.
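The multilevel estimator rests on the telescoping identity $\mathbb{E}[Q_L] = \mathbb{E}[Q_0] + \sum_{\ell=1}^{L}\mathbb{E}[Q_\ell - Q_{\ell-1}]$, with most samples taken on the cheap coarse grids. A minimal sketch, assuming a user-supplied `solve(level, omega)` that returns the quantity of interest on grid `level` for parameter sample `omega`:

```python
import numpy as np

# Multilevel Monte Carlo estimator (sketch): telescoping sum of level corrections.
def mlmc_estimate(solve, sample, n_samples):       # n_samples[l]: samples on level l
    est = 0.0
    for level, N in enumerate(n_samples):
        draws = [sample() for _ in range(N)]
        if level == 0:
            corr = [solve(0, w) for w in draws]
        else:                                      # coupled coarse/fine correction
            corr = [solve(level, w) - solve(level - 1, w) for w in draws]
        est += np.mean(corr)
    return est
```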
In this paper, we develop sixth-order hybrid finite difference methods (FDMs) for the elliptic interface problem $-\nabla \cdot( a\nabla u)=f$ in $\Omega\backslash \Gamma$, where $\Gamma$ is a smooth interface inside $\Omega$. The variable scalar coefficient $a>0$ and the source $f$ are possibly discontinuous across $\Gamma$. The hybrid FDMs utilize a 9-point compact stencil at any interior regular point of the grid and a 13-point stencil at irregular points near $\Gamma$. For interior regular points away from $\Gamma$, we obtain a sixth-order 9-point compact FDM satisfying the M-matrix property. Consequently, for the elliptic problem without interface (i.e., $\Gamma$ is empty), our compact FDM satisfies the discrete maximum principle, which guarantees the theoretical sixth-order convergence. We also derive sixth-order compact FDMs (4-point for corners and 6-point for edges) having the M-matrix property at any boundary point subject to (mixed) Dirichlet/Neumann/Robin boundary conditions. For irregular points near $\Gamma$, we propose fifth-order 13-point FDMs whose stencil coefficients can be effectively calculated by recursively solving several small linear systems. Theoretically, the proposed high-order FDMs use high-order (partial) derivatives of the coefficient $a$, the source term $f$, the interface curve $\Gamma$, the two jump functions along $\Gamma$, and the functions on $\partial \Omega$. Numerically, we always use function values to approximate all required high-order (partial) derivatives in our hybrid FDMs without losing accuracy. Our proposed FDMs are independent of the choice of representation of $\Gamma$ and are also applicable if the jump conditions on $\Gamma$ depend only on the geometry (e.g., curvature) of the curve $\Gamma$. Our numerical experiments confirm the sixth-order convergence in the $l_{\infty}$ norm of the proposed hybrid FDMs for the elliptic interface problem.
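Schematically, a 9-point compact stencil at an interior grid point $(x_i,y_j)$ with mesh size $h$ couples only the point and its eight immediate neighbors,
$$
\sum_{k=-1}^{1}\sum_{\ell=-1}^{1} C_{k,\ell}\, u_h(x_i+kh,\, y_j+\ell h) \;=\; F_{i,j},
$$
where the coefficients $C_{k,\ell}$ and the right-hand side $F_{i,j}$ (which aggregates derivative data of $f$) are determined by the sixth-order consistency and M-matrix requirements; this display only fixes notation and is not the scheme itself.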
Markov chain Monte Carlo (MCMC) algorithms have played a significant role in statistics, physics, machine learning, and other fields, and they are the only known general and efficient approach for some high-dimensional problems. The random walk Metropolis (RWM) algorithm, the most classical MCMC algorithm, has had a great influence on the development and practice of science and engineering. The behavior of the RWM algorithm in high-dimensional problems is typically investigated through a weak convergence result for diffusion processes. In this paper, we utilize the Mosco convergence of Dirichlet forms to analyze the RWM algorithm on large graphs, whose target distribution is a Gibbs measure, a notion that includes any probability measure satisfying a Markov property. The abstract and powerful theory of Dirichlet forms allows us to work directly and naturally on the infinite-dimensional space, and our notion of Mosco convergence allows the Dirichlet forms associated with the RWM chains to lie on changing Hilbert spaces. Through the optimal scaling problem, we demonstrate the impressive strengths of the Dirichlet form approach over the standard diffusion approach.
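For concreteness, a minimal Python sketch of the RWM algorithm (ours, with a Gaussian proposal; `log_pi` is the log-density of the target Gibbs measure and `sigma` the proposal scale that the optimal scaling problem tunes):

```python
import numpy as np

# Random walk Metropolis (illustrative sketch).
def rwm(log_pi, x0, sigma, T, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    chain = [x.copy()]
    for _ in range(T):
        y = x + sigma * rng.standard_normal(x.shape)      # random walk proposal
        if np.log(rng.random()) < log_pi(y) - log_pi(x):  # Metropolis accept/reject
            x = y
        chain.append(x.copy())
    return np.array(chain)
```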
We give improved and almost optimal testers for several classes of Boolean functions on $n$ inputs that have concise representations in the uniform and distribution-free models. These classes include $k$-juntas, $k$-linear functions, $s$-term DNFs, $s$-term monotone DNFs, $r$-DNFs, decision lists, $r$-decision lists, size-$s$ decision trees, size-$s$ Boolean formulas, size-$s$ branching programs, $s$-sparse polynomials over the binary field, and functions with Fourier degree at most $d$. The method can be extended to several other classes of functions over any domain that can be approximated by functions with a small number of relevant variables.
Physics-based optical flow models have been successful in capturing deformations in fluid motion from digital imagery. However, a common theoretical framework for analyzing several physics-based models is missing. In this regard, we formulate a general framework for fluid motion estimation using a constraint-based refinement approach. We demonstrate that, for a particular choice of constraint, our results closely approximate the classical continuity equation-based method for fluid flow. This closeness is theoretically justified in a novel way via the augmented Lagrangian method. The convergence of the Uzawa iterates is shown using a modified bounded constraint algorithm. The mathematical well-posedness is studied in a Hilbert space setting. Further, we observe a surprising connection to the Cauchy-Riemann operator, which diagonalizes the system, leading to a diffusive phenomenon involving the divergence and the curl of the flow. Several numerical experiments are performed and the results are shown on different datasets. Additionally, we demonstrate that a flow-driven refinement process involving the curl of the flow outperforms the classical physics-based optical flow method without any additional assumptions on the image data.
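The Uzawa iteration referenced above can be sketched on a generic equality-constrained quadratic model, our stand-in for the constrained refinement functional: minimize $\tfrac12 x^\top A x - b^\top x$ subject to $Bx = c$, with $A$ symmetric positive definite.

```python
import numpy as np

# Uzawa iteration for min_x 0.5 x^T A x - b^T x  s.t.  B x = c  (sketch).
def uzawa(A, B, b, c, alpha=0.5, n_iter=200):
    lam = np.zeros(B.shape[0])                  # Lagrange multiplier estimate
    for _ in range(n_iter):
        x = np.linalg.solve(A, b - B.T @ lam)   # primal step: minimize Lagrangian in x
        lam = lam + alpha * (B @ x - c)         # dual ascent on constraint residual
    return x, lam
```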
Prediction-oriented machine learning is becoming increasingly valuable to organizations, as it may drive applications in crucial business areas. However, decision-makers from companies across various industries are still largely reluctant to employ applications based on modern machine learning algorithms. We ascribe this issue to the widely held view of advanced machine learning algorithms as "black boxes" whose complexity does not allow for uncovering the factors that drive the output of a corresponding system. To help overcome this adoption barrier, we argue that research in information systems should devote more attention to the design of prototypical prediction-oriented machine learning applications (i.e., artifacts) whose predictions can be explained to human decision-makers. However, despite the recent emergence of a variety of tools that facilitate the development of such artifacts, there has so far been little research on their development. We attribute this research gap to the lack of methodological guidance to support the creation of these artifacts. For this reason, we develop a methodology that unifies methodological knowledge from design science research and predictive analytics with state-of-the-art approaches to explainable artificial intelligence. Moreover, we showcase the methodology using the example of price prediction in the sharing economy (i.e., on Airbnb).