In this article, we provide stability estimates for the finite element discretization of a class of inverse parameter problems of the form $-\nabla\cdot(\mu S) = \boldsymbol{f}$ in a domain $\Omega$ of $\mathbb{R}^d$. Here $\mu$ is the unknown parameter to recover, while the matrix-valued function $S$ and the vector-valued distribution $\boldsymbol{f}$ are known. As uniqueness is not guaranteed in general for this problem, we prove a Lipschitz-type stability estimate in a hyperplane of $L^2(\Omega)$. This stability is obtained through an adaptation of the so-called discrete \emph{inf-sup} constant, or LBB constant, to a large class of first-order differential operators. We then provide a simple and original discretization based on hexagonal finite elements that satisfies the discrete stability condition, and we show corresponding numerical reconstructions. The resulting algebraic inversion method is efficient, as it does not require any iterative solving of the forward problem, and very general, as it requires neither smoothness hypotheses on the data nor additional information at the boundary.
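For orientation, the classical discrete inf-sup (LBB) condition takes the following generic form (our notation, not necessarily the paper's): for a bilinear form $b$ associated with the first-order operator and discrete spaces $V_h$ and $M_h$,

\[ \inf_{\mu_h \in M_h \setminus \{0\}} \; \sup_{v_h \in V_h \setminus \{0\}} \frac{b(v_h, \mu_h)}{\|v_h\|_{V_h}\,\|\mu_h\|_{L^2(\Omega)}} \;\ge\; \beta_h > 0, \]

with stability of the reconstruction requiring $\beta_h$ to stay bounded away from zero as the mesh is refined.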
A lattice is a partially ordered set in which every pair of elements has a unique least upper bound and a unique greatest lower bound within the set. K\"{o}tter and Kschischang proved that codes in the linear lattice can be used for error and erasure correction in random networks. Codes in the linear lattice have previously been shown to be special cases of codes in modular lattices. Two well-known classes of semimodular lattices are geometric and distributive lattices, and most frequently used coding spaces are examples of one or both. We identify the unique criterion that makes a geometric lattice distributive, thus characterizing all finite geometric distributive lattices. Our characterization helps to prove a conjecture regarding the maximum size of a distributive sublattice of a finite geometric lattice and to identify the maximal case. The Whitney numbers of the class of geometric distributive lattices are also calculated. We present a few other applications of this characterization to derive results regarding linearity and complements in the linear lattice.
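For orientation (a standard fact, not a statement of the paper's criterion): geometric lattices are complemented, and a complemented distributive lattice is a Boolean algebra; hence the prototype that is simultaneously geometric and distributive is the Boolean lattice of subsets,

\[ B_n = \bigl(2^{[n]}, \subseteq\bigr), \qquad \operatorname{rank}(A) = |A|. \]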
We investigate the stability of quasi-stationary distributions of killed Markov processes under perturbations of the generator. In the first setting, we consider a general bounded self-adjoint perturbation operator; in the second, we study a particular unbounded perturbation corresponding to a truncation of the killing rate. In both scenarios, we quantify the difference between the eigenfunctions associated with the smallest eigenvalues of the perturbed and unperturbed generators in a Hilbert space norm. As a consequence, $\mathcal{L}^1$-norm estimates of the difference between the resulting quasi-stationary distributions, in terms of the perturbation, are provided. These results are particularly relevant to the recently proposed class of quasi-stationary Monte Carlo methods, designed for scalable exact Bayesian inference.
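Recall the standard definition underlying these results: a probability measure $\mu$ is a quasi-stationary distribution of a killed Markov process $(X_t)$ with killing time $\tau$ if, when the process is initialized from $\mu$,

\[ \mathbb{P}_{\mu}\bigl(X_t \in A \mid \tau > t\bigr) = \mu(A) \qquad \text{for all measurable } A \text{ and all } t \ge 0. \]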
We introduce the multivariate decomposition finite element method (MDFEM) for elliptic PDEs with lognormal diffusion coefficient $a=\exp(Z)$, where $Z$ is a Gaussian random field defined by an infinite series expansion $Z(\boldsymbol{y}) = \sum_{j\ge1} y_j\,\phi_j$ with $y_j\sim\mathcal{N}(0,1)$ and a given sequence of functions $\{\phi_j\}_{j\ge1}$. We use the MDFEM to approximate the expected value of a linear functional of the solution of the PDE, which is an infinite-dimensional integral over the parameter space. The proposed algorithm uses the multivariate decomposition method (MDM) to compute the infinite-dimensional integral by a decomposition into finite-dimensional integrals, which we resolve using quasi-Monte Carlo (QMC) methods, and for which we use the finite element method (FEM) to solve different instances of the PDE. We develop higher-order quasi-Monte Carlo rules for integration over finite-dimensional Euclidean space with respect to the Gaussian distribution by means of a truncation strategy. By linear transformations of interlaced polynomial lattice rules from the unit cube to a multivariate box of the Euclidean space, we achieve higher-order convergence rates for functions belonging to a class of anchored Gaussian Sobolev spaces, taking into account the truncation error. Under appropriate conditions, the MDFEM achieves higher-order convergence rates in terms of error versus cost: to achieve an accuracy of $O(\epsilon)$, the computational cost is $O(\epsilon^{-1/\lambda-d'/\lambda}) = O(\epsilon^{-(p^*+d'/\tau)/(1-p^*)})$, where $\epsilon^{-1/\lambda}$ and $\epsilon^{-d'/\lambda}$ are, respectively, the cost of the quasi-Monte Carlo cubature and of the finite element approximations, with $d' = d \, (1+\delta')$ for some $\delta' \ge 0$ and $d$ the physical dimension, and where $0 < p^* \le (2+d'/\tau)^{-1}$ is a parameter representing the sparsity of $\{\phi_j\}_{j\ge1}$.
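Schematically (in notation that may differ from the paper's), the MDM writes the infinite-dimensional integral as a sum of finite-dimensional integrals via a decomposition

\[ I(F) = \sum_{\substack{\mathfrak{u} \subset \mathbb{N} \\ |\mathfrak{u}| < \infty}} I_{\mathfrak{u}}(F_{\mathfrak{u}}), \]

truncated to a finite active set of subsets $\mathfrak{u}$; each $I_{\mathfrak{u}}(F_{\mathfrak{u}})$ is then approximated by a QMC rule combined with FEM solves whose accuracy is matched to the importance of $\mathfrak{u}$.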
We consider a stochastic version of the proximal point algorithm for optimization problems posed on a Hilbert space, a typical application being supervised learning. While the method is not new, it has not been extensively analyzed in this form. Indeed, most related results are confined to the finite-dimensional setting, where error bounds may depend on the dimension of the space. On the other hand, the few existing results in the infinite-dimensional setting prove only very weak types of convergence, owing to weak assumptions on the problem; in particular, no results establish convergence with a rate. In this article, we bridge these two worlds by assuming more regularity of the optimization problem, which allows us to prove convergence with an (optimal) sub-linear rate even in the infinite-dimensional setting. In particular, we assume that the objective function is the expected value of a family of convex, differentiable functions; while we require the full objective function to be strongly convex, we do not assume the same of its constituent parts. Further, we require that the gradient satisfy a weak local Lipschitz continuity property, in which the Lipschitz constant may grow polynomially, subject to certain bounds on the variance and higher moments near the minimum. We illustrate these results by discretizing a concrete infinite-dimensional classification problem with varying degrees of accuracy.
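As a toy illustration (not from the paper), consider least-squares losses $f_i(x) = \tfrac12 (a_i^\top x - y_i)^2$, for which the stochastic proximal point update $x_{k+1} = \arg\min_z f_{i_k}(z) + \tfrac{1}{2\gamma_k}\|z - x_k\|^2$ has a closed form. A minimal finite-dimensional sketch, with all data synthetic:

    import numpy as np

    rng = np.random.default_rng(0)
    d, n = 20, 1000
    A = rng.standard_normal((n, d))                 # feature vectors a_i (rows)
    x_true = rng.standard_normal(d)
    b = A @ x_true + 0.1 * rng.standard_normal(n)   # noisy labels y_i

    x = np.zeros(d)
    for k in range(5000):
        i = rng.integers(n)
        a, y = A[i], b[i]
        gamma = 1.0 / np.sqrt(k + 1.0)              # diminishing step size
        # Proximal step for f_i(z) = 0.5*(a.z - y)^2: minimizing
        # f_i(z) + ||z - x||^2 / (2*gamma) gives, via Sherman-Morrison,
        # z = x - gamma*(a.x - y) / (1 + gamma*a.a) * a.
        r = a @ x - y
        x = x - (gamma * r / (1.0 + gamma * (a @ a))) * a

    print(np.linalg.norm(x - x_true))               # should be small

Unlike stochastic gradient descent, each update solves a small implicit problem exactly, which is what makes the method robust to large step sizes.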
In this paper, we propose a multirate iterative scheme with a multiphysics finite element method for a fluid-saturated poroelasticity model. First, we reformulate the original model as a fluid-coupled problem to apply the multiphysics finite element method for the discretization of the space variables, and we design a multirate iterative scheme in time that solves a generalized Stokes problem with a coarse time step and the diffusion problem with a finer time step, in accordance with the characteristics of the poroelasticity problem. Second, we prove that the multirate iterative scheme is stable and that the numerical solution satisfies certain energy conservation laws, which are important to ensure the uniqueness of the solution of the decoupled problem. We also derive error estimates showing that the proposed method does not degrade the accuracy of the numerical solution while greatly reducing the computational cost. Finally, we present numerical tests that verify the theoretical results and conclude with a summary of the main results of the paper.
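The multirate structure can be sketched as follows; this is purely schematic, with small random SPD matrices standing in for the discrete generalized Stokes and diffusion operators, and is not the paper's scheme:

    import numpy as np

    # Schematic multirate loop: one coarse (coupled, Stokes-like) solve per
    # coarse step, m fine (diffusion-like) solves in between.
    rng = np.random.default_rng(1)
    n = 30
    M = np.eye(n)                                   # mass-matrix stand-in
    K = rng.standard_normal((n, n))
    K = K @ K.T + n * np.eye(n)                     # SPD stiffness stand-in

    T, N_coarse, m = 1.0, 10, 4                     # m fine steps per coarse step
    dT = T / N_coarse
    dt = dT / m

    u = rng.standard_normal(n)                      # displacement-like unknown
    p = rng.standard_normal(n)                      # pressure-like unknown
    for _ in range(N_coarse):
        # coarse step: implicit solve of the coupled problem
        u = np.linalg.solve(M + dT * K, M @ u + dT * p)
        for _ in range(m):
            # fine steps: advance only the diffusion subproblem
            p = np.linalg.solve(M + dt * K, M @ p)

The point of the construction is that the expensive coupled solve occurs only once per coarse step, while the cheap diffusion solve is refined in time.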
For optimal control problems constrained by an initial-value parabolic PDE, one has to solve a large-scale algebraic saddle-point system that couples all discrete space and time points. A popular strategy for handling such a system is a Krylov subspace method, for which an efficient preconditioner plays a crucial role. The matching-Schur-complement preconditioner has been extensively studied in the literature, and its implementation amounts to solving the underlying PDEs twice, sequentially in time. In this paper, we propose a new preconditioner for the Schur complement, which can be applied parallel-in-time (PinT) via the so-called diagonalization technique. We show that the eigenvalues of the preconditioned matrix are bounded from below and above by positive constants independent of the matrix size and the regularization parameter. The uniform boundedness of the eigenvalues leads to an optimal linear convergence rate of the conjugate gradient solver for the preconditioned Schur complement system. To the best of our knowledge, this is the first optimal convergence analysis for a PinT preconditioning technique for such optimal control problems. Numerical results show that the performance of the proposed preconditioner is robust with respect to the discretization step sizes and the regularization parameter.
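As a generic illustration of the diagonalization technique (not the paper's exact preconditioner), an $\alpha$-circulant approximation $C_\alpha$ of the backward-Euler time-stepping matrix factorizes as $C_\alpha = D_\alpha^{-1} F^{-1} \Lambda F D_\alpha$ with $D_\alpha = \mathrm{diag}(\alpha^{k/n})$ and $\Lambda = \mathrm{diag}(F D_\alpha c)$, so it can be inverted in three FFT-parallel steps. A sketch for a single spatial mode; all names and values are illustrative:

    import numpy as np

    def alpha_circulant_solve(c, r, alpha):
        # Solve C_alpha x = r, where C_alpha is the alpha-circulant matrix
        # with first column c.  Every step below is diagonal or an FFT,
        # hence parallelizable across the n time points.
        n = len(c)
        d = alpha ** (np.arange(n) / n)        # diagonal of D_alpha
        eigvals = np.fft.fft(c * d)            # eigenvalues of C_alpha
        y = np.fft.fft(r * d)                  # step 1: transform
        y = y / eigvals                        # step 2: diagonal solve
        return np.real(np.fft.ifft(y) / d)     # step 3: transform back

    # Backward Euler for u' = -lam*u: the time-stepping matrix has first
    # column (1 + tau*lam, -1, 0, ..., 0); C_alpha replaces its top-right
    # corner entry by -alpha.
    n, tau, lam, alpha = 64, 0.01, 1.0, 0.02
    c = np.zeros(n)
    c[0], c[1] = 1.0 + tau * lam, -1.0
    rhs = np.random.default_rng(0).standard_normal(n)
    u = alpha_circulant_solve(c, rhs, alpha)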
Under some regularity assumptions, we present an a priori error analysis of a dG scheme for the Poisson and Stokes problems in their dual mixed formulation. Both formulations satisfy a Babu\v{s}ka-Brezzi-type condition within the space $H(\mathrm{div}) \times L^2$. It is well known that the lowest-order Crouzeix-Raviart element paired with piecewise constants satisfies such a condition on (broken) $H^1 \times L^2$ spaces. In the present article, we use this pair. The continuity of the normal component is weakly imposed by penalizing jumps of the broken $H(\mathrm{div})$ component. For the resulting methods, we prove well-posedness and convergence with constants independent of the data and the mesh size. We derive error estimates in the methods' natural norms and optimal local error estimates for the divergence error. In fact, our finite element solution shares one degree of freedom per triangle with the Crouzeix-Raviart interpolant, and the divergence is locally the best approximation for any regularity. Numerical experiments support these findings and suggest that the other errors also converge optimally, even for lowest-regularity solutions and a crack problem, as long as the crack is resolved by the mesh.
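A typical penalization of normal jumps in such dG methods takes the form (the paper's exact weights may differ)

\[ s_h(\boldsymbol{\sigma}_h, \boldsymbol{\tau}_h) = \sum_{F \in \mathcal{F}_h^{\mathrm{int}}} \frac{\eta}{h_F} \int_F [\![ \boldsymbol{\sigma}_h \cdot \boldsymbol{n} ]\!]\, [\![ \boldsymbol{\tau}_h \cdot \boldsymbol{n} ]\!] \, \mathrm{d}s, \]

which weakly restores the normal continuity that characterizes $H(\mathrm{div})$-conforming functions.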
In the Bayesian reinforcement learning (RL) setting, a prior distribution over the unknown problem parameters -- the rewards and transitions -- is assumed, and a policy that optimizes the (posterior) expected return is sought. A common approximation, which has been recently popularized as meta-RL, is to train the agent on a sample of $N$ problem instances from the prior, with the hope that for large enough $N$, good generalization behavior to an unseen test instance will be obtained. In this work, we study generalization in Bayesian RL under the probably approximately correct (PAC) framework, using the method of algorithmic stability. Our main contribution is showing that by adding regularization, the optimal policy becomes stable in an appropriate sense. Most stability results in the literature build on strong convexity of the regularized loss -- an approach that is not suitable for RL as Markov decision processes (MDPs) are not convex. Instead, building on recent results of fast convergence rates for mirror descent in regularized MDPs, we show that regularized MDPs satisfy a certain quadratic growth criterion, which is sufficient to establish stability. This result, which may be of independent interest, allows us to study the effect of regularization on generalization in the Bayesian RL setting.
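Concretely, a quadratic growth criterion takes the following form (our notation): for the regularized objective $\mathcal{L}$ with optimum $\pi^{*}$,

\[ \mathcal{L}(\pi) - \mathcal{L}(\pi^{*}) \;\ge\; \frac{c}{2}\, \|\pi - \pi^{*}\|^{2} \qquad \text{for all policies } \pi \text{ and some } c > 0, \]

which weakens strong convexity yet still yields the sensitivity bounds needed for algorithmic stability.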
Mixtures of product distributions are a powerful device for learning about heterogeneity within data populations. In this class of latent structure models, de Finetti's mixing measure plays the central role in describing the uncertainty about the latent parameters representing heterogeneity. In this paper, posterior contraction theorems for de Finetti's mixing measure arising from finite mixtures of product distributions are established, in the setting where the number of exchangeable sequences of observed variables increases while the sequence lengths may be either fixed or varying. The roles of both the number of sequences and the sequence lengths are carefully examined. To obtain concrete rates of convergence, a first-order identifiability theory for finite mixture models and a family of sharp inverse bounds for mixtures of product distributions are developed via a harmonic analysis of such latent structure models. This theory is applicable to broad classes of probability kernels composing the mixture of product distributions, for both continuous and discrete domains $\mathfrak{X}$. Examples of interest include the case where the probability kernel is only weakly identifiable in the sense of Ho and Nguyen (2016), the case where the kernel is itself a mixture distribution as in hierarchical models, and the case where the kernel may not have a density with respect to a dominating measure on an abstract domain $\mathfrak{X}$, such as Dirichlet processes.
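For concreteness, with $K$ mixture components and an exchangeable sequence of length $N$, a finite mixture of product distributions has the form (standard notation)

\[ p(x_1, \dots, x_N) = \sum_{k=1}^{K} w_k \prod_{j=1}^{N} f(x_j \mid \theta_k), \]

and de Finetti's mixing measure $G = \sum_{k=1}^{K} w_k\, \delta_{\theta_k}$ is the object whose posterior contraction is analyzed.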
While Generative Adversarial Networks (GANs) have empirically produced impressive results on learning complex real-world distributions, recent work has shown that they suffer from a lack of diversity, or mode collapse. The theoretical work of Arora et al.~\cite{AroraGeLiMaZh17} suggests a dilemma about GANs' statistical properties: powerful discriminators cause overfitting, whereas weak discriminators cannot detect mode collapse. In contrast, we show in this paper that GANs can in principle learn distributions in Wasserstein distance (or KL divergence in many cases) with polynomial sample complexity, provided the discriminator class has strong distinguishing power against the particular generator class (instead of against all possible generators). For various generator classes such as mixtures of Gaussians, exponential families, and invertible neural network generators, we design corresponding discriminators (often neural nets of specific architectures) such that the Integral Probability Metric (IPM) induced by the discriminators provably approximates the Wasserstein distance and/or KL divergence. This implies that if training succeeds, then the learned distribution is close to the true distribution in Wasserstein distance or KL divergence, and thus cannot drop modes. Our preliminary experiments show that on synthetic datasets the test IPM is well correlated with the KL divergence, indicating that the observed lack of diversity may be caused by sub-optimality of the optimization rather than statistical inefficiency.
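Recall that the IPM induced by a discriminator class $\mathcal{F}$ is

\[ d_{\mathcal{F}}(p, q) = \sup_{f \in \mathcal{F}} \bigl| \mathbb{E}_{x \sim p}\, f(x) - \mathbb{E}_{x \sim q}\, f(x) \bigr|, \]

so choosing $\mathcal{F}$ with strong distinguishing power against a given generator class makes $d_{\mathcal{F}}$ comparable to the Wasserstein distance or KL divergence on that class.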