We propose to approximate a (possibly discontinuous) multivariate function f (x) on a compact set by the partial minimizer arg miny p(x, y) of an appropriate polynomial p whose construction can be cast in a univariate sum of squares (SOS) framework, resulting in a highly structured convex semidefinite program. In a number of non-trivial cases (e.g. when f is a piecewise polynomial) we prove that the approximation is exact with a low-degree polynomial p. Our approach has three distinguishing features: (i) It is mesh-free and does not require the knowledge of the discontinuity locations. (ii) It is model-free in the sense that we only assume that the function to be approximated is available through samples (point evaluations). (iii) The size of the semidefinite program is independent of the ambient dimension and depends linearly on the number of samples. We also analyze the sample complexity of the approach, proving a generalization error bound in a probabilistic setting. This allows for a comparison with machine learning approaches.
Functions with singularities are notoriously difficult to approximate with conventional approximation schemes. In computational applications they are often resolved with low-order piecewise polynomials, multilevel schemes or other types of grading strategies. Rational functions are an exception to this rule: for univariate functions with point singularities, such as branch points, rational approximations exist with root-exponential convergence in the rational degree. This is typically enabled by the clustering of poles near the singularity. Both the theory and computational practice of rational functions for function approximation have focused on the univariate case, with extensions to two dimensions via identification with the complex plane. Multivariate rational functions, i.e., quotients of polynomials of several variables, are relatively unexplored in comparison. Yet, apart from a steep increase in theoretical complexity, they also offer a wealth of opportunities. A first observation is that singularities of multivariate rational functions may be continuous curves of poles, rather than isolated ones. By generalizing the clustering of poles from points to curves, we explore constructions of multivariate rational approximations to functions with curves of singularities.
While generalized linear mixed models (GLMMs) are a fundamental tool in applied statistics, many specifications -- such as those involving categorical factors with many levels or interaction terms -- can be computationally challenging to estimate due to the need to compute or approximate high-dimensional integrals. Variational inference (VI) methods are a popular way to perform such computations, especially in the Bayesian context. However, naive VI methods can provide unreliable uncertainty quantification. We show that this is indeed the case in the GLMM context, proving that standard VI (i.e. mean-field) dramatically underestimates posterior uncertainty in high-dimensions. We then show how appropriately relaxing the mean-field assumption leads to VI methods whose uncertainty quantification does not deteriorate in high-dimensions, and whose total computational cost scales linearly with the number of parameters and observations. Our theoretical and numerical results focus on GLMMs with Gaussian or binomial likelihoods, and rely on connections to random graph theory to obtain sharp high-dimensional asymptotic analysis. We also provide generic results, which are of independent interest, relating the accuracy of variational inference to the convergence rate of the corresponding coordinate ascent variational inference (CAVI) algorithm for Gaussian targets. Our proposed partially-factorized VI (PF-VI) methodology for GLMMs is implemented in the R package vglmer, see //github.com/mgoplerud/vglmer . Numerical results with simulated and real data examples illustrate the favourable computation cost versus accuracy trade-off of PF-VI.
We propose, analyze, and test new iterative solvers for large-scale systems of linear algebraic equations arising from the finite element discretization of reduced optimality systems defining the finite element approximations to the solution of elliptic tracking-type distributed optimal control problems with both the standard $L_2$ and the more general energy regularizations. If we aim at an approximation of the given desired state $y_d$ by the computed finite element state $y_h$ that asymptotically differs from $y_d$ in the order of the best $L_2$ approximation under acceptable costs for the control, then the optimal choice of the regularization parameter $\varrho$ is linked to the mesh-size $h$ by the relations $\varrho=h^4$ and $\varrho=h^2$ for the $L_2$ and the energy regularization, respectively. For this setting, we can construct efficient parallel iterative solvers for the reduced finite element optimality systems. These results can be generalized to variable regularization parameters adapted to the local behavior of the mesh-size that can heavily change in case of adaptive mesh refinement. Similar results can be obtained for the space-time finite element discretization of the corresponding parabolic and hyperbolic optimal control problems.
We derive and analyze a symmetric interior penalty discontinuous Galerkin scheme for the approximation of the second-order form of the radiative transfer equation in slab geometry. Using appropriate trace lemmas, the analysis can be carried out as for more standard elliptic problems. Supporting examples show the accuracy and stability of the method also numerically, for different polynomial degrees. For discretization, we employ quad-tree grids, which allow for local refinement in phase-space, and we show exemplary that adaptive methods can efficiently approximate discontinuous solutions. We investigate the behavior of hierarchical error estimators and error estimators based on local averaging.
Partial differential equations (PDEs) with uncertain or random inputs have been considered in many studies of uncertainty quantification. In forward uncertainty quantification, one is interested in analyzing the stochastic response of the PDE subject to input uncertainty, which usually involves solving high-dimensional integrals of the PDE output over a sequence of stochastic variables. In practical computations, one typically needs to discretize the problem in several ways: approximating an infinite-dimensional input random field with a finite-dimensional random field, spatial discretization of the PDE using, e.g., finite elements, and approximating high-dimensional integrals using cubatures such as quasi-Monte Carlo methods. In this paper, we focus on the error resulting from dimension truncation of an input random field. We show how Taylor series can be used to derive theoretical dimension truncation rates for a wide class of problems and we provide a simple checklist of conditions that a parametric mathematical model needs to satisfy in order for our dimension truncation error bound to hold. Some of the novel features of our approach include that our results are applicable to non-affine parametric operator equations, dimensionally-truncated conforming finite element discretized solutions of parametric PDEs, and even compositions of PDE solutions with smooth nonlinear quantities of interest. As a specific application of our method, we derive an improved dimension truncation error bound for elliptic PDEs with lognormally parameterized diffusion coefficients. Numerical examples support our theoretical findings.
We revisit the general framework introduced by Fazylab et al. (SIAM J. Optim. 28, 2018) to construct Lyapunov functions for optimization algorithms in discrete and continuous time. For smooth, strongly convex objective functions, we relax the requirements necessary for such a construction. As a result we are able to prove for Polyak's ordinary differential equations and for a two-parameter family of Nesterov algorithms rates of convergence that improve on those available in the literature. We analyse the interpretation of Nesterov algorithms as discretizations of the Polyak equation. We show that the algorithms are instances of Additive Runge-Kutta integrators and discuss the reasons why most discretizations of the differential equation do not result in optimization algorithms with acceleration. We also introduce a modification of Polyak's equation and study its convergence properties. Finally we extend the general framework to the stochastic scenario and consider an application to random algorithms with acceleration for overparameterized models; again we are able to prove convergence rates that improve on those in the literature.
This work puts forth low-complexity Riemannian subspace descent algorithms for the minimization of functions over the symmetric positive definite (SPD) manifold. Different from the existing Riemannian gradient descent variants, the proposed approach utilizes carefully chosen subspaces that allow the update to be written as a product of the Cholesky factor of the iterate and a sparse matrix. The resulting updates avoid the costly matrix operations like matrix exponentiation and dense matrix multiplication, which are generally required in almost all other Riemannian optimization algorithms on SPD manifold. We further identify a broad class of functions, arising in diverse applications, such as kernel matrix learning, covariance estimation of Gaussian distributions, maximum likelihood parameter estimation of elliptically contoured distributions, and parameter estimation in Gaussian mixture model problems, over which the Riemannian gradients can be calculated efficiently. The proposed uni-directional and multi-directional Riemannian subspace descent variants incur per-iteration complexities of $O(n)$ and $O(n^2)$ respectively, as compared to the $O(n^3)$ or higher complexity incurred by all existing Riemannian gradient descent variants. The superior runtime and low per-iteration complexity of the proposed algorithms is also demonstrated via numerical tests on large-scale covariance estimation and matrix square root problems. MATLAB code implementation is publicly available on GitHub : //github.com/yogeshd-iitk/subspace_descent_over_SPD_manifold
By incorporating a new matrix splitting and the momentum acceleration into the relaxed-based matrix splitting (RMS) method \cite{soso2023}, a generalization of the RMS (GRMS) iterative method for solving the generalized absolute value equations (GAVEs) is proposed. Unlike some existing methods, by using the Cauchy's convergence principle, we give some sufficient conditions for the existence and uniqueness of the solution to the GAVEs and prove that our method can converge to the unique solution of the GAVEs. Moreover, we can obtain a few new and weaker convergence conditions for some existing methods. Preliminary numerical experiments show that the proposed method is efficient.
In this paper, a two-sided variable-coefficient space-fractional diffusion equation with fractional Neumann boundary condition is considered. To conquer the weak singularity caused by the nonlocal space-fractional differential operators, by introducing an auxiliary fractional flux variable and using piecewise linear interpolations, a fractional block-centered finite difference (BCFD) method on general nonuniform grids is proposed. However, like other numerical methods, the proposed method still produces linear algebraic systems with unstructured dense coefficient matrices under the general nonuniform grids.Consequently, traditional direct solvers such as Gaussian elimination method shall require $\mathcal{O}(M^2)$ memory and $\mathcal{O}(M^3)$ computational work per time level, where $M$ is the number of spatial unknowns in the numerical discretization. To address this issue, we combine the well-known sum-of-exponentials (SOE) approximation technique with the fractional BCFD method to propose a fast version fractional BCFD algorithm. Based upon the Krylov subspace iterative methods, fast matrix-vector multiplications of the resulting coefficient matrices with any vector are developed, in which they can be implemented in only $\mathcal{O}(MN_{exp})$ operations per iteration, where $N_{exp}\ll M$ is the number of exponentials in the SOE approximation. Moreover, the coefficient matrices do not necessarily need to be generated explicitly, while they can be stored in $\mathcal{O}(MN_{exp})$ memory by only storing some coefficient vectors. Numerical experiments are provided to demonstrate the efficiency and accuracy of the method.
In this paper, a new two-relaxation-time regularized (TRT-R) lattice Boltzmann (LB) model for convection-diffusion equation (CDE) with variable coefficients is proposed. Within this framework, we first derive a TRT-R collision operator by constructing a new regularized procedure through the high-order Hermite expansion of non-equilibrium. Then a first-order discrete-velocity form of discrete source term is introduced to improve the accuracy of the source term. Finally and most importantly, a new first-order space-derivative auxiliary term is proposed to recover the correct CDE with variable coefficients. To evaluate this model, we simulate a classic benchmark problem of the rotating Gaussian pulse. The results show that our model has better accuracy, stability and convergence than other popular LB models, especially in the case of a large time step.