We consider the question of estimating multi-dimensional Gaussian mixtures (GM) with compactly supported or subgaussian mixing distributions. The minimax estimation rate for this class (under Hellinger, TV and KL divergences) is a long-standing open question, even in one dimension. In this paper we characterize this rate (for all constant dimensions) in terms of the metric entropy of the class. Such characterizations originate from the seminal works of Le Cam (1973); Birge (1983); Haussler and Opper (1997); Yang and Barron (1999). However, for GMs a key ingredient missing from earlier work (and widely sought after) is a comparison result showing that the KL divergence and the squared Hellinger distance are within a constant multiple of each other uniformly over the class. Our main technical contribution is in showing this fact, from which we derive an entropy characterization of the estimation rate under Hellinger and KL. Interestingly, the sequential (online learning) estimation rate is characterized by the global entropy, while the single-step (batch) rate corresponds to the local entropy, paralleling a similar result for the Gaussian sequence model recently discovered by Neykov (2022) and Mourtada (2023). Additionally, since Hellinger is a proper metric, our comparison shows that GMs under KL satisfy the triangle inequality within multiplicative constants, implying that proper and improper estimation rates coincide.
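As a rough numerical illustration of the KL versus squared Hellinger comparison described in the abstract above, the following sketch estimates both quantities for two one-dimensional Gaussian mixtures. Everything here is an assumption made for illustration: the function names `gm_pdf` and `kl_and_hellinger_sq`, the unit-variance components, the convention H^2(p, q) = int (sqrt(p) - sqrt(q))^2 dx, and the trapezoidal quadrature on a fixed interval. It checks one example pair and does not establish the uniform bound claimed in the paper.

```python
import math

def gm_pdf(x, weights, means):
    """Density of a 1-D Gaussian mixture with unit-variance components (assumed)."""
    return sum(w * math.exp(-0.5 * (x - m) ** 2) / math.sqrt(2.0 * math.pi)
               for w, m in zip(weights, means))

def kl_and_hellinger_sq(p, q, lo=-15.0, hi=15.0, n=6000):
    """Trapezoidal estimates of KL(p||q) and H^2(p,q) = int (sqrt p - sqrt q)^2 dx."""
    step = (hi - lo) / n
    kl = 0.0
    h2 = 0.0
    for i in range(n + 1):
        x = lo + i * step
        wgt = 0.5 if i in (0, n) else 1.0  # trapezoid end-point weights
        px, qx = p(x), q(x)
        kl += wgt * px * math.log(px / qx)
        h2 += wgt * (math.sqrt(px) - math.sqrt(qx)) ** 2
    return kl * step, h2 * step

# Two hypothetical mixtures whose mixing distributions live on [-2, 2].
p = lambda x: gm_pdf(x, [0.5, 0.5], [-1.0, 1.0])
q = lambda x: gm_pdf(x, [0.3, 0.7], [-0.5, 1.5])

kl, h2 = kl_and_hellinger_sq(p, q)
print("KL =", kl, " H^2 =", h2, " ratio KL/H^2 =", kl / h2)
```

The inequality H^2 <= KL holds for any pair of distributions under this convention; the content of the comparison result is the reverse direction, KL <= C * H^2, with a constant that is uniform over the class.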
Solving high-dimensional random parametric PDEs poses a challenging computational problem. It is well known that numerical methods can greatly benefit from adaptive refinement algorithms, in particular when functional approximations in polynomials are computed, as in stochastic Galerkin and stochastic collocation methods. This work investigates a residual-based adaptive algorithm used to approximate the solution of the stationary diffusion equation with lognormal coefficients. It is known that the refinement procedure is reliable, but the theoretical convergence of the scheme for this class of unbounded coefficients remains a challenging open question. This paper advances the theoretical results by providing a quasi-error reduction result for the adaptive solution of the lognormal stationary diffusion problem. A computational example supports the theoretical statement.
The work of Kalman and Bucy has established a duality between filtering and optimal estimation in the context of time-continuous linear systems. This duality has recently been extended to time-continuous nonlinear systems in terms of an optimization problem constrained by a backward stochastic partial differential equation. Here we revisit this problem from the perspective of appropriate forward-backward stochastic differential equations. This approach sheds new light on the estimation problem and provides a unifying perspective. It is also demonstrated that certain formulations of the estimation problem lead to deterministic formulations similar to the linear Gaussian case as originally investigated by Kalman and Bucy. Finally, optimal control of partially observed diffusion processes is discussed as an application of the proposed estimators.
Dual consistency is an important issue in developing stable dual-weighted residual (DWR) error estimation for goal-oriented mesh adaptivity. In this paper, this issue is studied in depth based on a Newton-GMG framework for the steady Euler equations. Theoretically, the numerical framework is recast in terms of a Petrov-Galerkin scheme, based on which the dual consistency is described. A boundary modification technique is discussed for preserving the dual consistency within the Newton-GMG framework. Numerically, a geometric multigrid method is proposed for solving the dual problem, and a regularization term is designed to guarantee the convergence of the iteration. The following features of our method can be observed from numerical experiments: (i) stable numerical convergence of the quantity of interest is obtained for problems with different configurations, and (ii) for accurate calculation of the quantity of interest, the number of mesh grids can be reduced significantly using the proposed dual-consistent DWR method, compared with the dual-inconsistent one.
We present the full approximation scheme constraint decomposition (FASCD) multilevel method for solving variational inequalities (VIs). FASCD is a common extension of both the full approximation scheme (FAS) multigrid technique for nonlinear partial differential equations, due to A. Brandt, and the constraint decomposition (CD) method introduced by X.-C. Tai for VIs arising in optimization. We extend the CD idea by exploiting the telescoping nature of certain function space subset decompositions arising from multilevel mesh hierarchies. When a reduced-space (active set) Newton method is applied as a smoother, with work proportional to the number of unknowns on a given mesh level, FASCD V-cycles exhibit nearly mesh-independent convergence rates, and full multigrid cycles are optimal solvers. The example problems include differential operators which are symmetric linear, nonsymmetric linear, and nonlinear, in unilateral and bilateral VI problems.
We propose a Hermite spectral method for the inelastic Boltzmann equation, which makes the computation of two-dimensional periodic problems affordable on current hardware. The new algorithm is based on a Hermite expansion, where the expansion coefficients for the VHS model are reduced to several summations and can be derived exactly. Moreover, a new collision model is built by combining the quadratic collision operator with a linearized collision operator, which helps to balance computational cost and accuracy. Various numerical experiments, including spatially two-dimensional simulations, demonstrate the accuracy and efficiency of this numerical scheme.
We propose a distributed bundle adjustment (DBA) method using the exact Levenberg-Marquardt (LM) algorithm for super large-scale datasets. Most existing methods partition the global map into smaller submaps and conduct bundle adjustment within them. To fit the parallel framework, they use approximate solutions instead of the LM algorithm. However, those methods often give sub-optimal results. In contrast, we use the exact LM algorithm to conduct global bundle adjustment, where the formation of the reduced camera system (RCS) is parallelized and executed in a distributed way. To store the large RCS, we compress it with a block-based sparse matrix compression format (BSMC), which fully exploits its block structure. The BSMC format also enables distributed storage and updating of the global RCS. The proposed method is extensively evaluated and compared with state-of-the-art pipelines using both synthetic and real datasets. Preliminary results demonstrate the efficient memory usage and vast scalability of the proposed method compared with the baselines. For the first time, we conducted parallel bundle adjustment using the LM algorithm on a real dataset with 1.18 million images and a synthetic dataset with 10 million images (about 500 times that of the state-of-the-art LM-based BA) on a distributed computing system.
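To make the reduced camera system mentioned in the abstract above concrete, here is a minimal sketch of the standard Schur-complement construction on a toy problem. It is not the paper's distributed implementation and does not model the BSMC format. The assumptions are loud ones: each camera and each point has a single scalar parameter (real bundle adjustment uses multi-parameter blocks), the point block `C_diag` is diagonal, there are exactly two cameras so that a closed-form 2x2 solve suffices, and all function names are hypothetical.

```python
def reduced_camera_system(B, E, C_diag, v, w):
    """Eliminate the point block from [[B, E], [E^T, C]] [dc, dp] = [v, w].

    B: n_cam x n_cam camera block, E: n_cam x n_pt coupling block,
    C_diag: diagonal of the point block (scalar point blocks assumed).
    Returns (S, g) with S = B - E C^{-1} E^T and g = v - E C^{-1} w.
    """
    n_cam, n_pt = len(B), len(C_diag)
    S = [row[:] for row in B]
    g = v[:]
    # Every point contributes an independent term to S and g, so this loop
    # could be split across workers and the partial sums reduced afterwards.
    for j in range(n_pt):
        inv_c = 1.0 / C_diag[j]
        for i in range(n_cam):
            g[i] -= E[i][j] * inv_c * w[j]
            for k in range(n_cam):
                S[i][k] -= E[i][j] * inv_c * E[k][j]
    return S, g

def solve_2x2(S, g):
    """Cramer's rule for the two-camera toy case."""
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    return [(g[0] * S[1][1] - S[0][1] * g[1]) / det,
            (S[0][0] * g[1] - S[1][0] * g[0]) / det]

def back_substitute(E, C_diag, w, dc):
    """Recover point updates: dp_j = (w_j - sum_i E[i][j] * dc_i) / C_jj."""
    return [(w[j] - sum(E[i][j] * dc[i] for i in range(len(dc)))) / C_diag[j]
            for j in range(len(C_diag))]

# Made-up numbers standing in for the damped normal equations of one LM step.
B = [[4.0, 0.0], [0.0, 5.0]]
E = [[1.0, 0.5, 0.0], [0.0, 1.0, 2.0]]
C_diag = [3.0, 4.0, 6.0]
v = [1.0, 2.0]
w = [0.5, -1.0, 1.5]

S, g = reduced_camera_system(B, E, C_diag, v, w)
dc = solve_2x2(S, g)                      # camera updates from the reduced system
dp = back_substitute(E, C_diag, w, dc)    # point updates by back-substitution
print("camera update:", dc)
print("point update:", dp)
```

Because the solve is exact, the pair `(dc, dp)` satisfies the full normal equations, which is the sense in which forming the reduced system in parallel need not change the LM step.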
We propose a matrix-free parallel two-level deflation preconditioner combined with the Complex Shifted Laplacian preconditioner (CSLP) for two-dimensional Helmholtz problems. The Helmholtz equation is widely studied in seismic exploration, antennas, and medical imaging. It is one of the hardest problems to solve both in terms of accuracy and convergence, due to scalability issues of the numerical solvers. Motivated by the observation that for large wavenumbers the eigenvalues of the CSLP-preconditioned system shift towards zero, deflation with multigrid vectors, and further with high-order vectors, was incorporated to obtain wavenumber-independent convergence. For large-scale applications, high-performance parallel scalable methods are also indispensable. In our method, we consider preconditioned Krylov subspace methods for solving the linear system obtained from finite-difference discretization. The CSLP preconditioner is approximated by one parallel geometric multigrid V-cycle. For the two-level deflation, matrix-free Galerkin coarsening as well as high-order re-discretization approaches on the coarse grid are studied. The matrix-vector multiplications in the Krylov subspace methods and the interpolation/restriction operators are implemented directly on the finite-difference grids without constructing any coefficient matrix. These adjustments lead to direct improvements in terms of memory consumption. Numerical experiments on model problems show that wavenumber independence has been obtained for medium wavenumbers. The matrix-free parallel framework shows satisfactory weak and strong parallel scalability.
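The matrix-free idea in the abstract above can be sketched as a stencil-based operator application. The example below applies a second-order five-point discretization of -Laplace(u) - k^2 u on a uniform grid without assembling a coefficient matrix. It is only an illustration under stated assumptions: homogeneous Dirichlet boundaries, a real constant wavenumber, serial pure-Python loops, and a hypothetical function name `helmholtz_matvec`. The paper's boundary conditions, complex shift, deflation vectors, and parallel layout are not modelled.

```python
import math

def helmholtz_matvec(u, k, h):
    """Apply the 5-point Helmholtz stencil to interior grid values u[i][j].

    Neighbours outside the array are treated as zero (homogeneous Dirichlet,
    an assumption of this sketch). No coefficient matrix is ever stored.
    """
    n_rows, n_cols = len(u), len(u[0])
    inv_h2 = 1.0 / (h * h)
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            north = u[i - 1][j] if i > 0 else 0.0
            south = u[i + 1][j] if i < n_rows - 1 else 0.0
            west = u[i][j - 1] if j > 0 else 0.0
            east = u[i][j + 1] if j < n_cols - 1 else 0.0
            neg_laplacian = (4.0 * u[i][j] - north - south - west - east) * inv_h2
            out[i][j] = neg_laplacian - k * k * u[i][j]
    return out

# Usage on the unit square with n interior points per direction.
n = 9
h = 1.0 / (n + 1)
k = 10.0
u = [[math.sin(math.pi * (i + 1) * h) * math.sin(math.pi * (j + 1) * h)
      for j in range(n)] for i in range(n)]
Au = helmholtz_matvec(u, k, h)
```

The lowest sine mode used here is an eigenvector of the discrete operator, with eigenvalue (8 / h^2) * sin^2(pi * h / 2) - k^2, which gives a convenient correctness check. A Krylov method only needs this operator application, so memory use stays proportional to the number of grid values.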
The purpose of the paper is to provide a characterization of the error of the best polynomial approximation of composite functions in weighted spaces. Such a characterization is essential for the convergence analysis of numerical methods applied to non-linear problems or for numerical approaches that make use of regularization techniques to cure low smoothness of the solution. This result is obtained through an estimate of the derivatives of composite functions in weighted uniform norm.
Characterizing shapes of high-dimensional objects via Ricci curvatures plays a critical role in many research areas in mathematics and physics. However, even though several discretizations of Ricci curvatures for discrete combinatorial objects such as networks have been proposed and studied by mathematicians, the computational complexity aspects of these discretizations have escaped the attention of theoretical computer scientists to a large extent. In this paper, we study one such discretization, namely the Ollivier-Ricci curvature, from the perspective of efficient computation by fine-grained reductions and local query-based algorithms. Our main contributions are the following. (a) We relate our curvature computation problem to the minimum weight perfect matching problem on complete bipartite graphs via a fine-grained reduction. (b) We formalize the computational aspects of the curvature computation problems in suitable frameworks so that they can be studied by researchers in local algorithms. (c) We provide the first known lower and upper bounds on queries for query-based algorithms for the curvature computation problems in our local algorithms framework. En route, we also illustrate a localized version of our fine-grained reduction. We believe that our results bring forth an intriguing set of research questions, motivated both in theory and practice, regarding designing efficient algorithms for curvatures of objects.
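The link between Ollivier-Ricci curvature and minimum weight perfect matching in the abstract above can be illustrated on small graphs. The sketch below assumes an unweighted graph, uniform probability measures on the neighbourhoods of the two endpoints, and equal degrees, in which case the Wasserstein-1 distance equals the minimum matching cost divided by the degree. These modelling choices and the function names are assumptions of this example and may differ from the exact formulation in the paper. The matching is found by brute force over permutations, which is only viable for tiny degrees.

```python
from collections import deque
from itertools import permutations

def bfs_dist(adj, src):
    """Hop distances from src in an unweighted graph given as {node: neighbours}."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def ollivier_ricci_equal_degree(adj, u, v):
    """Curvature 1 - W1(m_u, m_v) / d(u, v) for uniform neighbourhood measures.

    Assumes deg(u) == deg(v), so W1 is (min perfect matching cost) / degree
    on the complete bipartite graph between the two neighbourhoods.
    """
    nu, nv = sorted(adj[u]), sorted(adj[v])
    if len(nu) != len(nv):
        raise ValueError("this sketch assumes deg(u) == deg(v)")
    dist = {x: bfs_dist(adj, x) for x in nu}
    # Brute-force minimum weight perfect matching (factorial time, tiny inputs only).
    best = min(sum(dist[x][y] for x, y in zip(nu, perm))
               for perm in permutations(nv))
    w1 = best / len(nu)
    return 1.0 - w1 / bfs_dist(adj, u)[v]

# Complete graph K4 and cycle C6 as adjacency dictionaries.
k4 = {i: [j for j in range(4) if j != i] for i in range(4)}
c6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(ollivier_ricci_equal_degree(k4, 0, 1))
print(ollivier_ricci_equal_degree(c6, 0, 1))
```

Under these assumptions the edge (0, 1) has curvature 2/3 in K4 and 0 in C6. For larger degrees one would replace the permutation search with a polynomial-time assignment solver.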
We present a robust deep incremental learning framework for regression tasks on financial temporal tabular datasets which is built upon the incremental use of commonly available tabular and time series prediction models to adapt to distributional shifts typical of financial datasets. The framework uses a simple basic building block (decision trees) to build self-similar models of any required complexity to deliver robust performance under adverse situations such as regime changes, fat-tailed distributions, and low signal-to-noise ratios. As a detailed study, we demonstrate our scheme using XGBoost models trained on the Numerai dataset and show that a two-layer deep ensemble of XGBoost models over different model snapshots delivers high-quality predictions under different market regimes. We also show that the performance of XGBoost models with different numbers of boosting rounds in three scenarios (small, standard and large) is monotonically increasing with respect to model size and converges towards the generalisation upper bound. We also evaluate the robustness of the model under variability of different hyperparameters, such as model complexity and data sampling settings. Our model has low hardware requirements as no specialised neural architectures are used and each base model can be independently trained in parallel.