Recent extensive numerical experiments in large-scale machine learning have uncovered a quite counterintuitive phase transition as a function of the ratio between the sample size and the number of parameters in the model. As the number of parameters $p$ approaches the sample size $n$, the generalisation error increases, but, surprisingly, it starts decreasing again past the threshold $p=n$. This phenomenon, brought to the attention of the theoretical community in \cite{belkin2019reconciling}, has been thoroughly investigated lately, especially for models simpler than deep neural networks, such as the linear model in which the parameter is taken to be the minimum-norm solution to the least-squares problem: first in the asymptotic regime where $p$ and $n$ tend to infinity, see e.g. \cite{hastie2019surprises}, and more recently in the finite-dimensional regime, specifically for linear models \cite{bartlett2020benign}, \cite{tsigler2020benign}, \cite{lecue2022geometrical}. In the present paper, we propose a finite-sample analysis of non-linear models of \textit{ridge} type, where we investigate the \textit{overparametrised regime} of the double descent phenomenon for both the \textit{estimation problem} and the \textit{prediction} problem. Our results provide a precise analysis of the distance of the best estimator from the true parameter, as well as a generalisation bound which complements the recent works \cite{bartlett2020benign} and \cite{chinot2020benign}. Our analysis is based on tools closely related to the continuous Newton method \cite{neuberger2007continuous} and a refined quantitative analysis of the prediction performance of the minimum $\ell_2$-norm solution.
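As a point of reference for the linear setting cited above, the following minimal sketch computes a minimum $\ell_2$-norm least-squares solution in the overparametrised regime $p>n$ and compares it with another interpolating solution. It is not the non-linear ridge-type analysis of the paper; the Gaussian design, the noise level, and all names (`min_norm_least_squares`, `w_other`) are assumptions made for illustration only.

```python
import numpy as np

def min_norm_least_squares(X, y):
    """Minimum l2-norm minimiser of ||X w - y||_2, via the pseudoinverse."""
    return np.linalg.pinv(X) @ y

# Illustrative synthetic data (assumed, not taken from the paper).
rng = np.random.default_rng(0)
n, p = 20, 50                      # overparametrised: p > n
X = rng.standard_normal((n, p))
w_true = rng.standard_normal(p)
y = X @ w_true + 0.1 * rng.standard_normal(n)

w_hat = min_norm_least_squares(X, y)
train_residual = np.linalg.norm(X @ w_hat - y)

# Build a second interpolating solution by adding a null-space direction of X.
v = rng.standard_normal(p)
v_null = v - np.linalg.pinv(X) @ (X @ v)   # projection of v onto null(X)
w_other = w_hat + v_null

print("training residual of w_hat:", train_residual)
print("norm of w_hat:  ", np.linalg.norm(w_hat))
print("norm of w_other:", np.linalg.norm(w_other))
```

Both vectors fit the training data, but `w_hat` lies in the row space of `X` and therefore has the smaller norm.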
We provide an algorithm for computing an effective basis of homology of elliptic surfaces over the complex projective line on which integration of periods can be carried out. This allows the heuristic recovery of several algebraic invariants of the surface, notably the N\'eron-Severi lattice, the transcendental lattice, the Mordell-Weil group and the Mordell-Weil lattice. This algorithm comes with a SageMath implementation.
We consider a logic that has truth values in the unit interval and uses aggregation functions instead of quantifiers, and we describe a general approach to asymptotic elimination of aggregation functions and, indirectly, to asymptotic elimination of Mostowski-style generalized quantifiers, since these can be expressed using aggregation functions. The notion of ``local continuity'' of an aggregation function, which we make precise in two (related) ways, plays a central role in this approach.
A general theory of efficient estimation for ergodic diffusion processes sampled at high frequency with an infinite time horizon is presented. High frequency sampling is common in many applications, with finance as a prominent example. The theory is formulated in terms of approximate martingale estimating functions and covers a large class of estimators including most of the previously proposed estimators for diffusion processes. Easily checked conditions ensuring that an estimating function is an approximate martingale are derived, and general conditions ensuring consistency and asymptotic normality of estimators are given. Most importantly, simple conditions are given that ensure rate optimality and efficiency. Rate optimal estimators of parameters in the diffusion coefficient converge faster than estimators of drift coefficient parameters because they take advantage of the information in the quadratic variation. The conditions facilitate the choice among the multitude of estimators that have been proposed for diffusion models. Optimal martingale estimating functions in the sense of Godambe and Heyde and their high frequency approximations are, under weak conditions, shown to satisfy the conditions for rate optimality and efficiency. This provides a natural feasible method of constructing explicit rate optimal and efficient estimating functions by solving a linear equation.
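The role of the quadratic variation can be illustrated with a small simulation. The sketch below assumes an Ornstein-Uhlenbeck process $dX_t = -\theta X_t\,dt + \sigma\,dW_t$ sampled at step $\Delta$, and uses two elementary estimators: the realised quadratic variation for $\sigma^2$ and an Euler-type least-squares estimator for $\theta$. The model, the parameter values, and the estimators are illustrative assumptions, not the estimating functions constructed in the paper.

```python
import numpy as np

def simulate_ou(theta, sigma, delta, n, rng):
    """Exact simulation of a stationary Ornstein-Uhlenbeck path on a grid of step delta."""
    x = np.empty(n + 1)
    x[0] = rng.normal(0.0, sigma / np.sqrt(2.0 * theta))   # stationary start
    a = np.exp(-theta * delta)
    s = sigma * np.sqrt((1.0 - a**2) / (2.0 * theta))
    noise = rng.standard_normal(n)
    for i in range(n):
        x[i + 1] = a * x[i] + s * noise[i]
    return x

# Assumed illustrative parameters: time horizon n * delta = 100.
theta, sigma, delta, n = 1.0, 0.5, 1e-3, 100_000
rng = np.random.default_rng(1)
x = simulate_ou(theta, sigma, delta, n, rng)
dx = np.diff(x)

# Diffusion parameter: realised quadratic variation divided by the time horizon.
sigma2_hat = np.sum(dx**2) / (n * delta)

# Drift parameter: least squares on the Euler approximation dx ~ -theta * x * delta.
theta_hat = -np.sum(x[:-1] * dx) / (delta * np.sum(x[:-1] ** 2))

print("sigma^2 estimate:", sigma2_hat, "(true value", sigma**2, ")")
print("theta estimate:  ", theta_hat, "(true value", theta, ")")
```

In runs of this kind the estimate of $\sigma^2$ is typically much tighter than that of $\theta$, since its accuracy is governed by the number of observations $n$ rather than by the time horizon $n\Delta$.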
The present work provides a comprehensive study of symmetric-conjugate operator splitting methods in the context of linear parabolic problems and demonstrates their additional benefits compared to symmetric splitting methods. Relevant applications include nonreversible systems and ground state computations for linear Schr\"odinger equations based on imaginary time propagation. Numerical examples confirm the favourable error behaviour of higher-order symmetric-conjugate splitting methods and illustrate the usefulness of a time stepsize control, where the local error estimation relies on the computation of the imaginary parts and thus incurs negligible cost.
We address the numerical treatment of source terms in algebraic flux correction schemes for steady convection-diffusion-reaction (CDR) equations. The proposed algorithm constrains a continuous piecewise-linear finite element approximation using a monolithic convex limiting (MCL) strategy. Failure to discretize the convective derivatives and source terms in a compatible manner produces spurious ripples, e.g., in regions where the coefficients of the continuous problem are constant and the exact solution is linear. We cure this deficiency by incorporating source term components into the fluxes and intermediate states of the MCL procedure. The design of our new limiter is motivated by the desire to preserve simple steady-state equilibria exactly, as in well-balanced schemes for the shallow water equations. The results of our numerical experiments for two-dimensional CDR problems illustrate potential benefits of well-balanced flux limiting in the scalar case.
The theory of generalized locally Toeplitz (GLT) sequences is a powerful apparatus for computing the asymptotic spectral distribution of matrices $A_n$ arising from numerical discretizations of differential equations. Indeed, when the mesh fineness parameter $n$ tends to infinity, these matrices $A_n$ give rise to a sequence $\{A_n\}_n$, which often turns out to be a GLT sequence. In this paper, we extend the theory of GLT sequences in several directions: we show that every GLT sequence enjoys a normal form, we identify the spectral symbol of every GLT sequence formed by normal matrices, and we prove that, for every GLT sequence $\{A_n\}_n$ formed by normal matrices and every continuous function $f:\mathbb C\to\mathbb C$, the sequence $\{f(A_n)\}_n$ is again a GLT sequence whose spectral symbol is $f(\kappa)$, where $\kappa$ is the spectral symbol of $\{A_n\}_n$. In addition, using the theory of GLT sequences, we prove a spectral distribution result for perturbed normal matrices.
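The statement about $\{f(A_n)\}_n$ can be checked numerically in the simplest Toeplitz case. The sketch below assumes the standard second-difference matrix $A_n=\mathrm{tridiag}(-1,2,-1)$, whose symbol is $\kappa(\theta)=2-2\cos\theta$, and the choice $f=\exp$; it compares the eigenvalues of $A_n$ and $f(A_n)$ with samples of $\kappa$ and $f(\kappa)$ on a uniform grid. This is a textbook special case chosen for illustration, not an example from the paper.

```python
import numpy as np

def laplacian_1d(n):
    """Symmetric tridiagonal Toeplitz matrix tridiag(-1, 2, -1) of size n."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def symbol(theta):
    """Generating function (spectral symbol) of tridiag(-1, 2, -1)."""
    return 2.0 - 2.0 * np.cos(theta)

n = 200
A = laplacian_1d(n)
eigvals, eigvecs = np.linalg.eigh(A)          # eigenvalues in ascending order

# The symbol is increasing on (0, pi), so these samples are ascending too.
theta = np.arange(1, n + 1) * np.pi / (n + 1)
gap_A = np.max(np.abs(eigvals - symbol(theta)))

# f(A) for a symmetric (hence normal) matrix, via its eigendecomposition.
f = np.exp
f_A = (eigvecs * f(eigvals)) @ eigvecs.T
gap_fA = np.max(np.abs(np.linalg.eigvalsh(f_A) - f(symbol(theta))))

print("max |eig(A_n) - kappa(theta_j)|       =", gap_A)
print("max |eig(f(A_n)) - f(kappa(theta_j))| =", gap_fA)
```

For this particular matrix the eigenvalues coincide exactly with the symbol samples, so both gaps are at the level of rounding error; for a general GLT sequence the agreement holds only in the asymptotic distributional sense.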
One tuple of probability vectors is more informative than another tuple when there exists a single stochastic matrix transforming the probability vectors of the first tuple into the probability vectors of the other. This is called matrix majorization. Solving an open problem raised by Mu et al., we show that if certain monotones, namely multivariate extensions of R\'{e}nyi divergences, are strictly ordered between the two tuples, then for sufficiently large $n$, there exists a stochastic matrix taking the $n$-fold Kronecker power of each input distribution to the $n$-fold Kronecker power of the corresponding output distribution. The same conditions, with non-strict ordering for the monotones, are also necessary for such matrix majorization in large samples. Our result also gives conditions for the existence of a sequence of statistical maps that asymptotically (with vanishing error) convert a single copy of each input distribution to the corresponding output distribution with the help of a catalyst that is returned unchanged. Allowing for transformation with arbitrarily small error, we find conditions that are both necessary and sufficient for such catalytic matrix majorization. We derive our results by building on a general algebraic theory of preordered semirings recently developed by one of the authors. This also allows us to recover various existing results on majorization in large samples and in the catalytic regime as well as relative majorization in a unified manner.
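Two properties make R\'{e}nyi divergences natural monotones in this setting: they cannot increase under a stochastic matrix, and they are additive under Kronecker powers. The sketch below checks both numerically for a tuple of two probability vectors, using the ordinary bivariate R\'{e}nyi divergence $D_\alpha(p\|q)=\frac{1}{\alpha-1}\log\sum_i p_i^\alpha q_i^{1-\alpha}$. The vectors, the matrix `T`, and the values of $\alpha$ are arbitrary assumptions, and the multivariate extensions used in the paper are not reproduced here.

```python
import numpy as np
from functools import reduce

def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(p || q) for alpha > 0, alpha != 1, full-support q."""
    return float(np.log(np.sum(p**alpha * q ** (1.0 - alpha))) / (alpha - 1.0))

def kron_power(p, n):
    """n-fold Kronecker power of a probability vector."""
    return reduce(np.kron, [p] * n)

# Illustrative input pair and a column-stochastic matrix (columns sum to 1).
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.2, 0.3, 0.5])
T = np.array([[0.9, 0.5, 0.1],
              [0.1, 0.5, 0.9]])
p_out, q_out = T @ p, T @ q

results = []
for alpha in (0.5, 2.0, 5.0):
    d_in = renyi_divergence(p, q, alpha)
    d_out = renyi_divergence(p_out, q_out, alpha)
    d_in_2 = renyi_divergence(kron_power(p, 2), kron_power(q, 2), alpha)
    results.append((alpha, d_in, d_out, d_in_2))
    print(f"alpha={alpha}: D(in)={d_in:.4f}  D(out)={d_out:.4f}  D(in^2)={d_in_2:.4f}")
```

In each row the output divergence does not exceed the input divergence, and the divergence of the two-fold Kronecker power is twice that of a single copy.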
It is known that standard stochastic Galerkin methods encounter challenges when solving partial differential equations with high-dimensional random inputs, which are typically caused by the large number of stochastic basis functions required. It becomes crucial to properly choose effective basis functions, such that the dimension of the stochastic approximation space can be reduced. In this work, we focus on the stochastic Galerkin approximation associated with generalized polynomial chaos (gPC), and explore the gPC expansion based on the analysis of variance (ANOVA) decomposition. A concise form of the gPC expansion is presented for each component function of the ANOVA expansion, and an adaptive ANOVA procedure is proposed to construct the overall stochastic Galerkin system. Numerical results demonstrate the efficiency of our proposed adaptive ANOVA stochastic Galerkin method for both diffusion and Helmholtz problems.
In the present paper, we propose a block variant of the extended Hessenberg process for computing approximations of matrix functions and for other problems involving large-scale matrices. We present applications to the computation of matrix functions of the form $f(A)V$, where $A$ is a large sparse $n\times n$ matrix, $V$ is an $n\times p$ block with $p\ll n$, and $f$ is a function. The solution of shifted linear systems with multiple right-hand sides is also considered. Such matrix computations arise in many scientific and engineering applications. Numerical experiments are provided to show the effectiveness of the proposed method for these problems.
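To make the projection idea behind such approximations concrete, the sketch below uses a plain block Arnoldi process, which is a different and simpler technique than the block extended Hessenberg process proposed in the paper. It approximates $f(A)V$ by $Q_m f(H_m) E_1 R$, where $V=Q_1R$ and $H_m=Q_m^TAQ_m$. The test matrix (a negative 1D Laplacian), the choice $f=\exp$, the sizes, and all function names are assumptions; $A$ is taken symmetric so that $f(H_m)$ can be evaluated by an eigendecomposition.

```python
import numpy as np

def block_arnoldi(A, V, m):
    """m steps of block Arnoldi with block Gram-Schmidt; returns Q_m, H_m and R with V = Q_1 R."""
    n, p = V.shape
    Q = np.zeros((n, (m + 1) * p))
    H = np.zeros(((m + 1) * p, m * p))
    Q[:, :p], R = np.linalg.qr(V)
    for j in range(m):
        W = A @ Q[:, j * p:(j + 1) * p]
        for i in range(j + 1):
            Qi = Q[:, i * p:(i + 1) * p]
            Hij = Qi.T @ W
            H[i * p:(i + 1) * p, j * p:(j + 1) * p] = Hij
            W = W - Qi @ Hij
        Q[:, (j + 1) * p:(j + 2) * p], H[(j + 1) * p:(j + 2) * p, j * p:(j + 1) * p] = np.linalg.qr(W)
    return Q[:, :m * p], H[:m * p, :m * p], R

def sym_funm(S, f):
    """f(S) for a (numerically) symmetric matrix S via its eigendecomposition."""
    w, U = np.linalg.eigh((S + S.T) / 2.0)
    return (U * f(w)) @ U.T

# Assumed test problem: A = -tridiag(-1, 2, -1), a random block V, f = exp.
n, p, m = 200, 2, 15
A = -(2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1))
rng = np.random.default_rng(0)
V = rng.standard_normal((n, p))

Qm, Hm, R = block_arnoldi(A, V, m)
approx = Qm @ (sym_funm(Hm, np.exp)[:, :p] @ R)   # Q_m f(H_m) E_1 R

exact = sym_funm(A, np.exp) @ V                   # dense reference, small n only
rel_err = np.linalg.norm(approx - exact) / np.linalg.norm(exact)
print("relative error of the block Krylov approximation:", rel_err)
```

Only the small $mp\times mp$ matrix $H_m$ is passed to the function evaluation, which is what makes projection methods of this kind attractive when $n$ is large and $p\ll n$.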
This work deals with an inverse source problem for the biharmonic wave equation. A two-stage numerical method is proposed to identify the unknown source from multi-frequency phaseless data. In the first stage, we introduce artificial auxiliary point sources to the inverse source system and establish a phase retrieval formula. Theoretically, we show that the phase can be uniquely determined and estimate the stability of this phase retrieval approach. Once the phase information is retrieved, the Fourier method is adopted to reconstruct the source function from the phased multi-frequency data. The proposed method is easy to implement, and no forward solver is involved in the reconstruction. Numerical experiments are conducted to verify the performance of the proposed method.