Convergence rates for $L_2$ approximation in a Hilbert space $H$ are a central theme in numerical analysis. The present work is inspired by Schaback (Math. Comp., 1999), who showed, in the context of best pointwise approximation for radial basis function interpolation, that the convergence rate for sufficiently smooth functions can be doubled, compared to the general rate for functions in the "native space" $H$. Motivated by this, we obtain a general result for $H$-orthogonal projection onto a finite-dimensional subspace of $H$: namely, that any known $L_2$ convergence rate for all functions in $H$ translates into a doubled $L_2$ convergence rate for functions in a smoother normed space $B$, along with a similarly improved error bound in the $H$-norm, provided that $L_2$, $H$ and $B$ are suitably related. As a special case we improve the known $L_2$ and $H$-norm convergence rates for kernel interpolation in reproducing kernel Hilbert spaces, with particular attention to a recent study (Kaarnioja, Kazashi, Kuo, Nobile, Sloan, Numer. Math., 2022) of periodic kernel-based interpolation at lattice points applied to parametric partial differential equations. A second application is to radial basis function interpolation for general conditionally positive definite basis functions, where again the $L_2$ convergence rate is doubled, and the convergence rate in the native space norm is similarly improved, for all functions in a smoother normed space $B$.
In a recent work, Li, Qiao, Wigderson, Wigderson and Zhang introduced notions of quantum expansion based on $S_p$ norms and posed as an open question whether they are all equivalent. We give an affirmative answer to this question.
This paper analyzes a full discretization of a three-dimensional stochastic Allen-Cahn equation with multiplicative noise. The discretization uses the Euler scheme for temporal discretization and the finite element method for spatial discretization. By deriving a stability estimate for a discrete stochastic convolution and combining it with the discrete stochastic maximal $L^p$-regularity estimate, a pathwise uniform convergence rate in general spatial $L^q$-norms is obtained.
We propose an algorithm to construct optimal exact designs (EDs). Most of the work in the optimal regression design literature focuses on the approximate design (AD) paradigm due to its desirable properties, including the optimality verification conditions derived by Kiefer (1959, 1974). ADs may have unbalanced weights, and practitioners may have difficulty implementing them with a designated run size $n$. Some EDs are constructed using rounding methods to obtain an integer number of runs at each support point of an AD, but this approach may not yield optimal results. To construct EDs directly, one may need to perform new combinatorial constructions for each $n$, and there is no unified approach to doing so. Therefore, we develop a systematic way to construct EDs for any given $n$. Our method transforms ADs into EDs in two steps while retaining high statistical efficiency. The first step involves constructing an AD by utilizing the convex nature of many design criteria. The second step employs a simulated annealing algorithm to search for the ED stochastically. Through several applications, we demonstrate the utility of our method for various design problems. Additionally, we show that the design efficiency approaches unity as the number of design points increases.
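The second, stochastic-search step can be illustrated with a minimal simulated annealing sketch for a hypothetical D-optimal exact design problem (quadratic regression on a grid over $[-1,1]$); the criterion, cooling schedule, and move set here are illustrative choices, not the authors' implementation:

```python
import math, random
import numpy as np

def d_criterion(counts, F):
    # log-determinant of the information matrix sum_i n_i f(x_i) f(x_i)^T
    M = (F * counts[:, None]).T @ F
    sign, logdet = np.linalg.slogdet(M)
    return logdet if sign > 0 else -np.inf

def anneal_exact_design(F, n, iters=20000, t0=1.0, seed=0):
    """Search an n-run exact design over candidate rows F (one row per
    candidate point) by simulated annealing on the D-criterion."""
    rng = random.Random(seed)
    m = F.shape[0]
    counts = np.zeros(m, dtype=int)
    for _ in range(n):                      # random initial allocation of n runs
        counts[rng.randrange(m)] += 1
    best = cur = d_criterion(counts, F)
    best_counts = counts.copy()
    for k in range(iters):
        t = t0 * (1 - k / iters) + 1e-6     # linear cooling schedule
        i = rng.choice([j for j in range(m) if counts[j] > 0])
        j = rng.randrange(m)
        counts[i] -= 1; counts[j] += 1      # move one run from point i to point j
        new = d_criterion(counts, F)
        if new >= cur or rng.random() < math.exp((new - cur) / t):
            cur = new                       # accept (Metropolis rule)
            if new > best:
                best, best_counts = new, counts.copy()
        else:
            counts[i] += 1; counts[j] -= 1  # reject: undo the move
    return best_counts, best

# Quadratic regression on a grid over [-1, 1]: f(x) = (1, x, x^2)
xs = np.linspace(-1, 1, 21)
F = np.stack([np.ones_like(xs), xs, xs**2], axis=1)
counts, logdet = anneal_exact_design(F, n=6)
```

For this small problem the search tends to concentrate the six runs near the classical support points $\{-1, 0, 1\}$ of the D-optimal design for quadratic regression.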
Distillation is the task of replacing a complicated machine learning model with a simpler model that approximates the original [BCNM06,HVD15]. Despite many practical applications, basic questions about the extent to which models can be distilled, and the runtime and amount of data needed to distill, remain largely open. To study these questions, we initiate a general theory of distillation, defining PAC-distillation in an analogous way to PAC-learning [Val84]. As applications of this theory: (1) we propose new algorithms to extract the knowledge stored in the trained weights of neural networks -- we show how to efficiently distill neural networks into succinct, explicit decision tree representations when possible by using the ``linear representation hypothesis''; and (2) we prove that distillation can be much cheaper than learning from scratch, and make progress on characterizing its complexity.
We show that the $n$th digit of the base-$b$ representation of the golden ratio is a finite-state function of the Zeckendorf representation of $b^n$, and hence can be computed by a finite automaton. Similar results can be proved for any quadratic irrational. We use a satisfiability (SAT) solver to prove, in some cases, that the automata we construct are minimal.
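The Zeckendorf representation in question (writing a positive integer uniquely as a sum of non-consecutive Fibonacci numbers) can be computed by a standard greedy algorithm; a minimal sketch, independent of the automaton construction in the paper:

```python
def zeckendorf(n):
    """Greedy Zeckendorf decomposition: represent n > 0 as a sum of
    non-consecutive Fibonacci numbers (1, 2, 3, 5, 8, ...)."""
    assert n > 0
    fibs = [1, 2]
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    parts = []
    for f in reversed(fibs):    # always take the largest Fibonacci number <= n
        if f <= n:
            parts.append(f)
            n -= f
    return parts

print(zeckendorf(100))  # -> [89, 8, 3]
```

Greediness guarantees non-consecutiveness: after taking $F_k$, the remainder is smaller than $F_{k-1}$, so the next term taken is at most $F_{k-2}$.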
We present an optimal rate convergence analysis for a second order accurate in time, fully discrete finite difference scheme for the Cahn-Hilliard-Navier-Stokes (CHNS) system with a logarithmic Flory-Huggins energy potential. The numerical scheme has been recently proposed, and the positivity-preserving property of the logarithmic arguments, as well as the total energy stability, have been theoretically justified. In this paper, we rigorously prove second order convergence of the proposed numerical scheme, in both time and space. Since the CHNS is a coupled system, the standard $\ell^\infty (0, T; \ell^2) \cap \ell^2 (0, T; H_h^2)$ error estimate could not be easily derived, due to the lack of regularity needed to control the numerical errors associated with the coupled terms. Instead, the $\ell^\infty (0, T; H_h^1) \cap \ell^2 (0, T; H_h^3)$ error analysis for the phase variable and the $\ell^\infty (0, T; \ell^2)$ analysis for the velocity vector, which share the same regularity as the energy estimate, are more suitable for passing through the nonlinear analysis of the error terms associated with the coupled physical process. Furthermore, the highly nonlinear and singular nature of the logarithmic error terms makes the convergence analysis even more challenging, since a uniform bound on the distance between the numerical solution and the singular limit values of the logarithm is needed for the associated error estimate. Many highly non-standard estimates, such as a higher order asymptotic expansion of the numerical solution (up to third order accuracy in time and fourth order in space), combined with a rough error estimate (to establish a maximum norm bound for the phase variable) and a refined error estimate, have to be carried out to conclude the desired convergence result.
We consider the computation of statistical moments of solutions to operator equations with stochastic data. We remark that applying PINNs -- referred to as TPINNs -- allows one to solve the induced tensor operator equations with minimal changes to existing PINN code, while enabling the handling of non-linear and time-dependent operators. We propose two types of architectures, referred to as vanilla and multi-output TPINNs, and investigate their benefits and limitations. Exhaustive numerical experiments are performed, demonstrating applicability and performance and raising a variety of promising new research avenues.
In this work, we study and extend a class of semi-Lagrangian exponential methods, which combine exponential time integration techniques, suitable for integrating stiff linear terms, with a semi-Lagrangian treatment of nonlinear advection terms. Partial differential equations involving both processes arise, for instance, in atmospheric circulation models. Through a truncation error analysis, we first show that previously formulated semi-Lagrangian exponential schemes are limited to first-order accuracy due to the discretization of the linear term; we then formulate a new discretization leading to a second-order accurate method. Also, a detailed stability study, considering both a linear stability analysis and an empirical simulation-based one, is conducted to compare several Eulerian and semi-Lagrangian exponential schemes, as well as a well-established semi-Lagrangian semi-implicit method used in operational atmospheric models. Numerical simulations of the shallow-water equations on the rotating sphere, considering standard and challenging benchmark test cases, are performed to assess the orders of convergence, stability properties, and computational cost of each method. The proposed second-order semi-Lagrangian exponential method is shown to be more stable and accurate than the previously formulated schemes of the same class, at the expense of larger wall-clock times; however, it is more stable than, and has a cost similar to, the well-established semi-Lagrangian semi-implicit method, making it a competitive candidate for potential operational applications in atmospheric circulation modeling.
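The exponential-integration building block behind such schemes can be illustrated on a scalar stiff ODE with the classical Cox-Matthews ETD2RK scheme; this is a generic Eulerian exponential integrator shown only as a sketch, not one of the semi-Lagrangian variants analyzed in the paper:

```python
import math

def etd2rk_step(u, h, lam, N):
    """One step of the Cox-Matthews ETD2RK scheme for u' = lam*u + N(u)."""
    z = h * lam
    phi1 = (math.exp(z) - 1.0) / z            # phi_1(z) = (e^z - 1)/z
    phi2 = (math.exp(z) - 1.0 - z) / z**2     # phi_2(z) = (e^z - 1 - z)/z^2
    a = math.exp(z) * u + h * phi1 * N(u)     # exponential Euler predictor
    return a + h * phi2 * (N(a) - N(u))       # second-order corrector

def integrate(u0, T, steps, lam, N):
    u, h = u0, T / steps
    for _ in range(steps):
        u = etd2rk_step(u, h, lam, N)
    return u

# Test problem u' = -2u + u^2, u(0) = 1/2, with exact solution
# u(t) = 1 / (1.5*exp(2t) + 0.5), found via the substitution v = 1/u.
lam, N = -2.0, lambda u: u * u
exact = 1.0 / (1.5 * math.exp(2.0) + 0.5)
e1 = abs(integrate(0.5, 1.0, 10, lam, N) - exact)
e2 = abs(integrate(0.5, 1.0, 20, lam, N) - exact)
# halving the step size should reduce the error by roughly a factor of 4
```

The linear term is treated exactly through the matrix (here scalar) exponential, which is what grants such schemes their favorable stability for stiff linear operators.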
Optimization over the set of matrices that satisfy $X^\top B X = I_p$, referred to as the generalized Stiefel manifold, appears in many applications involving sampled covariance matrices, such as canonical correlation analysis (CCA), independent component analysis (ICA), and the generalized eigenvalue problem (GEVP). Solving these problems is typically done by iterative methods, such as Riemannian approaches, which require a computationally expensive eigenvalue decomposition involving the fully formed matrix $B$. We propose a cheap stochastic iterative method that solves the optimization problem while having access only to a random estimate of the feasible set. Our method does not enforce the constraint exactly in every iteration; instead, it produces iterates that converge to a critical point on the generalized Stiefel manifold defined in expectation. The method has lower per-iteration cost, requires only matrix multiplications, and has the same convergence rates as its Riemannian counterparts involving the full matrix $B$. Experiments demonstrate its effectiveness in various machine learning applications involving generalized orthogonality constraints, including CCA, ICA, and GEVP.
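The constraint set itself is easy to make concrete: for a symmetric positive definite $B$ with Cholesky factor $L$ (so $B = L L^\top$), any $X = L^{-\top} Q$ with orthonormal $Q$ satisfies $X^\top B X = Q^\top Q = I_p$. A small numpy sketch with hypothetical dimensions (this only exhibits a feasible point, not the authors' stochastic method):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 3

# A random symmetric positive definite B (e.g. a sampled covariance matrix)
A = rng.standard_normal((n, n))
B = A @ A.T + n * np.eye(n)

# Feasible point on the generalized Stiefel manifold {X : X^T B X = I_p}:
# with B = L L^T (Cholesky) and Q having orthonormal columns,
# X = L^{-T} Q satisfies X^T B X = Q^T L^{-1} B L^{-T} Q = Q^T Q = I_p.
L = np.linalg.cholesky(B)
Q, _ = np.linalg.qr(rng.standard_normal((n, p)))
X = np.linalg.solve(L.T, Q)

assert np.allclose(X.T @ B @ X, np.eye(p))
```

When only random estimates of $B$ are available, forming $L$ exactly is precisely what one wants to avoid, which motivates retraction-free methods of the kind proposed above.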
We solve the PBW-like problem of normal ordering for enveloping algebras of direct sums.