The finite difference method is a popular numerical method that has been widely used to solve fractional diffusion equations. In standard spatial error analyses, the assumption $u\in C^{4}(\bar{\Omega})$, where $u$ is the exact solution, is needed to preserve $\mathcal{O}(h^{2})$ convergence of the central finite difference scheme for the fractional sub-diffusion equation with the Laplace operator, where $h$ is the mesh size; this assumption is, however, rather strong. In this paper, a novel analysis technique is proposed to show that the spatial convergence rate reaches $\mathcal{O}(h^{\min(\sigma+\frac{1}{2}-\epsilon,2)})$ in both the $l^{2}$-norm and the $l^{\infty}$-norm on a one-dimensional domain when the initial value and source term both belong to $\hat{H}^{\sigma}(\Omega)$, without any regularity assumption on the exact solution, where $\sigma\geq 0$ and $\epsilon>0$ is arbitrarily small. After slight modifications of the scheme, acting on the initial value and source term, the spatial convergence rate improves to $\mathcal{O}(h^{2})$ in the $l^{2}$-norm and $\mathcal{O}(h^{\min(\sigma+\frac{3}{2}-\epsilon,2)})$ in the $l^{\infty}$-norm. It is worth mentioning that our spatial error analysis extends to high-dimensional cube domains via the properties of tensor products. Moreover, two kinds of averaged schemes are provided to approximate the Riemann--Liouville fractional derivative, and $\mathcal{O}(\tau^{2})$ convergence is obtained for all $\alpha\in(0,1)$. Finally, numerical experiments verify the effectiveness of the developed theory.
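As a minimal sketch of the spatial ingredient only (our own toy check, not the paper's scheme or analysis): the standard second-order central difference matrix for $-u''$ on $(0,1)$ with homogeneous Dirichlet boundary conditions, verified against a smooth test solution; all sizes are illustrative.

```python
# Toy check of second-order convergence of the central difference Laplacian;
# the fractional-in-time part and the averaged Riemann--Liouville
# discretizations of the paper are omitted.
import numpy as np

def laplacian_1d(m):
    """Tridiagonal matrix approximating -d^2/dx^2 on m interior nodes."""
    h = 1.0 / (m + 1)
    A = (np.diag(2.0 * np.ones(m)) - np.diag(np.ones(m - 1), 1)
         - np.diag(np.ones(m - 1), -1)) / h**2
    return A, h

for m in (32, 64, 128):
    A, h = laplacian_1d(m)
    x = np.linspace(h, 1 - h, m)
    u = np.sin(np.pi * x)                    # smooth exact solution
    f = np.pi**2 * np.sin(np.pi * x)         # -u'' = f
    err = np.max(np.abs(np.linalg.solve(A, f) - u))
    print(m, err)                            # error decreases like h^2
```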
We study the problem of online generalized linear regression in the stochastic setting, where the label is generated from a generalized linear model with possibly unbounded additive noise. We provide a sharp analysis of the classical follow-the-regularized-leader (FTRL) algorithm to cope with the label noise. More specifically, for $\sigma$-sub-Gaussian label noise, our analysis provides a regret upper bound of $O(\sigma^2 d \log T) + o(\log T)$, where $d$ is the dimension of the input vector and $T$ is the total number of rounds. We also prove an $\Omega(\sigma^2d\log(T/d))$ lower bound for stochastic online linear regression, which indicates that our upper bound is nearly optimal. In addition, we extend our analysis to a more refined Bernstein noise condition. As an application, we study generalized linear bandits with heteroscedastic noise and propose an algorithm based on FTRL that achieves the first variance-aware regret bound.
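A minimal sketch of FTRL in this setting (with squared loss and a quadratic regularizer it coincides with online ridge regression); the dimension, horizon, noise level, and regularization weight `lam` below are our own illustrative choices.

```python
# FTRL for stochastic online linear regression: at each round, play the
# minimizer of the regularized cumulative squared loss over past rounds.
import numpy as np

rng = np.random.default_rng(0)
d, T, sigma, lam = 5, 2000, 0.5, 1.0
theta_star = rng.standard_normal(d) / np.sqrt(d)

A = lam * np.eye(d)          # running Gram matrix plus regularizer
b = np.zeros(d)              # running X^T y
regret = 0.0
for t in range(T):
    x = rng.standard_normal(d) / np.sqrt(d)
    theta = np.linalg.solve(A, b)            # FTRL iterate (uses past data only)
    y = x @ theta_star + sigma * rng.standard_normal()  # sub-Gaussian label noise
    regret += (x @ theta - y) ** 2 - (x @ theta_star - y) ** 2
    A += np.outer(x, x)
    b += y * x
print(regret)   # grows only logarithmically in T in theory
```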
We study the convergence of a family of numerical integration methods in which the numerical integral is formulated as a finite matrix approximation to a multiplication operator. For bounded functions, convergence has already been established using the theory of strong operator convergence. In this article, we consider unbounded functions and domains, which pose several difficulties compared to the bounded case. A natural tool for this study is the theory of strong resolvent convergence, which has previously been applied mostly to the convergence of approximations of differential operators. The existing theory includes convergence theorems that apply as such to a limited class of functions; we extend them to wider classes of functions characterized in terms of their growth or discontinuities. The extended results apply to all self-adjoint operators, not just multiplication operators. We also show how Jensen's operator inequality can be used to analyse the convergence of an improper numerical integral of a function bounded by an operator convex function.
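For intuition, a small sketch of the prototype behind this viewpoint: for the standard Gaussian measure, truncating the Jacobi (multiplication-by-$x$) matrix of the Hermite polynomials and evaluating $(e_0, f(J_n)e_0)$ reproduces $n$-point Gauss--Hermite quadrature; the unbounded integrand $f$ is an illustrative choice.

```python
# Integration as a matrix function: eigenvalues of the truncated Jacobi
# matrix are the quadrature nodes, and the squared first components of its
# eigenvectors are the weights, so weights @ f(nodes) = (e_0, f(J_n) e_0).
import numpy as np

n = 30
off = np.sqrt(np.arange(1, n))                  # off-diagonal of the Jacobi matrix
nodes, V = np.linalg.eigh(np.diag(off, 1) + np.diag(off, -1))
weights = V[0, :] ** 2                          # squared first eigenvector components

f = lambda x: x ** 4                            # unbounded, polynomially growing
print(weights @ f(nodes))                       # ~ E[X^4] = 3 for X ~ N(0,1)
```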
Suppose we are given an $n$-dimensional order-3 symmetric tensor $T \in (\mathbb{R}^n)^{\otimes 3}$ that is the sum of $r$ random rank-1 terms. The problem of recovering the rank-1 components is possible in principle when $r \lesssim n^2$ but polynomial-time algorithms are only known in the regime $r \ll n^{3/2}$. Similar "statistical-computational gaps" occur in many high-dimensional inference tasks, and in recent years there has been a flurry of work on explaining the apparent computational hardness in these problems by proving lower bounds against restricted (yet powerful) models of computation such as statistical queries (SQ), sum-of-squares (SoS), and low-degree polynomials (LDP). However, no such prior work exists for tensor decomposition, largely because its hardness does not appear to be explained by a "planted versus null" testing problem. We consider a model for random order-3 tensor decomposition where one component is slightly larger in norm than the rest (to break symmetry), and the components are drawn uniformly from the hypercube. We resolve the computational complexity in the LDP model: $O(\log n)$-degree polynomial functions of the tensor entries can accurately estimate the largest component when $r \ll n^{3/2}$ but fail to do so when $r \gg n^{3/2}$. This provides rigorous evidence suggesting that the best known algorithms for tensor decomposition cannot be improved, at least by known approaches. A natural extension of the result holds for tensors of any fixed order $k \ge 3$, in which case the LDP threshold is $r \sim n^{k/2}$.
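A toy illustration of the planted model only (not the paper's low-degree estimator): $r$ hypercube components with the first slightly boosted in norm, attacked by naive tensor power iteration with random restarts; all sizes and the boost factor are arbitrary choices at a scale far below the interesting regime.

```python
# Generate T = sum of rank-1 hypercube terms with one boosted component,
# then try to recover that component by tensor power iteration.
import numpy as np

rng = np.random.default_rng(1)
n, r, boost = 50, 5, 1.2
X = rng.choice([-1.0, 1.0], size=(r, n)) / np.sqrt(n)   # unit-norm hypercube directions
X[0] *= boost                                           # slightly larger first component
T = np.einsum('ka,kb,kc->abc', X, X, X)                 # sum of rank-1 terms

best, best_val = None, -np.inf
for _ in range(20):                                     # random restarts
    v = rng.standard_normal(n); v /= np.linalg.norm(v)
    for _ in range(50):                                 # tensor power iteration
        v = np.einsum('abc,b,c->a', T, v, v)
        v /= np.linalg.norm(v)
    val = np.einsum('abc,a,b,c->', T, v, v, v)
    if val > best_val:
        best, best_val = v, val
print(abs(best @ X[0]) / np.linalg.norm(X[0]))          # correlation with planted x_1
```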
We derive a formula for optimal hard thresholding of the singular value decomposition in the presence of correlated additive noise; although it nominally involves unobservables, we show how to apply it even when the noise covariance structure is not a priori known or is not independently estimable. The proposed method, which we call ScreeNOT, is a mathematically solid alternative to Cattell's ever-popular but vague Scree Plot heuristic from 1966. ScreeNOT has a surprising oracle property: it typically achieves exactly, in large finite samples, the lowest possible MSE for matrix recovery on each given problem instance; that is, the specific threshold it selects gives exactly the smallest achievable MSE loss among all possible threshold choices for that noisy dataset and that unknown underlying true low-rank model. The method is computationally efficient and robust against perturbations of the underlying covariance structure. Our results depend on the assumption that the singular values of the noise have a limiting empirical distribution of compact support; this model, which is standard in random matrix theory, is satisfied by many models exhibiting cross-row, cross-column, or inter-element correlation structure. Simulations demonstrate the effectiveness of the method even at moderate matrix sizes. The paper is supplemented by ready-to-use software packages implementing the proposed algorithm: package ScreeNOT in Python (via PyPI) and R (via CRAN).
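A minimal sketch of the hard-thresholding operation that ScreeNOT tunes (not ScreeNOT's threshold rule itself; for that, see the PyPI/CRAN packages): reconstruct a noisy matrix keeping only singular values above a level $t$, and sweep $t$ to see the oracle minimizer the method targets. The data model and threshold grid are illustrative placeholders.

```python
# Hard singular-value thresholding: keep only singular values exceeding t.
import numpy as np

def hard_threshold_svd(Y, t):
    """Reconstruct Y from singular triples with singular value above t."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    keep = s > t
    return (U[:, keep] * s[keep]) @ Vt[keep, :]

rng = np.random.default_rng(0)
n, p, rank = 300, 200, 3
low_rank = rng.standard_normal((n, rank)) @ rng.standard_normal((rank, p)) * 3
Y = low_rank + rng.standard_normal((n, p))       # white noise for simplicity

# Sweep thresholds; ScreeNOT's point is to pick t near the oracle minimizer.
for t in (10, 25, 35, 50):
    err = np.linalg.norm(hard_threshold_svd(Y, t) - low_rank)
    print(t, err)
```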
In this paper we carefully combine Fredman's trick [SICOMP'76] and Matou\v{s}ek's approach for dominance product [IPL'91] to obtain powerful results in fine-grained complexity:
- Under the hypothesis that APSP for undirected graphs with edge weights in $\{1, 2, \ldots, n\}$ requires $n^{3-o(1)}$ time (when $\omega=2$), we show a variety of conditional lower bounds, including an $n^{7/3-o(1)}$ lower bound for unweighted directed APSP and an $n^{2.2-o(1)}$ lower bound for computing the Minimum Witness Product between two $n \times n$ Boolean matrices, even if $\omega=2$, improving upon their trivial $n^2$ lower bounds. Our techniques can also be used to reduce the unweighted directed APSP problem to other problems. In particular, we show that (when $\omega = 2$), if unweighted directed APSP requires $n^{2.5-o(1)}$ time, then Minimum Witness Product requires $n^{7/3-o(1)}$ time.
- We show that, surprisingly, many central problems in fine-grained complexity are equivalent to their natural counting versions. In particular, we show that Min-Plus Product and Exact Triangle are subcubically equivalent to their counting versions, and 3SUM is subquadratically equivalent to its counting version.
- We obtain new algorithms using new variants of the Balog-Szemer\'edi-Gowers theorem from additive combinatorics. For example, we get an $O(n^{3.83})$ time deterministic algorithm for exactly counting the number of shortest paths in an arbitrary weighted graph, improving the textbook $\widetilde{O}(n^{4})$ time algorithm. We also get faster algorithms for 3SUM in preprocessed universes, and deterministic algorithms for 3SUM on monotone sets in $\{1, 2, \ldots, n\}^d$.
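As a small concrete reference point for the counting versions mentioned above, here is the counting variant of 3SUM solved in the straightforward $O(n^2)$ time via a hash map; the equivalence results above concern the much subtler subquadratic regime.

```python
# Counting 3SUM: count triples (a, b, c) in A x B x C with a + b + c = 0.
from collections import Counter

def count_3sum(A, B, C):
    countC = Counter(C)
    return sum(countC[-(a + b)] for a in A for b in B)

print(count_3sum([1, -2, 5], [3, 0, -1], [-1, 2, -4]))   # prints 5
```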
Numerical differentiation of a function contaminated with noise over the unit interval $[0,1] \subset \mathbb{R}$, by inverting the simple integration operator $J:L^2([0,1]) \to L^2([0,1])$ defined as $[Jx](s):=\int_0^s x(t)\, dt$, has been discussed extensively in the literature. The complete singular system of the compact operator $J$ is explicitly known, with singular values $\sigma_n(J)$ asymptotically proportional to $1/n$, which indicates a degree {\sl one} of ill-posedness for this inverse problem. We recall the concept of the degree of ill-posedness for linear operator equations with compact forward operators in Hilbert spaces. In contrast to the one-dimensional case with operator $J$, little material is available on the analysis of the $d$-dimensional case, in which the compact integral operator $J_d:L^2([0,1]^d) \to L^2([0,1]^d)$ defined as $[J_d\,x](s_1,\ldots,s_d):=\int_0^{s_1}\ldots\int_0^{s_d} x(t_1,\ldots,t_d)\, dt_d\ldots dt_1$ over the unit $d$-cube is to be inverted. This inverse problem of mixed differentiation $x(s_1,\ldots,s_d)=\frac{\partial^d}{\partial s_1 \ldots \partial s_d} y(s_1,\ldots ,s_d)$ is of practical interest, for example in statistics, when copula densities have to be recovered from empirical copulas over $[0,1]^d \subset \mathbb{R}^d$. In this note, we prove that the non-increasingly ordered singular values $\sigma_n(J_d)$ of the operator $J_d$ obey an asymptotics of the form $\frac{(\log n)^{d-1}}{n}$, which shows that the degree of ill-posedness stays at one, even though an additional logarithmic factor occurs. Further discussion addresses the special case $d=2$, characterizing the range $\mathcal{R}(J_2)$ of the operator $J_2$.
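A quick numerical check of the $d=1$ baseline (our own illustration): the midpoint discretization of $J$ is a scaled lower-triangular matrix of ones whose leading singular values match the known law $\sigma_n(J) = \frac{2}{(2n-1)\pi}$.

```python
# Discretize J on a uniform grid and compare singular values to 2/((2n-1)pi).
import numpy as np

m = 500
Jh = np.tril(np.ones((m, m))) / m                # midpoint discretization of J
s = np.linalg.svd(Jh, compute_uv=False)
n = np.arange(1, 6)
print(s[:5] * (np.pi * (2 * n - 1) / 2))         # all ~ 1, confirming the 1/n law
```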
In this article we apply reduced-order techniques to the approximation of parametric eigenvalue problems, and investigate the effect of the choice of sampling points. We use the standard proper orthogonal decomposition (POD) technique to obtain the basis of the reduced space, and a Galerkin orthogonal projection to obtain the reduced problem. We present numerical results and observe that sparse sampling is a good choice when the dimension of the parameter space is high.
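A minimal sketch of the POD plus Galerkin projection pipeline on a toy affine parametrization $A(\mu) = A_0 + \mu A_1$; the operators, sample set, and reduced dimension below are our own illustrative choices.

```python
# Offline: eigenvector snapshots over sampled parameters -> POD basis by SVD.
# Online: project A(mu) onto the basis and solve the small eigenproblem.
import numpy as np

rng = np.random.default_rng(0)
N = 200
A0 = np.diag(np.arange(1.0, N + 1))              # toy symmetric operators
B = rng.standard_normal((N, N)); A1 = (B + B.T) / (2 * np.sqrt(N))

def smallest_eigpair(mu):
    w, V = np.linalg.eigh(A0 + mu * A1)
    return w[0], V[:, 0]

snapshots = np.column_stack([smallest_eigpair(mu)[1] for mu in np.linspace(0, 1, 10)])
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
Q = U[:, :5]                                     # POD basis of dimension 5

for mu in (0.25, 0.75):                          # online stage at new parameters
    w_red = np.linalg.eigvalsh(Q.T @ (A0 + mu * A1) @ Q)[0]
    print(mu, w_red, smallest_eigpair(mu)[0])    # reduced vs. full eigenvalue
```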
In this work, we present the construction of two distinct finite element approaches to solve the Porous Medium Equation (PME). In the first approach, we transform the PME to a log-density variable formulation and construct a continuous Galerkin method. In the second approach, we introduce additional potential and velocity variables to rewrite the PME into a system of equations, for which we construct a mixed finite element method. Both approaches are first-order accurate, mass conserving, and proved to be unconditionally energy stable for their respective energies. The mixed approach is shown to preserve positivity under a CFL condition, while a much stronger property of unconditional bound preservation is proved for the log-density approach. A novel feature of our schemes is that they can handle compactly supported initial data without the need for any perturbation techniques. Furthermore, the log-density method can handle unstructured grids in any number of dimensions, while the mixed method can handle unstructured grids in two dimensions. We present results from several numerical experiments to demonstrate these properties.
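For intuition only, a simple explicit finite-difference discretization of the 1D PME $u_t = (u^m)_{xx}$ with compactly supported initial data; this is a swapped-in toy, not the paper's Galerkin or mixed finite element schemes, and the mesh, time step, and exponent are illustrative.

```python
# Explicit finite differences for the degenerate diffusion u_t = (u^m)_xx;
# the compactly supported bump stays nonnegative and spreads with finite speed.
import numpy as np

m_exp, h, dt = 2, 0.01, 1e-5
x = np.arange(-1, 1 + h, h)
u = np.maximum(0.0, 0.5 - 4 * x ** 2)            # compactly supported bump
for _ in range(2000):
    w = u ** m_exp
    u[1:-1] += dt / h ** 2 * (w[2:] - 2 * w[1:-1] + w[:-2])
print(float(u.sum() * h))                        # mass is conserved
print(float((u > 0).mean()))                     # support spreads slowly
```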
In this paper we develop inference for high-dimensional linear models with serially correlated errors. We examine the Lasso under the assumption of strong mixing in the covariates and error process, allowing for fatter tails in their distributions. Since the Lasso estimator performs poorly under such circumstances, we estimate the parameters of interest via a GLS Lasso and extend the asymptotic properties of the Lasso to these more general conditions. Our theoretical results indicate that the non-asymptotic bounds for stationary dependent processes are sharper, while the rate of the Lasso under general conditions is slower as $T,p\to \infty$. Further, we employ the debiased Lasso to perform inference uniformly on the parameters of interest. Monte Carlo results support the proposed estimator, which exhibits significant efficiency gains over traditional methods.
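A minimal sketch of the GLS Lasso idea under an assumed AR(1) error model (all tuning constants are illustrative): fit a Lasso, estimate the autocorrelation from the residuals, quasi-difference the data, and refit.

```python
# GLS Lasso sketch: whiten serially correlated errors via quasi-differencing.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
T, p, rho = 400, 50, 0.7
beta = np.zeros(p); beta[:3] = 1.0
X = rng.standard_normal((T, p))
e = np.zeros(T)
for t in range(1, T):                            # AR(1) serially correlated errors
    e[t] = rho * e[t - 1] + rng.standard_normal()
y = X @ beta + e

lasso = Lasso(alpha=0.1).fit(X, y)
r = y - lasso.predict(X)
rho_hat = (r[1:] @ r[:-1]) / (r[:-1] @ r[:-1])   # AR(1) estimate from residuals

X_t = X[1:] - rho_hat * X[:-1]                   # GLS (quasi-differencing) transform
y_t = y[1:] - rho_hat * y[:-1]
gls_lasso = Lasso(alpha=0.1).fit(X_t, y_t)
print(np.linalg.norm(lasso.coef_ - beta), np.linalg.norm(gls_lasso.coef_ - beta))
```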
In this paper, we consider a new approach to the semi-discretization in time and the spatial discretization of a class of semi-linear stochastic partial differential equations (SPDEs) with multiplicative noise. The drift term of the SPDEs is only assumed to satisfy a one-sided Lipschitz condition, and the diffusion term is assumed to be globally Lipschitz continuous. Our new strategy for the time discretization is based on the Milstein method for stochastic differential equations. We use the energy method for its error analysis and show a strong convergence order of nearly $1$ for the approximate solution. The proof is based on new H\"older continuity estimates of the SPDE solution and of the nonlinear term. For a general polynomial-type drift term, even the stability of the numerical solutions is difficult to derive. We propose an interpolation-based finite element method for the spatial discretization to overcome this difficulty. We then obtain $H^1$ stability, higher-moment $H^1$ stability, $L^2$ stability, and higher-moment $L^2$ stability results using numerical and stochastic techniques. Nearly optimal convergence orders in time and space are then obtained by combining all of the previous results. Numerical experiments are presented to implement the proposed numerical scheme and to validate the theoretical results.
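As a reminder of the time-stepping building block (the scalar SDE Milstein step, not the full SPDE scheme), a minimal sketch with an illustrative geometric Brownian motion: $X_{n+1} = X_n + a(X_n)\Delta t + b(X_n)\Delta W + \tfrac{1}{2} b(X_n) b'(X_n)(\Delta W^2 - \Delta t)$.

```python
# Milstein scheme for a scalar SDE dX = a(X) dt + b(X) dW.
import numpy as np

def milstein(a, b, db, x0, T, N, rng):
    dt, x = T / N, x0
    for _ in range(N):
        dW = rng.standard_normal() * np.sqrt(dt)
        x = x + a(x) * dt + b(x) * dW + 0.5 * b(x) * db(x) * (dW ** 2 - dt)
    return x

rng = np.random.default_rng(0)
# geometric Brownian motion: a(x) = 0.1 x, b(x) = 0.2 x, b'(x) = 0.2
print(milstein(lambda x: 0.1 * x, lambda x: 0.2 * x, lambda x: 0.2, 1.0, 1.0, 1000, rng))
```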