This note establishes the decay of the Kolmogorov distance in a central limit theorem for L\'evy processes, without higher moment assumptions. The result can be viewed as a continuous-time extension of the classical random walk theorem of Friedman, Katz and Koopmans.
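For reference, the distance in question is the uniform (Kolmogorov) distance between distribution functions. In our own hedged notation (the note's normalization is not reproduced here): for a centered L\'evy process $(X_t)$ with finite variance $\sigma^2$ per unit time, the central limit theorem asserts \begin{align*} d_K\!\left(\tfrac{X_t}{\sigma\sqrt{t}},\, Z\right) := \sup_{x \in \mathbb{R}} \left| \mathbb{P}\!\left(\tfrac{X_t}{\sigma\sqrt{t}} \le x\right) - \Phi(x) \right| \longrightarrow 0 \quad \text{as } t \to \infty, \end{align*} where $Z \sim N(0,1)$ has distribution function $\Phi$; the note quantifies the rate of this decay using no moment assumptions beyond the second.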
We consider deviation inequalities for sums of independent $d \times d$ random matrices, as well as rank-one random tensors. Our focus is on the non-isotropic case and on bounds that do not depend explicitly on the dimension $d$, but rather on the effective rank. In a rather elementary and unified way, we show the following results: 1) A deviation bound for sums of independent positive semi-definite matrices of any rank. This result generalizes the dimension-free bound of Koltchinskii and Lounici [Bernoulli, 23(1): 110--133, 2017] on the sample covariance matrix in the sub-Gaussian case. 2) Dimension-free bounds for the operator norm of sums of rank-one random tensors formed by either sub-Gaussian or log-concave random vectors. This extends the result of Gu\'edon and Rudelson [Adv. in Math., 208: 798--823, 2007]. 3) A non-isotropic version of the result of Alesker [Geom. Asp. of Funct. Anal., 77: 1--4, 1995] on the concentration of the norm of sub-exponential random vectors. 4) A dimension-free lower tail bound for sums of positive semi-definite matrices with heavy-tailed entries, sharpening the bound of Oliveira [Prob. Th. and Rel. Fields, 166: 1175--1194, 2016]. Our approach is based on the duality formula between entropy and moment generating functions. In contrast to known proofs of dimension-free bounds, we avoid Talagrand's majorizing measure theorem as well as generic chaining bounds for empirical processes. Some of our tools were pioneered by O. Catoni and co-authors in the context of robust statistical estimation.
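One standard form of the entropy-MGF duality driving such proofs is the Gibbs variational (Donsker-Varadhan) formula, stated here in our own notation as background rather than as a quotation of the paper: for a probability measure $P$ and any measurable $f$ with $\mathbb{E}_P e^{f} < \infty$, \begin{align*} \log \mathbb{E}_P\!\left[e^{f}\right] = \sup_{Q \ll P} \left\{ \mathbb{E}_Q[f] - \mathrm{KL}(Q \,\|\, P) \right\}. \end{align*} Choosing $f$ proportional to quadratic forms $\langle S u, u \rangle$ of a matrix sum $S$ and optimizing over the free parameters is one plausible way such an identity replaces chaining arguments in deviation bounds.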
The Chebyshev or $\ell_{\infty}$ estimator is an unconventional alternative to ordinary least squares for solving linear regressions. It is defined as the minimizer of the $\ell_{\infty}$ objective function \begin{align*} \hat{\boldsymbol{\beta}} := \arg\min_{\boldsymbol{\beta}} \|\boldsymbol{Y} - \mathbf{X}\boldsymbol{\beta}\|_{\infty}. \end{align*} The asymptotic distribution of the Chebyshev estimator under a fixed number of covariates was recently studied (Knight, 2020), yet finite-sample guarantees and generalizations to high-dimensional settings remain open. In this paper, we develop non-asymptotic upper bounds on the estimation error $\|\hat{\boldsymbol{\beta}}-\boldsymbol{\beta}^*\|_2$ for the Chebyshev estimator $\hat{\boldsymbol{\beta}}$, in a regression setting with uniformly distributed noise $\varepsilon_i\sim U([-a,a])$, where $a$ is either known or unknown. Under relatively mild assumptions on the (random) design matrix $\mathbf{X}$, we bound the error rate by $\frac{C_p}{n}$ with high probability, for some constant $C_p$ depending on the dimension $p$ and the law of the design. Furthermore, we illustrate that there exist designs for which the Chebyshev estimator is (nearly) minimax optimal. In addition, we show that "Chebyshev's LASSO" has advantages over the regular LASSO in high-dimensional situations, provided that the noise is uniform. Specifically, we argue that it achieves a much faster estimation rate under certain assumptions on the growth of the sparsity level and the ambient dimension with respect to the sample size.
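Because the $\ell_{\infty}$ objective is piecewise linear, the Chebyshev estimator can be computed exactly as a linear program. The following is a minimal illustrative sketch, assuming the standard LP reformulation; the function name, toy data, and solver choice are ours, not the paper's:

import numpy as np
from scipy.optimize import linprog

def chebyshev_estimator(X, Y):
    # Solve min_beta ||Y - X beta||_inf via the LP:
    # minimize t subject to -t <= Y_i - x_i^T beta <= t for all i.
    n, p = X.shape
    c = np.zeros(p + 1)
    c[-1] = 1.0                                  # objective: minimize t
    A_ub = np.block([[X, -np.ones((n, 1))],      #  X beta - t <=  Y
                     [-X, -np.ones((n, 1))]])    # -X beta - t <= -Y
    b_ub = np.concatenate([Y, -Y])
    bounds = [(None, None)] * p + [(0, None)]    # beta free, t >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:p]

# Toy check with uniform noise U([-a, a]) (illustrative only):
rng = np.random.default_rng(0)
n, p, a = 200, 5, 1.0
X = rng.standard_normal((n, p))
beta_star = rng.standard_normal(p)
Y = X @ beta_star + rng.uniform(-a, a, size=n)
print(np.linalg.norm(chebyshev_estimator(X, Y) - beta_star))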
In this paper, we propose a general meta-learning approach to computing approximate Nash equilibria for finite $n$-player normal-form games. Unlike existing solutions that approximate or learn a Nash equilibrium from scratch for each game, our meta-solver directly constructs a mapping from a game utility matrix to a joint strategy profile. The mapping is parameterized and learned in a self-supervised fashion via a proposed Nash equilibrium approximation metric, without ground-truth data on any Nash equilibrium. As such, it can immediately predict a joint strategy profile that approximates a Nash equilibrium for any unseen new game drawn from the same game distribution. Moreover, the meta-solver can be further fine-tuned and adapted to a new game if iterative updates are allowed. We theoretically prove that our meta-solver is not affected by the non-smoothness of exact Nash equilibrium solutions, and derive a sample complexity bound demonstrating its generalization ability across normal-form games. Experimental results demonstrate its substantial approximation power against other strong baselines in both adaptive and non-adaptive cases.
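A natural candidate for the kind of approximation metric described (hedged: the paper's exact loss is not reproduced here) is exploitability, the total gain available to players from unilateral best responses, which vanishes exactly at a Nash equilibrium. A minimal sketch for finite normal-form games:

import numpy as np

def exploitability(payoffs, strategies):
    # NashConv-style gap: sum over players of (best-response value minus
    # current expected value). payoffs[i] is player i's utility tensor
    # (one axis per player); strategies[i] is player i's mixed strategy.
    gap = 0.0
    n = len(strategies)
    for i in range(n):
        u = payoffs[i]
        for j in range(n - 1, -1, -1):  # contract opponents' axes, highest
            if j != i:                  # first so lower indices stay valid
                u = np.tensordot(u, strategies[j], axes=(j, 0))
        gap += u.max() - u @ strategies[i]  # only player i's axis remains
    return gap

# Matching pennies: uniform play is the unique equilibrium, so the gap is 0.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
print(exploitability([A, -A], [np.array([0.5, 0.5])] * 2))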
One of the main reasons topological persistence is useful in data analysis is that it is backed by a stability (isometry) property: persistence diagrams of $1$-parameter persistence modules are stable in the sense that the bottleneck distance between two diagrams equals the interleaving distance between their generating modules. In the multi-parameter setting, however, this property breaks down in general. A simple special class of persistence modules, called rectangle decomposable modules, is known to admit a weaker stability property. Using this fact, we derive a stability-like property for $2$-parameter persistence modules. To this end, we first consider interval decomposable modules and their optimal approximations by rectangle decomposable modules with respect to the bottleneck distance. We provide a polynomial-time algorithm to compute this optimal approximation exactly which, together with the polynomial-time computable bottleneck distance among interval decomposable modules, provides a lower bound on the interleaving distance. Next, we leverage this result to derive a polynomial-time computable distance for general multi-parameter persistence modules which enjoys a similar stability-like property. This distance can be viewed as a generalization of the matching distance defined in the literature.
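For context, the $1$-parameter stability (isometry) property referred to above reads, for pointwise finite-dimensional persistence modules $M$ and $N$, \begin{align*} d_B\big(\mathrm{Dgm}(M), \mathrm{Dgm}(N)\big) = d_I(M, N), \end{align*} where $d_B$ denotes the bottleneck distance between persistence diagrams and $d_I$ the interleaving distance between the modules. It is this equality that fails in the multi-parameter setting, which is why computable lower bounds on $d_I$, such as the one developed here, serve as the natural substitute.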
Suppose that particles are randomly distributed in $\mathbb{R}^d$ and are subject to identical stochastic motion independently of each other. The Smoluchowski process describes the fluctuations of the number of particles in an observation region over time. This paper studies properties of Smoluchowski processes and considers related statistical problems. In the first part of the paper we revisit probabilistic properties of the Smoluchowski process in a unified and principled way: explicit formulas for generating functionals and moments are derived, conditions for stationarity and Gaussian approximation are discussed, and relations to other stochastic models are highlighted. The second part deals with statistics of Smoluchowski processes. We consider two different models of the particle displacement process: undeviated uniform motion (a particle moves with random constant velocity along a straight line) and Brownian motion. In the setting of undeviated uniform motion we study the problems of estimating the mean speed and the speed distribution, while for the Brownian displacement model we consider the problem of estimating the diffusion coefficient. In all these settings we develop estimators with provable accuracy guarantees.
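In symbols (our hedged notation): if particles start at the points $\{x_j\}$ of a random (e.g., Poisson) configuration in $\mathbb{R}^d$ and move along i.i.d. trajectories $\{\xi_j(t)\}$, then for an observation region $A \subset \mathbb{R}^d$ the Smoluchowski process is the count \begin{align*} N_t(A) := \sum_{j} \mathbf{1}\{x_j + \xi_j(t) \in A\}, \qquad t \ge 0, \end{align*} whose fluctuations over time are the object of study.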
In this paper, we study the weight spectrum of linear codes with \emph{super-linear} field size and use the probabilistic method to show that for nearly all such codes, the corresponding weight spectrum is very close to that of a maximum distance separable (MDS) code.
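The MDS benchmark is explicit: for an $[n,k]$ MDS code over $\mathbb{F}_q$, with minimum distance $d = n-k+1$, the weight distribution is given by the classical formula (standard background, not a statement taken from the paper) \begin{align*} A_w = \binom{n}{w} \sum_{j=0}^{w-d} (-1)^j \binom{w}{j}\left(q^{\,w-d+1-j} - 1\right), \qquad d \le w \le n, \end{align*} so closeness of the weight spectrum to that of an MDS code can be read against these values.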
In this work, we study a class of stochastic processes that generalizes the Ornstein-Uhlenbeck processes, hereafter called \emph{Generalized Ornstein-Uhlenbeck Type Processes} and denoted GOU type processes. We consider them driven by various noise processes: Brownian motion, symmetric $\alpha$-stable L\'evy processes, general L\'evy processes, and even Poisson processes. We give necessary and sufficient conditions on the memory kernel function for the time-stationarity and Markov properties of these processes. When the GOU type process is driven by L\'evy noise, we prove that it is infinitely divisible and exhibit its generating triplet. Several examples derived from the GOU type process are presented, illustrating some of their basic properties as well as some time series realizations. These examples also present their theoretical and empirical autocorrelation or normalized codifference functions, depending on whether the process has a finite or infinite second moment. We also present maximum likelihood and Bayesian estimation procedures for the so-called \emph{Cosine process}, a particular process in the class of GOU type processes. For the Bayesian estimation method, we use the power series representation of Fox's H-function to better approximate the density function of an $\alpha$-stable random variable. We consider four goodness-of-fit tests to help decide which \emph{Cosine process} (driven by Gaussian or $\alpha$-stable noise) best fits real data sets. Two applications of the GOU type model are presented: one based on Apple stock market price data and the other on cardiovascular mortality data for Los Angeles County.
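For orientation (our notation; the paper's memory-kernel formulation is not reproduced verbatim): the classical L\'evy-driven Ornstein-Uhlenbeck process with rate $\lambda > 0$ satisfies \begin{align*} X_t = e^{-\lambda t} X_0 + \int_0^t e^{-\lambda (t-s)}\, dL_s, \end{align*} and a GOU type process may be thought of as replacing the exponential by a general memory kernel $\rho$, say $X_t = \int_{-\infty}^{t} \rho(t-s)\, dL_s$, with stationarity and the Markov property then governed by conditions on $\rho$.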
We lay the foundations of a new theory for algorithms and computational complexity by parameterizing the instances of a computational problem as a moduli scheme. Considering the geometry of the scheme associated to 3-SAT, we separate P and NP.
Approximate Message Passing (AMP) algorithms have seen widespread use across a variety of applications. However, the precise forms of their Onsager corrections and state evolutions depend on properties of the underlying random matrix ensemble, limiting the extent to which AMP algorithms derived for white noise may be applicable to data matrices that arise in practice. In this work, we study more general AMP algorithms for random matrices $W$ that satisfy orthogonal rotational invariance in law, where $W$ may have a spectral distribution different from the semicircle and Mar\v{c}enko-Pastur laws characteristic of white noise. The Onsager corrections and state evolutions in these algorithms are defined by the free cumulants or rectangular free cumulants of the spectral distribution of $W$. Their forms were derived previously by Opper, \c{C}akmak, and Winther using non-rigorous dynamical functional theory techniques, and we provide rigorous proofs. Our motivating application is a Bayes-AMP algorithm for Principal Components Analysis, when there is prior structure for the principal components (PCs) and possibly non-white noise. For sufficiently large signal strengths and any non-Gaussian prior distribution for the PCs, we show that this algorithm provably achieves higher estimation accuracy than the sample PCs.
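As a point of reference for the structure being generalized, here is a minimal sketch of symmetric AMP for white (Wigner) noise, where the Onsager correction is a single coefficient; the denoiser and toy data are ours, not the paper's Bayes-optimal choices:

import numpy as np

def amp_symmetric(W, z0, eta, eta_prime, iters=20):
    # Iterate z^{t+1} = W eta(z^t) - b_t eta(z^{t-1}) with Onsager
    # coefficient b_t = mean(eta'(z^t)). For the rotationally invariant
    # ensembles of the paper, this single coefficient is replaced by a
    # weighted sum over all past iterates, with weights given by the
    # free cumulants of the spectral distribution of W.
    z = np.asarray(z0, dtype=float)
    m_prev = np.zeros(len(z))
    for _ in range(iters):
        m = eta(z)
        b = eta_prime(z).mean()        # Onsager correction (semicircle case)
        z, m_prev = W @ m - b * m_prev, m
    return eta(z)

# Toy usage: rank-one spike with +/-1 entries in Wigner noise.
rng = np.random.default_rng(1)
n, lam = 2000, 2.0
x = rng.choice([-1.0, 1.0], size=n)
G = rng.standard_normal((n, n))
W = lam * np.outer(x, x) / n + (G + G.T) / np.sqrt(2 * n)
z0 = x + 2.0 * rng.standard_normal(n)            # weakly informative start
x_hat = amp_symmetric(W, z0, np.tanh, lambda z: 1.0 - np.tanh(z) ** 2)
print(abs(x @ x_hat) / (np.linalg.norm(x) * np.linalg.norm(x_hat)))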
In order to avoid the curse of dimensionality frequently encountered in Big Data analysis, recent years have seen vast development in the field of linear and nonlinear dimension reduction techniques. These techniques (sometimes referred to as manifold learning) assume that the scattered input data lies on a lower-dimensional manifold, so the high-dimensionality problem can be overcome by learning the lower-dimensional behavior. However, in real-life applications data is often very noisy. In this work, we propose a method to approximate $\mathcal{M}$, a $d$-dimensional $C^{m+1}$ smooth submanifold of $\mathbb{R}^n$ ($d \ll n$), based upon noisy scattered data points (i.e., a data cloud). We assume that the data points are located "near" the lower-dimensional manifold and suggest a nonlinear moving least-squares projection onto an approximating $d$-dimensional manifold. Under some mild assumptions, the resulting approximant is shown to be infinitely smooth and of high approximation order (namely, $O(h^{m+1})$, where $h$ is the fill distance and $m$ is the degree of the local polynomial approximation). The method presented here assumes no analytic knowledge of the approximated manifold, and the approximation algorithm is linear in the large dimension $n$. Furthermore, the approximating manifold can serve as a framework for performing operations directly on the high-dimensional data in a computationally efficient manner. In this way, the preparatory step of dimension reduction, which introduces distortions into the data, can be avoided altogether.
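For the reader's convenience, the fill distance $h$ appearing in the rate is the mesh norm of the sample relative to the manifold; in standard notation, for sample sites $\{p_i\}$, \begin{align*} h := \sup_{p \in \mathcal{M}} \min_i \|p - p_i\|_2, \end{align*} i.e., the radius of the largest ball centered on $\mathcal{M}$ containing no sample site, so that $O(h^{m+1})$ is the usual moving least-squares approximation order for degree-$m$ local polynomials.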