Random subspaces $X$ of $\mathbb{R}^n$ of dimension proportional to $n$ are, with high probability, well-spread with respect to the $\ell_p$-norm (for $p \in [1,2]$). Namely, every nonzero $x \in X$ is "robustly non-sparse" in the following sense: $x$ is $\varepsilon \|x\|_p$-far in $\ell_p$-distance from all $\delta n$-sparse vectors, for positive constants $\varepsilon, \delta$ bounded away from $0$. This "$\ell_p$-spread" property is the natural counterpart, for subspaces over the reals, of the minimum distance of linear codes over finite fields, and, for $p = 2$, corresponds to $X$ being a Euclidean section of the $\ell_1$ unit ball. Explicit $\ell_p$-spread subspaces of dimension $\Omega(n)$, however, are not known except for $p=1$. The construction for $p=1$, as well as the best known constructions for $p \in (1,2]$ (which achieve weaker spread properties), are analogs of low density parity check (LDPC) codes over the reals, i.e., they are kernels of sparse matrices. We study the spread properties of the kernels of sparse random matrices. Rather surprisingly, we prove that with high probability such subspaces contain vectors $x$ that are $o(1)\cdot \|x\|_2$-close to $o(n)$-sparse with respect to the $\ell_2$-norm, and in particular are not $\ell_2$-spread. On the other hand, for $p < 2$ we prove that such subspaces are $\ell_p$-spread with high probability. Moreover, we show that a random sparse matrix has the stronger restricted isometry property (RIP) with respect to the $\ell_p$ norm, and this follows solely from the unique expansion of a random biregular graph, yielding a somewhat unexpected generalization of a similar result for the $\ell_1$ norm [BGI+08]. Instantiating this with explicit expanders, we obtain the first explicit constructions of $\ell_p$-spread subspaces and $\ell_p$-RIP matrices for $1 \leq p < p_0$, where $1 < p_0 < 2$ is an absolute constant.
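As a small numerical illustration of the quantity involved (a toy sketch with arbitrary dimensions, not the paper's construction): for $p \geq 1$, the closest $s$-sparse vector to $x$ in $\ell_p$ keeps the $s$ largest-magnitude entries of $x$, so the distance to the sparse set is just the norm of the remaining tail. The snippet estimates this relative $\ell_2$ distance for a vector drawn from a random dense subspace.

```python
import numpy as np

def dist_to_sparse(x, s, p=2):
    """l_p distance from x to the set of s-sparse vectors.

    For p >= 1 the closest s-sparse vector keeps the s
    largest-magnitude entries of x and zeros out the rest,
    so the distance is the l_p norm of the discarded tail.
    """
    order = np.argsort(np.abs(x))           # ascending by magnitude
    tail = x[order[:-s]] if s > 0 else x    # everything except the s largest
    return np.linalg.norm(tail, ord=p)

rng = np.random.default_rng(0)
n, d = 200, 50                        # ambient dimension, subspace dimension
B = rng.standard_normal((n, d))       # basis of a random d-dimensional subspace
x = B @ rng.standard_normal(d)        # a random vector in that subspace
s = n // 10
ratio = dist_to_sparse(x, s) / np.linalg.norm(x)
print(f"relative l2 distance to {s}-sparse vectors: {ratio:.3f}")
```

A spread subspace is one where this ratio stays bounded away from 0 for *every* nonzero vector in the subspace, not just a random one.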
Let $k,\ell\geq 2$ be two multiplicatively independent integers. Cobham's famous theorem states that a set $X\subseteq \mathbb{N}$ is both $k$-recognizable and $\ell$-recognizable if and only if it is definable in Presburger arithmetic. Here we show the following strengthening: let $X\subseteq \mathbb{N}^m$ be $k$-recognizable and let $Y\subseteq \mathbb{N}^m$ be $\ell$-recognizable, such that neither $X$ nor $Y$ is definable in Presburger arithmetic. Then the first-order logical theory of $(\mathbb{N},+,X,Y)$ is undecidable. This is in contrast to a well-known theorem of B\"uchi that the first-order logical theory of $(\mathbb{N},+,X)$ is decidable.
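To illustrate the notion of $k$-recognizability used here: a set is $k$-recognizable when the base-$k$ expansions of its elements form a regular language. The toy sketch below checks membership in the powers of 2, a standard example of a set that is 2-recognizable (binary expansions match the regular language $10^*$) yet not definable in Presburger arithmetic; the regex stands in for a two-state finite automaton.

```python
import re

def to_base(n, k):
    """Base-k representation of n, most significant digit first."""
    if n == 0:
        return "0"
    digits = []
    while n:
        digits.append(str(n % k))
        n //= k
    return "".join(reversed(digits))

# Powers of 2 are 2-recognizable: their binary expansions form the
# regular language 10*, here checked with a regex instead of a DFA.
POWERS_OF_TWO = re.compile(r"10*")

def in_X(n):
    return bool(POWERS_OF_TWO.fullmatch(to_base(n, 2)))

print([n for n in range(1, 40) if in_X(n)])   # -> [1, 2, 4, 8, 16, 32]
```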
Spectral projectors of Hermitian matrices play a key role in many applications, especially in electronic structure computations. Linear scaling methods for gapped systems are based on the fact that these special matrix functions are localized, meaning that their entries decay exponentially away from the main diagonal or with respect to more general sparsity patterns. The relation with the sign function, together with an integral representation, is used to obtain new decay bounds, which turn out to be optimal in an asymptotic sense. The influence of isolated eigenvalues in the spectrum on the decay properties is also investigated, and a superexponential behaviour is predicted.
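A minimal numerical illustration of this localization, using a hypothetical toy model rather than the matrices studied in the paper: a tridiagonal Hermitian matrix with alternating on-site energies $\pm\delta$ and hopping $t$ has a spectral gap of width $2\delta$ around 0, and the projector onto the negative part of the spectrum exhibits the exponential off-diagonal decay described above.

```python
import numpy as np

n = 100
delta, t = 1.0, 0.4
# Tridiagonal Hermitian toy "ionic chain": alternating on-site
# energies +-delta with nearest-neighbour hopping t; the spectrum
# has a gap of width 2*delta around 0.
H = (np.diag(delta * (-1.0) ** np.arange(n))
     + np.diag(t * np.ones(n - 1), 1)
     + np.diag(t * np.ones(n - 1), -1))

w, V = np.linalg.eigh(H)
Vneg = V[:, w < 0]
P = Vneg @ Vneg.T          # spectral projector onto eigenvalues below the gap

# Entries of P decay exponentially away from the main diagonal:
row = n // 2
for k in (1, 5, 10, 20):
    print(f"|P[{row},{row + k}]| = {abs(P[row, row + k]):.2e}")
```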
This paper is devoted to estimating the minimal dimension $P$ of the state-space realizations of a high-dimensional time series $y$, defined as a noisy version (the noise is white and Gaussian) of a useful signal with low-rank rational spectral density, in the high-dimensional asymptotic regime where the number of available samples $N$ and the dimension $M$ of the time series converge towards infinity at the same rate. In the classical low-dimensional regime, $P$ is estimated as the number of significant singular values of the empirical autocovariance matrix between the past and the future of $y$, or as the number of significant estimated canonical correlation coefficients between the past and the future of $y$. Generalizing large random matrix methods developed in the past to analyze classical spiked models, we study the behaviour of the above singular values and canonical correlation coefficients in the high-dimensional regime. It is proved that they are smaller than certain thresholds depending on the statistics of the noise, except for a finite number of outliers due to the useful signal. The number of singular values of the sample autocovariance matrix above the threshold is evaluated and is shown to be almost independent of $P$ in general; it therefore cannot be used to estimate $P$ accurately. In contrast, the number $s$ of canonical correlation coefficients larger than the corresponding threshold is shown to be less than or equal to $P$, and explicit conditions under which it is equal to $P$ are provided. Under the corresponding assumptions, $s$ is thus a consistent estimate of $P$ in the high-dimensional regime. The core of the paper is the development of the necessary large random matrix tools.
We provide high-probability bounds on the condition number of random feature matrices. In particular, we show that if the complexity ratio $\frac{N}{m}$, where $N$ is the number of neurons and $m$ is the number of data samples, scales like $\log^{-3}(N)$ or $\log^{3}(m)$, then the random feature matrix is well-conditioned. This result holds without the need for regularization and relies on establishing a bound on the restricted isometry constant of the random feature matrix. In addition, we prove that the risk associated with regression problems using a random feature matrix exhibits the double descent phenomenon, and that this is an effect of the double descent behavior of the condition number. The risk bounds cover the underparameterized setting, using the least squares problem, and the overparameterized setting, using either the minimum norm interpolation problem or a sparse regression problem. For the least squares and sparse regression cases, we show that the risk decreases as $m$ and $N$ increase, even in the presence of bounded or random noise. The risk bound matches the optimal scaling in the literature, and the constants in our results are explicit and independent of the dimension of the data.
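The sketch below illustrates the object under study with one common instance, a random Fourier feature matrix $A_{ij} = \cos(x_i^\top w_j + b_j)$; the cosine feature map and all dimensions are illustrative assumptions, not the paper's exact setting, and the computed condition number is just an empirical observation for this draw.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, N = 5, 200, 4000        # data dim, samples, neurons (overparameterized)
X = rng.standard_normal((m, d)) / np.sqrt(d)   # data samples (rows)
W = rng.standard_normal((d, N))                # random weights
b = rng.uniform(0.0, 2.0 * np.pi, N)           # random biases

A = np.cos(X @ W + b)          # m x N random Fourier feature matrix

s = np.linalg.svd(A, compute_uv=False)
print("condition number:", s[0] / s[-1])
```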
We analyze when an arbitrary matrix pencil is equivalent to a dissipative Hamiltonian pencil and show that this heavily restricts the spectral properties. In order to relax the spectral properties, we introduce matrix pencils with coefficients that have positive semidefinite Hermitian parts. We make a detailed analysis of their spectral properties and their numerical range. In particular, we relate the Kronecker structure of these pencils to that of an underlying skew-Hermitian pencil and discuss their regularity, index, numerical range, and location of eigenvalues. Further, we study matrix polynomials with positive semidefinite Hermitian coefficients and use linearizations with positive semidefinite Hermitian parts to derive sufficient conditions for the spectrum to lie in the left half-plane, as well as bounds on the index.
We develop a novel connection between discrepancy minimization and (quantum) communication complexity. As an application, we resolve a substantial special case of the Matrix Spencer conjecture. In particular, we show that for every collection of symmetric $n \times n$ matrices $A_1,\ldots,A_n$ with $\|A_i\| \leq 1$ and $\|A_i\|_F \leq n^{1/4}$ there exist signs $x \in \{ \pm 1\}^n$ such that the maximum eigenvalue of $\sum_{i \leq n} x_i A_i$ is at most $O(\sqrt n)$. We give a polynomial-time algorithm based on partial coloring and semidefinite programming to find such $x$. Our techniques open a new avenue to use tools from communication complexity and information theory to study discrepancy. The proof of our main result combines a simple compression scheme for transcripts of repeated (quantum) communication protocols with quantum state purification, the Holevo bound from quantum information, and tools from sketching and dimensionality reduction. Our approach also offers a promising avenue to resolve the Matrix Spencer conjecture completely -- we show it is implied by a natural conjecture in quantum communication complexity.
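For intuition about the scale of the claim, the brute-force sketch below enumerates all $2^n$ signings for a tiny $n$ and reports the best achievable maximum eigenvalue next to $\sqrt{n}$. It is a toy check with random norm-1 matrices, not the paper's partial-coloring and semidefinite-programming algorithm, and a tiny $n$ says nothing about the asymptotic constant.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 8
# Random symmetric n x n matrices, normalized to operator norm 1.
mats = []
for _ in range(n):
    B = rng.standard_normal((n, n))
    S = (B + B.T) / 2
    mats.append(S / np.linalg.norm(S, 2))

def max_eig(signs):
    """Maximum eigenvalue of the signed sum for a given sign pattern."""
    M = sum(s * A for s, A in zip(signs, mats))
    return np.linalg.eigvalsh(M)[-1]   # eigvalsh returns ascending order

best = min(max_eig(x) for x in itertools.product((-1, 1), repeat=n))
print(f"best signing: max eigenvalue {best:.3f}  (sqrt(n) = {np.sqrt(n):.3f})")
```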
Suppose an oracle knows a string $S$ that is unknown to us and that we want to determine. The oracle can answer queries of the form "Is $s$ a substring of $S$?". In 1995, Skiena and Sundaram showed that, in the worst case, any algorithm needs to ask the oracle $\sigma n/4 - O(n)$ queries in order to reconstruct the hidden string, where $\sigma$ is the size of the alphabet of $S$ and $n$ its length; they also gave an algorithm that spends $(\sigma-1)n+O(\sigma \sqrt{n})$ queries to reconstruct $S$. The main contribution of our paper is to improve this upper bound when the string is compressible. We first present a universal algorithm that, given a (computable) compressor that compresses the string to $\tau$ bits, performs $q=O(\tau)$ substring queries; this algorithm, however, runs in exponential time. For this reason, the second part of the paper focuses on more time-efficient algorithms whose number of queries is bounded by specific compressibility measures. We first show that any string of length $n$ over an integer alphabet of size $\sigma$ whose run-length encoding consists of $rle$ runs can be reconstructed with $q=O(rle (\sigma + \log \frac{n}{rle}))$ substring queries in linear time and space. We then present an algorithm that spends $q \in O(\sigma g\log n)$ substring queries and runs in $O(n(\log n + \log \sigma)+ q)$ time using linear space, where $g$ is the size of a smallest straight-line program generating the string.
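The query model can be illustrated with a naive reconstruction strategy, in the spirit of the simple upper bound rather than the compressibility-aware algorithms of the paper: greedily extend a known substring to the right until no extension is a substring (at which point it must be a suffix occurring only at the end), then extend to the left until it is the whole string.

```python
def reconstruct(is_substring, alphabet):
    """Reconstruct a hidden nonempty string using only substring queries.

    If w cannot be extended on the right, every occurrence of w is a
    suffix occurrence, so w is the (unique) suffix of S of its length;
    if it then cannot be extended on the left either, w must be S.
    """
    # Find some character of S.
    w = next(c for c in alphabet if is_substring(c))
    extended = True
    while extended:                       # grow to the right
        extended = False
        for c in alphabet:
            if is_substring(w + c):
                w, extended = w + c, True
                break
    extended = True
    while extended:                       # grow to the left
        extended = False
        for c in alphabet:
            if is_substring(c + w):
                w, extended = c + w, True
                break
    return w

S = "abracadabra"
queries = []
oracle = lambda s: (queries.append(s), s in S)[1]   # count queries as we go
assert reconstruct(oracle, "abcdr") == S
print(len(queries), "substring queries for", repr(S))
```

Each position of $S$ costs up to $\sigma$ queries here, matching the $O(\sigma n)$ flavor of the classical bound.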
We study a matrix recovery problem with unknown correspondence: given the observation matrix $M_o=[A,\tilde P B]$, where $\tilde P$ is an unknown permutation matrix, we aim to recover the underlying matrix $M=[A,B]$. Such a problem commonly arises in applications where heterogeneous data are utilized and the correspondence among them is unknown, e.g., due to privacy concerns. We show that it is possible to recover $M$ by solving a nuclear norm minimization problem under a proper low-rank condition on $M$, with a provable non-asymptotic error bound for the recovery of $M$. We propose an algorithm, $\text{M}^3\text{O}$ (Matrix recovery via Min-Max Optimization), which recasts this combinatorial problem as a continuous minimax optimization problem and solves it by proximal gradient with a Max-Oracle. $\text{M}^3\text{O}$ can also be applied to a more general scenario where we have missing entries in $M_o$ and multiple groups of data with distinct unknown correspondences. Experiments on simulated data, the MovieLens 100K dataset, and the Yale B database show that $\text{M}^3\text{O}$ achieves state-of-the-art performance over several baselines and can recover the ground-truth correspondence with high accuracy.
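As a toy illustration of the low-rank premise (hypothetical dimensions and data, not the $\text{M}^3\text{O}$ algorithm itself): when $A$ and $B$ share a column space, the correctly aligned concatenation $[A,B]$ has lower rank, and typically a smaller nuclear norm, than a row-shuffled $[A,\tilde P B]$, which is what makes nuclear norm minimization a sensible objective for recovering the correspondence.

```python
import numpy as np

rng = np.random.default_rng(0)
m, r, k = 50, 3, 20
U = rng.standard_normal((m, r))            # shared column space
A = U @ rng.standard_normal((r, k))
B = U @ rng.standard_normal((r, k))        # so [A, B] has rank r
perm = rng.permutation(m)                  # unknown row shuffle applied to B

nn = lambda X: np.linalg.norm(X, 'nuc')    # nuclear norm
aligned = np.hstack([A, B])
shuffled = np.hstack([A, B[perm]])
print("ranks:        ", np.linalg.matrix_rank(aligned),
      "vs", np.linalg.matrix_rank(shuffled))
print("nuclear norms:", round(nn(aligned), 1), "vs", round(nn(shuffled), 1))
```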
Age of Information (AoI) has emerged as an important quality-of-service measure for applications that prioritize delivery of the freshest information, e.g., virtual or augmented reality over mobile devices and wireless sensor networks used in the control of cyber-physical systems. We derive the Laplace transform of the stationary AoI for the M/GI/1/2 system with a "dynamic" service preemption and pushout policy that depends on the elapsed service time of the in-service message. Our system thus generalizes both the static M/GI/1/2 queue-pushout system without service preemption and the M/GI/1/1 bufferless system with service preemption - two systems considered to provide very good AoI performance. Based on our analysis, for a service-time distribution that is a mixture of deterministic and exponential components, we numerically show that the dynamic policy achieves lower mean AoI than both of these static policies and also than the well-studied M/GI/1/1 blocking system.
Consider the empirical autocovariance matrix at a given non-zero time lag, based on observations from a multivariate complex Gaussian stationary time series. The spectral analysis of these autocovariance matrices can be useful in certain statistical problems, such as those related to testing for white noise. We study the behavior of their spectral measures in the asymptotic regime where the time series dimension and the observation window length both grow to infinity at the same rate. Following a general framework in the field of the spectral analysis of large random non-Hermitian matrices, we first obtain the probabilistic behavior of the small singular values of shifted versions of the autocovariance matrix. This is then used to infer the large-sample behavior of the empirical spectral measure of the autocovariance matrices at any lag. Matrix orthogonal polynomials on the unit circle play a crucial role in our study.
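A toy simulation of the object under study, for complex white noise only (hypothetical dimensions; this merely constructs the non-Hermitian lag-1 autocovariance matrix and its singular values, and makes no claim about the limiting spectral measure derived in the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 200, 400     # series dimension and window length, of the same order
# Complex Gaussian white noise, unit variance per entry.
Y = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)

lag = 1
# Empirical autocovariance matrix at lag 1: an M x M non-Hermitian matrix.
R1 = Y[:, lag:] @ Y[:, :-lag].conj().T / N

s = np.linalg.svd(R1, compute_uv=False)
print("largest / smallest singular value:", s[0], s[-1])
```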