We present an estimator of the covariance matrix $\Sigma$ of random $d$-dimensional vector from an i.i.d. sample of size $n$. Our sole assumption is that this vector satisfies a bounded $L^p-L^2$ moment assumption over its one-dimensional marginals, for some $p\geq 4$. Given this, we show that $\Sigma$ can be estimated from the sample with the same high-probability error rates that the sample covariance matrix achieves in the case of Gaussian data. This holds even though we allow for very general distributions that may not have moments of order $>p$. Moreover, our estimator can be made to be optimally robust to adversarial contamination. This result improves the recent contributions by Mendelson and Zhivotovskiy and Catoni and Giulini, and matches parallel work by Abdalla and Zhivotovskiy (the exact relationship with this last work is described in the paper).
We analyze a goal-oriented adaptive algorithm that aims to efficiently compute the quantity of interest $G(u^\star)$ with a linear goal functional $G$ and the solution $u^\star$ to a general second-order nonsymmetric linear elliptic partial differential equation. The current state of the analysis of iterative algebraic solvers for nonsymmetric systems lacks the contraction property in the norms that are prescribed by the functional analytic setting. This seemingly prevents their application in the optimality analysis of goal-oriented adaptivity. As a remedy, this paper proposes a goal-oriented adaptive iteratively symmetrized finite element method (GOAISFEM). It employs a nested loop with a contractive symmetrization procedure, e.g., the Zarantonello iteration, and a contractive algebraic solver, e.g., an optimal multigrid solver. The various iterative procedures require well-designed stopping criteria such that the adaptive algorithm can effectively steer the local mesh refinement and the computation of the inexact discrete approximations. The main results consist of full linear convergence of the proposed adaptive algorithm and the proof of optimal convergence rates with respect to both degrees of freedom and total computational cost (i.e., optimal complexity). Numerical experiments confirm the theoretical results and investigate the selection of the parameters.
We consider the problem of distilling uniform random bits from an unknown source with a given $p$-entropy using linear hashing. As our main result, we estimate the expected $p$-divergence from the uniform distribution over the ensemble of random linear codes for all integer $p\ge 2$. The proof relies on analyzing how additive noise, determined by a random element of the code from the ensemble, acts on the source distribution. This action leads to the transformation of the source distribution into an approximately uniform one, a process commonly referred to as distribution smoothing. We also show that hashing with Reed-Muller matrices reaches intrinsic randomness of memoryless Bernoulli sources in the $l_p$ sense for all integer $p\ge 2$.
Starting from an A-stable rational approximation to $\rm{e}^z$ of order $p$, $$r(z)= 1+ z+ \cdots + z^p/ p! + O(z^{p+1}),$$ families of stable methods are proposed to time discretize abstract IVP's of the type $u'(t) = A u(t) + f(t)$. These numerical procedures turn out to be of order $p$, thus overcoming the order reduction phenomenon, and only one evaluation of $f$ per step is required.
Given a digraph $D=(V,A)$ on $n$ vertices and a vertex $v\in V$, the cycle-degree of $v$ is the minimum size of a set $S \subseteq V(D) \setminus \{v\}$ intersecting every directed cycle of $D$ containing $v$. From this definition of cycle-degree, we define the $c$-degeneracy (or cycle-degeneracy) of $D$, which we denote by $\delta^*_c(D)$. It appears to be a nice generalisation of the undirected degeneracy. In this work, using this new definition of cycle-degeneracy, we extend several evidences for Cereceda's conjecture to digraphs. The $k$-dicolouring graph of $D$, denoted by $\mathcal{D}_k(D)$, is the undirected graph whose vertices are the $k$-dicolourings of $D$ and in which two $k$-dicolourings are adjacent if they differ on the colour of exactly one vertex. We show that $\mathcal{D}_k(D)$ has diameter at most $O_{\delta^*_c(D)}(n^{\delta^*_c(D) + 1})$ (respectively $O(n^2)$ and $(\delta^*_c(D)+1)n$) when $k$ is at least $\delta^*_c(D)+2$ (respectively $\frac{3}{2}(\delta^*_c(D)+1)$ and $2(\delta^*_c(D)+1)$). This improves known results on digraph redicolouring (Bousquet et al.). Next, we extend a result due to Feghali to digraphs, showing that $\mathcal{D}_{d+1}(D)$ has diameter at most $O_{d,\epsilon}(n(\log n)^{d-1})$ when $D$ has maximum average cycle-degree at most $d-\epsilon$. We then show that two proofs of Bonamy and Bousquet for undirected graphs can be extended to digraphs. The first one uses the digrundy number of a digraph and the second one uses the $\mathscr{D}$-width. Finally, we give a general theorem which makes a connection between the recolourability of a digraph $D$ and the recolourability of its underlying graph $UG(D)$. This result directly extends a number of results on planar graph recolouring to planar digraph redicolouring.
We suggest new closely related methods for numerical inversion of $Z$-transform and Wiener-Hopf factorization of functions on the unit circle, based on sinh-deformations of the contours of integration, corresponding changes of variables and the simplified trapezoid rule. As applications, we consider evaluation of high moments of probability distributions and construction of causal filters. Programs in Matlab running on a Mac with moderate characteristics achieves the precision E-14 in several dozen of microseconds and E-11 in several milliseconds, respectively.
Let $A_\alpha$ be the semi-infinite tridiagonal matrix having subdiagonal and superdiagonal unit entries, $(A_\alpha)_{11}=\alpha$, where $\alpha\in\mathbb C$, and zero elsewhere. A basis $\{P_0,P_1,P_2,\ldots\}$ of the linear space $\mathcal P_\alpha$ spanned by the powers of $A_\alpha$ is determined, where $P_0=I$, $P_n=T_n+H_n$, $T_n$ is the symmetric Toeplitz matrix having ones in the $n$th super- and sub-diagonal, zeros elsewhere, and $H_n$ is the Hankel matrix with first row $[\theta\alpha^{n-2}, \theta\alpha^{n-3}, \ldots, \theta, \alpha, 0, \ldots]$, where $\theta=\alpha^2-1$. The set $\mathcal P_\alpha$ is an algebra, and for $\alpha\in\{-1,0,1\}$, $H_n$ has only one nonzero anti-diagonal. This fact is exploited to provide a better representation of symmetric quasi-Toeplitz matrices $\mathcal {QT}_S$, where, instead of representing a generic matrix $A\in\mathcal{QT}_S$ as $A=T+K$, where $T$ is Toeplitz and $K$ is compact, it is represented as $A=P+H$, where $P\in\mathcal P_\alpha$ and $H$ is compact. It is shown experimentally that the matrix arithmetic obtained this way is much more effective than that implemented in the CQT-Toolbox of Numer.~Algo. 81(2):741--769, 2019.
We show that the $n$'th digit of the base-$b$ representation of the golden ratio is a finite-state function of the Zeckendorf representation of $b^n$, and hence can be computed by a finite automaton. Similar results can be proven for any quadratic irrational. We use a satisfiability (SAT) solver to prove, in some cases, that the automata we construct are minimal.
The $\ell_{1,\infty}$ norm is an efficient structured projection but the complexity of the best algorithm is unfortunately $\mathcal{O}\big(n m \log(n m)\big)$ for a matrix in $\mathbb{R}^{n\times m}$. In this paper, we propose a new bi-level projection method for which we show that the time complexity for the $\ell_{1,\infty}$ norm is only $\mathcal{O}\big(n m \big)$ for a matrix in $\mathbb{R}^{n\times m}$, and $\mathcal{O}\big(n + m \big)$ with full parallel power. We generalize our method to tensors and we propose a new multi-level projection, having an induced decomposition that yields a linear parallel speedup up to an exponential speedup factor, resulting in a time complexity lower-bounded by the sum of the dimensions. Experiments show that our bi-level $\ell_{1,\infty}$ projection is $2.5$ times faster than the actual fastest algorithm provided by \textit{Chu et. al.} while providing same accuracy and better sparsity in neural networks applications.
The classical Heawood inequality states that if the complete graph $K_n$ on $n$ vertices is embeddable in the sphere with $g$ handles, then $g \ge\dfrac{(n-3)(n-4)}{12}$. A higher-dimensional analogue of the Heawood inequality is the K\"uhnel conjecture. In a simplified form it states that for every integer $k>0$ there is $c_k>0$ such that if the union of $k$-faces of $n$-simplex embeds into the connected sum of $g$ copies of the Cartesian product $S^k\times S^k$ of two $k$-dimensional spheres, then $g\ge c_k n^{k+1}$. For $k>1$ only linear estimates were known. We present a quadratic estimate $g\ge c_k n^2$. The proof is based on beautiful and fruitful interplay between geometric topology, combinatorics and linear algebra.
Optimization over the set of matrices that satisfy $X^\top B X = I_p$, referred to as the generalized Stiefel manifold, appears in many applications involving sampled covariance matrices such as canonical correlation analysis (CCA), independent component analysis (ICA), and the generalized eigenvalue problem (GEVP). Solving these problems is typically done by iterative methods, such as Riemannian approaches, which require a computationally expensive eigenvalue decomposition involving fully formed $B$. We propose a cheap stochastic iterative method that solves the optimization problem while having access only to a random estimate of the feasible set. Our method does not enforce the constraint in every iteration exactly, but instead it produces iterations that converge to a critical point on the generalized Stiefel manifold defined in expectation. The method has lower per-iteration cost, requires only matrix multiplications, and has the same convergence rates as its Riemannian counterparts involving the full matrix $B$. Experiments demonstrate its effectiveness in various machine learning applications involving generalized orthogonality constraints, including CCA, ICA, and GEVP.