Consider a system of $m$ polynomial equations $\{p_i(x) = b_i\}_{i \leq m}$ of degree $D\geq 2$ in $n$-dimensional variable $x \in \mathbb{R}^n$ such that each coefficient of every $p_i$ and $b_i$s are chosen at random and independently from some continuous distribution. We study the basic question of determining the smallest $m$ -- the algorithmic threshold -- for which efficient algorithms can find refutations (i.e. certificates of unsatisfiability) for such systems. This setting generalizes problems such as refuting random SAT instances, low-rank matrix sensing and certifying pseudo-randomness of Goldreich's candidate generators and generalizations. We show that for every $d \in \mathbb{N}$, the $(n+m)^{O(d)}$-time canonical sum-of-squares (SoS) relaxation refutes such a system with high probability whenever $m \geq O(n) \cdot (\frac{n}{d})^{D-1}$. We prove a lower bound in the restricted low-degree polynomial model of computation which suggests that this trade-off between SoS degree and the number of equations is nearly tight for all $d$. We also confirm the predictions of this lower bound in a limited setting by showing a lower bound on the canonical degree-$4$ sum-of-squares relaxation for refuting random quadratic polynomials. Together, our results provide evidence for an algorithmic threshold for the problem at $m \gtrsim \widetilde{O}(n) \cdot n^{(1-\delta)(D-1)}$ for $2^{n^{\delta}}$-time algorithms for all $\delta$.
The canonical polyadic decomposition (CPD) of a low rank tensor plays a major role in data analysis and signal processing by allowing for unique recovery of underlying factors. However, it is well known that the low rank CPD approximation problem is ill-posed. That is, a tensor may fail to have a best rank $R$ CPD approximation when $R>1$. This article gives deterministic bounds for the existence of best low rank tensor approximations over $\mathbb{K}=\mathbb{R}$ or $\mathbb{K}=\mathbb{C}$. More precisely, given a tensor $\mathcal{T} \in \mathbb{K}^{I \times I \times I}$ of rank $R \leq I$, we compute the radius of a Frobenius norm ball centered at $\mathcal{T}$ in which best $\mathbb{K}$-rank $R$ approximations are guaranteed to exist. In addition we show that every $\mathbb{K}$-rank $R$ tensor inside of this ball has a unique canonical polyadic decomposition. This neighborhood may be interpreted as a neighborhood of "mathematical truth" in with CPD approximation and computation is well-posed. In pursuit of these bounds, we describe low rank tensor decomposition as a ``joint generalized eigenvalue" problem. Using this framework, we show that, under mild assumptions, a low rank tensor which has rank strictly greater than border rank is defective in the sense of algebraic and geometric multiplicities for joint generalized eigenvalues. Bounds for existence of best low rank approximations are then obtained by establishing perturbation theoretic results for the joint generalized eigenvalue problem. In this way we establish a connection between existence of best low rank approximations and the tensor spectral norm. In addition we solve a "tensor Procrustes problem" which examines orthogonal compressions for pairs of tensors. The main results of the article are illustrated by a variety of numerical experiments.
We prove that the number of unit distances among $n$ planar points is at most $1.94\cdot n^{4/3}$, improving on the previous best bound of $8n^{4/3}$. We also give better upper and lower bounds for several small values of $n$. We also prove some variants of the crossing lemma and improve some constant factors.
The random order graph streaming model has received significant attention recently, with problems such as matching size estimation, component counting, and the evaluation of bounded degree constant query testable properties shown to admit surprisingly space efficient algorithms. The main result of this paper is a space efficient single pass random order streaming algorithm for simulating nearly independent random walks that start at uniformly random vertices. We show that the distribution of $k$-step walks from $b$ vertices chosen uniformly at random can be approximated up to error $\varepsilon$ per walk using $(1/\varepsilon)^{O(k)} 2^{O(k^2)}\cdot b$ words of space with a single pass over a randomly ordered stream of edges, solving an open problem of Peng and Sohler [SODA `18]. Applications of our result include the estimation of the average return probability of the $k$-step walk (the trace of the $k^\text{th}$ power of the random walk matrix) as well as the estimation of PageRank. We complement our algorithm with a strong impossibility result for directed graphs.
Clustering is an important task with applications in many fields of computer science. We study the fully dynamic setting in which we want to maintain good clusters efficiently when input points (from a metric space) can be inserted and deleted. Many clustering problems are $\mathsf{APX}$-hard but admit polynomial time $O(1)$-approximation algorithms. Thus, it is a natural question whether we can maintain $O(1)$-approximate solutions for them in subpolynomial update time, against adaptive and oblivious adversaries. Only a few results are known that give partial answers to this question. There are dynamic algorithms for $k$-center, $k$-means, and $k$-median that maintain constant factor approximations in expected $\tilde{O}(k^{2})$ update time against an oblivious adversary. However, for these problems there are no algorithms known with an update time that is subpolynomial in $k$, and against an adaptive adversary there are even no (non-trivial) dynamic algorithms known at all. In this paper, we complete the picture of the question above for all these clustering problems. 1. We show that there is no fully dynamic $O(1)$-approximation algorithm for any of the classic clustering problems above with an update time in $n^{o(1)}h(k)$ against an adaptive adversary, for an arbitrary function $h$. 2. We give a lower bound of $\Omega(k)$ on the update time for each of the above problems, even against an oblivious adversary. 3. We give the first $O(1)$-approximate fully dynamic algorithms for $k$-sum-of-radii and for $k$-sum-of-diameters with expected update time of $\tilde{O}(k^{O(1)})$ against an oblivious adversary. 4. Finally, for $k$-center we present a fully dynamic $(6+\epsilon)$-approximation algorithm with an expected update time of $\tilde{O}(k)$ against an oblivious adversary.
We study the mathematical structure of the solution set (and its tangent space) to the matrix equation $G^*JG=J$ for a given square matrix $J$. In the language of pure mathematics, this is a Lie group which is the isometry group for a bilinear (or a sesquilinear) form. Generally these groups are described as intersections of a few special groups. The tangent space to $\{G: G^*JG=J \}$ consists of solutions to the linear matrix equation $X^*J+JX=0$. For the complex case, the solution set of this linear equation was computed by De Ter{\'a}n and Dopico. We found that on its own, the equation $X^*J+JX=0$ is hard to solve. By throwing into the mix the complementary linear equation $X^*J-JX=0$, we find that rather than increasing the complexity, we reduce the complexity. Not only is it possible to now solve the original problem, but we can approach the broader algebraic and geometric structure. One implication is that the two equations form an $\mathfrak{h}$ and $\mathfrak{m}$ pair familiar in the study of pseudo-Riemannian symmetric spaces. We explicitly demonstrate the computation of the solutions to the equation $X^*J\pm XJ=0$ for real and complex matrices. However, any real, complex or quaternionic case with an arbitrary involution (e.g., transpose, conjugate transpose, and the various quaternion transposes) can be effectively solved with the same strategy. We provide numerical examples and visualizations.
Coloring unit-disk graphs efficiently is an important problem in the global and distributed setting, with applications in radio channel assignment problems when the communication relies on omni-directional antennas of the same power. In this context it is important to bound not only the complexity of the coloring algorithms, but also the number of colors used. In this paper, we consider two natural distributed settings. In the location-aware setting (when nodes know their coordinates in the plane), we give a constant time distributed algorithm coloring any unit-disk graph $G$ with at most $4\omega(G)$ colors, where $\omega(G)$ is the clique number of $G$. This improves upon a classical 3-approximation algorithm for this problem, for all unit-disk graphs whose chromatic number significantly exceeds their clique number. When nodes do not know their coordinates in the plane, we give a distributed algorithm in the LOCAL model that colors every unit-disk graph $G$ with at most $5.68\omega(G)$ colors in $O(\log^3 \log n)$ rounds. Moreover, when $\omega(G)=O(1)$, the algorithm runs in $O(\log^* n)$ rounds. This algorithm is based on a study of the local structure of unit-disk graphs, which is of independent interest. We conjecture that every unit-disk graph $G$ has average degree at most $4\omega(G)$, which would imply the existence of a $O(\log n)$ round algorithm coloring any unit-disk graph $G$ with (approximately) $4\omega(G)$ colors in the LOCAL model.
We consider a problem introduced by Feige, Gamarnik, Neeman, R\'acz and Tetali [2020], that of finding a large clique in a random graph $G\sim G(n,\frac{1}{2})$, where the graph $G$ is accessible by queries to entries of its adjacency matrix. The query model allows some limited adaptivity, with a constant number of rounds of queries, and $n^\delta$ queries in each round. With high probability, the maximum clique in $G$ is of size roughly $2\log n$, and the goal is to find cliques of size $\alpha\log n$, for $\alpha$ as large as possible. We prove that no two-rounds algorithm is likely to find a clique larger than $\frac{4}{3}\delta\log n$, which is a tight upper bound when $1\leq\delta\leq \frac{6}{5}$. For other ranges of parameters, namely, two-rounds with $\frac{6}{5}<\delta<2$, and three-rounds with $1\leq\delta<2$, we improve over the previously known upper bounds on $\alpha$, but our upper bounds are not tight. If early rounds are restricted to have fewer queries than the last round, then for some such restrictions we do prove tight upper bounds.
List-decoding and list-recovery are important generalizations of unique decoding that received considerable attention over the years. However, the optimal trade-off among list-decoding (resp. list-recovery) radius, list size, and the code rate are not fully understood in both problems. This paper takes a step towards this direction when the list size is a given constant and the alphabet size is large (as a function of the code length). We prove a new Singleton-type upper bound for list-decodable codes, which improves upon the previously known bound by roughly a factor of $1/L$, where $L$ is the list size. We also prove a Singleton-type upper bound for list-recoverable codes, which is to the best of our knowledge, the first such bound for list-recovery. We apply these results to obtain new lower bounds that are optimal up to a multiplicative constant on the list size for list-decodable and list-recoverable codes with rates approaching capacity. Moreover, we show that list-decodable \emph{nonlinear} codes can strictly outperform list-decodable linear codes. More precisely, we show that there is a gap for a wide range of parameters, which grows fast with the alphabet size, between the size of the largest list-decodable nonlinear code and the size of the largest list-decodable linear codes. This is achieved by a novel connection between list-decoding and the notion of sparse hypergraphs in extremal combinatorics. We remark that such a gap is not known to exist in the problem of unique decoding. Lastly, we show that list-decodability or recoverability of codes implies in some sense good unique decodability.
We consider mixtures of $k\geq 2$ Gaussian components with unknown means and unknown covariance (identical for all components) that are well-separated, i.e., distinct components have statistical overlap at most $k^{-C}$ for a large enough constant $C\ge 1$. Previous statistical-query lower bounds [DKS17] give formal evidence that even distinguishing such mixtures from (pure) Gaussians may be exponentially hard (in $k$). We show that this kind of hardness can only appear if mixing weights are allowed to be exponentially small, and that for polynomially lower bounded mixing weights non-trivial algorithmic guarantees are possible in quasi-polynomial time. Concretely, we develop an algorithm based on the sum-of-squares method with running time quasi-polynomial in the minimum mixing weight. The algorithm can reliably distinguish between a mixture of $k\ge 2$ well-separated Gaussian components and a (pure) Gaussian distribution. As a certificate, the algorithm computes a bipartition of the input sample that separates a pair of mixture components, i.e., both sides of the bipartition contain most of the sample points of at least one component. For the special case of colinear means, our algorithm outputs a $k$ clustering of the input sample that is approximately consistent with the components of the mixture. A significant challenge for our results is that they appear to be inherently sensitive to small fractions of adversarial outliers unlike most previous results for Gaussian mixtures. The reason is that such outliers can simulate exponentially small mixing weights even for mixtures with polynomially lower bounded mixing weights. A key technical ingredient is a characterization of separating directions for well-separated Gaussian components in terms of ratios of polynomials that correspond to moments of two carefully chosen orders logarithmic in the minimum mixing weight.
In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely: the function $F(\xb) \triangleq \sum_{i=1}^{m}f_i(\xb)$ is strongly convex and smooth, either strongly convex or smooth or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors) with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions to the proposed setup such as proximal friendly functions, time-varying graphs, improvement of the condition numbers.