The classical Heawood inequality states that if the complete graph $K_n$ on $n$ vertices is embeddable in the sphere with $g$ handles, then $g \ge\dfrac{(n-3)(n-4)}{12}$. A higher-dimensional analogue of the Heawood inequality is the K\"uhnel conjecture. In a simplified form it states that for every integer $k>0$ there is a constant $c_k>0$ such that if the union of the $k$-faces of the $n$-simplex embeds into the connected sum of $g$ copies of the Cartesian product $S^k\times S^k$ of two $k$-dimensional spheres, then $g\ge c_k n^{k+1}$. For $k>1$ only linear estimates were previously known. We present a quadratic estimate $g\ge c_k n^2$. The proof is based on a fruitful interplay between geometric topology, combinatorics, and linear algebra.
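As a quick sanity check of the classical bound (a standard fact, not claimed in the abstract): for $n=7$ the inequality gives
$$g \;\ge\; \frac{(7-3)(7-4)}{12} \;=\; 1,$$
which matches the Heawood map embedding of $K_7$ in the torus, i.e., the sphere with one handle.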
Given a graph $G$, the optimization version of the graph burning problem seeks a sequence of vertices $(u_1,u_2,\ldots,u_k) \in V(G)^k$ of minimum length $k$ such that every $v \in V(G)$ is at distance at most $k-i$ from some vertex $u_i$. The length $k$ of an optimal solution is known as the burning number, denoted $b(G)$, an invariant that quantifies the graph's vulnerability to contagion. This paper explores the advantages and limitations of an $\mathcal{O}(mn + kn^2)$ deterministic greedy heuristic for this problem, where $n$ is the graph's order, $m$ is the graph's size, and $k$ is a guess on $b(G)$. The heuristic is based on the relationship between the graph burning problem and the clustered maximum coverage problem; despite its limitations on paths and cycles, it found most of the optimal and best-known solutions on benchmark and synthetic graphs with up to 102400 vertices.
A tiling of a vector space $S$ is a pair $(U,V)$ of its subsets such that every vector in $S$ is uniquely represented as the sum of a vector from $U$ and a vector from $V$. A tiling is connected to perfect codes if one of the sets, say $U$, is projective, i.e., a union of one-dimensional subspaces of $S$. A tiling $(U,V)$ is full-rank if the affine span of each of $U$, $V$ is $S$. For finite non-binary vector spaces of dimension at least $6$ (at least $10$), we construct full-rank tilings $(U,V)$ with projective $U$ (with both $U$ and $V$ projective, respectively). In particular, this construction gives a full-rank ternary $1$-perfect code of length $13$, solving a known problem. We also discuss the treatment of tilings with projective components as factorizations of projective spaces. Keywords: perfect codes, tilings, group factorization, full-rank tilings, projective geometry
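For intuition, the tiling condition can be checked by brute force over a toy space. The Python sketch below (names hypothetical) verifies that a one-dimensional subspace $U$ and a transversal $V$ tile $\mathbb{F}_3^2$; note this toy tiling is not full-rank, unlike the constructions of the paper.

```python
import itertools

q, n = 3, 2  # the toy space GF(3)^2, for illustration only
space = list(itertools.product(range(q), repeat=n))

def is_tiling(U, V):
    """(U, V) tiles the space iff every vector equals u + v (mod q)
    for exactly one pair (u, v) in U x V."""
    sums = [tuple((a + b) % q for a, b in zip(u, v))
            for u in U for v in V]
    return sorted(sums) == sorted(space)

U = [(0, 0), (1, 0), (2, 0)]  # a 1-dim subspace, hence projective
V = [(0, 0), (0, 1), (0, 2)]  # a transversal (not full-rank here)
print(is_tiling(U, V))  # True
```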
We study the problem of estimating the score function of an unknown probability distribution $\rho^*$ from $n$ independent and identically distributed observations in $d$ dimensions. Assuming that $\rho^*$ is subgaussian and has a Lipschitz-continuous score function $s^*$, we establish the optimal rate of $\tilde \Theta(n^{-\frac{2}{d+4}})$ for this estimation problem under the loss function $\|\hat s - s^*\|^2_{L^2(\rho^*)}$ that is commonly used in the score matching literature, highlighting the curse of dimensionality: the sample complexity for accurate score estimation grows exponentially with the dimension $d$. Leveraging key insights from empirical Bayes theory as well as a new convergence rate for the smoothed empirical distribution in Hellinger distance, we show that a regularized score estimator based on a Gaussian kernel attains this rate, which is shown to be optimal by a matching minimax lower bound. We also discuss extensions to estimating $\beta$-H\"older continuous scores with $\beta \leq 1$, as well as the implications of our theory for the sample complexity of score-based generative models.
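As a concrete illustration (not the paper's estimator), one simple Gaussian-kernel score estimator is the gradient of the log of a kernel density estimate. The Python sketch below assumes i.i.d. samples and a bandwidth $h$, both hypothetical choices.

```python
import numpy as np

def kde_score(x, samples, h):
    """Score of a Gaussian KDE at x: grad log rho_hat(x), where
    rho_hat is the average of N(x_i, h^2 I) densities.
    grad log rho_hat(x) = sum_i w_i (x_i - x) / h^2, with softmax
    weights w_i proportional to exp(-|x - x_i|^2 / (2 h^2))."""
    diffs = samples - x                                  # (n, d)
    w = np.exp(-np.sum(diffs**2, axis=1) / (2 * h**2))
    w /= w.sum()
    return (w @ diffs) / h**2

rng = np.random.default_rng(0)
samples = rng.normal(size=(2000, 2))   # true score: s*(x) = -x
x = np.array([0.5, -0.3])
# For Gaussian data this approximates -x / (1 + h^2), the score of
# the smoothed density rho* convolved with N(0, h^2 I).
print(kde_score(x, samples, h=0.5), -x)
```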
We consider the problem of fitting a centered ellipsoid to $n$ standard Gaussian random vectors in $\mathbb{R}^d$, as $n, d \to \infty$ with $n/d^2 \to \alpha > 0$. It has been conjectured that this problem is, with high probability, satisfiable (SAT; that is, there exists an ellipsoid passing through all $n$ points) for $\alpha < 1/4$, and unsatisfiable (UNSAT) for $\alpha > 1/4$. In this work we give a precise analytical argument, based on the non-rigorous replica method of statistical physics, that indeed predicts a SAT/UNSAT transition at $\alpha = 1/4$, as well as the shape of a typical fitting ellipsoid in the SAT phase (i.e., the lengths of its principal axes). Besides the replica method, our main tool is the dilute limit of extensive-rank "HCIZ integrals" of random matrix theory. We further study different explicit algorithmic constructions of the matrix characterizing the ellipsoid. In particular, we show that a procedure based on minimizing its nuclear norm yields a solution in the whole SAT phase. Finally, we characterize the SAT/UNSAT transition for ellipsoid fitting of a large class of rotationally-invariant random vectors. Our work suggests mathematically rigorous ways to analyze fitting ellipsoids to random vectors, which is the topic of a companion work.
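To make the constraint set concrete: each condition $x_i^\top S x_i = 1$ is linear in the entries of the symmetric matrix $S$, so in the SAT regime one looks for solutions of an underdetermined linear system. The Python sketch below (illustrative only) computes the minimum-Frobenius-norm solution; the nuclear-norm minimization analyzed in the paper would instead require an SDP solver, and positive semidefiniteness of $S$ (a genuine ellipsoid) is not enforced here.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 20, 60                      # n / d^2 = 0.15 < 1/4: SAT regime
X = rng.normal(size=(n, d))

# Each constraint x_i^T S x_i = 1 is linear in S: <x_i x_i^T, S> = 1.
A = np.stack([np.outer(x, x).ravel() for x in X])   # (n, d*d)
b = np.ones(n)

# Minimum-Frobenius-norm solution of the underdetermined system; a
# stand-in for the nuclear-norm minimization studied in the paper.
S = np.linalg.lstsq(A, b, rcond=None)[0].reshape(d, d)
S = (S + S.T) / 2
print(np.max(np.abs(np.einsum('ni,ij,nj->n', X, S, X) - 1)))  # ~0
```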
A storage code on a graph $G$ is a set of assignments of symbols to the vertices such that every vertex can recover its value by looking at its neighbors. We consider the question of constructing large storage codes on triangle-free graphs obtained as coset graphs of binary linear codes. Previously it was shown that there are infinite families of binary storage codes on coset graphs with rate converging to $3/4$. Here we show that codes on such graphs can attain rate asymptotically approaching $1$. Equivalently, this question can be phrased as a version of hat-guessing games on graphs (e.g., P.J. Cameron et al., \emph{Electronic J. Comb.} 2016). In this language, we construct triangle-free graphs on which the success probability of the players approaches one as the number of vertices tends to infinity. Furthermore, finding linear index codes of rate approaching zero is an equivalent problem. Another family of storage codes on triangle-free graphs of rate approaching $1$ was constructed earlier by A. Golovnev and I. Haviv (36th Computational Complexity Conf., 2021), relying on a different family of graphs.
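The defining property is simple to test by brute force: no two codewords may agree on a vertex's neighborhood yet differ at that vertex. The Python sketch below (names hypothetical) checks this for the parity code on $K_4$, a rate-$3/4$ toy example; $K_4$ is of course not triangle-free, so this only illustrates the definition.

```python
import itertools

def is_storage_code(adj, code):
    """Every vertex must be recoverable from its neighborhood: no two
    codewords may agree on N(v) yet differ at v."""
    for c1 in code:
        for c2 in code:
            for v, nbrs in adj.items():
                if all(c1[u] == c2[u] for u in nbrs) and c1[v] != c2[v]:
                    return False
    return True

# K_4 with the even-weight (parity) code: x_v = sum of neighbors mod 2,
# so each vertex is determined by its neighborhood. Rate = log2(8)/4 = 3/4.
adj = {v: [u for u in range(4) if u != v] for v in range(4)}
code = [c for c in itertools.product((0, 1), repeat=4) if sum(c) % 2 == 0]
print(is_storage_code(adj, code), len(code))  # True 8
```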
We devise fast and provably accurate algorithms to transform between an $N\times N \times N$ Cartesian voxel representation of a three-dimensional function and its expansion in ball harmonics, that is, the eigenbasis of the Dirichlet Laplacian on the unit ball in $\mathbb{R}^3$. Given $\varepsilon > 0$, our algorithms achieve relative $\ell^1$-$\ell^\infty$ accuracy $\varepsilon$ in time $O(N^3 (\log N)^2 + N^3 |\log \varepsilon|^2)$, whereas their dense counterparts have time complexity $O(N^6)$. We illustrate our methods on numerical examples.
We investigate completions of partial combinatory algebras (pcas), in particular of Kleene's second model $\mathcal{K}_2$ and generalizations thereof. We consider weak and strong notions of embeddability and completion that have been studied before. By a result of Klop it is known that not every pca has a strong completion. Our study of completions of $\mathcal{K}_2$ yields as corollaries that weak and strong embeddings differ, and that every countable pca has a weak completion. We then consider generalizations of $\mathcal{K}_2$ for larger cardinals, and use these to show that it is consistent that every pca has a weak completion.
Suppose that $P$ is a property that may be satisfied by a random code $C \subset \Sigma^n$. For example, for some $p \in (0,1)$, $P$ might be the property that there exist three elements of $C$ that lie in some Hamming ball of radius $pn$. We say that $R^*$ is the threshold rate for $P$ if a random code of rate $R^* + \epsilon$ is very likely to satisfy $P$, while a random code of rate $R^* - \epsilon$ is very unlikely to satisfy $P$. While random codes are well studied in coding theory, even the threshold rates for relatively simple properties like the one above are not well understood. We characterize threshold rates for a rich class of properties. These properties, like the example above, are defined by the inclusion of specific sets of codewords and are suitably "symmetric". For properties in this class, we show that the threshold rate is in fact equal to the lower bound obtained by a simple first-moment calculation. Our techniques not only pin down the threshold rate for the property $P$ above; they also give sharp bounds on the threshold rate for list-recovery in several parameter regimes, as well as an efficient algorithm for estimating threshold rates for list-recovery in general.
Depth first search (DFS) is a fundamental graph problem with a wide range of applications. For a graph $G=(V,E)$ with $n$ vertices and $m$ edges, the DFS tree can be computed in $O(m+n)$ time using $O(m)$ space, where $m=O(n^2)$. In the streaming environment, most graph problems are studied in the semi-streaming model, where several passes (preferably one) are allowed over the input and $O(nk)$ local space is available for some $k=o(n)$. Trivially, DFS can be computed in one pass using $O(m)$ space, or in $O(n)$ passes using $O(n)$ space. Khan and Mehta [STACS19] presented several algorithms allowing trade-offs between space and passes, where $O(nk)$ space results in $O(n/k)$ passes. They also showed empirically that their algorithm requires only a few passes in practice even with $O(n)$ space. Chang et al. [STACS20] presented an alternate proof of the same and also gave an $O(\sqrt{n})$-pass algorithm requiring $O(n\,\mathrm{poly}\log n)$ space, with a finer trade-off between space and passes. However, their algorithm uses complex black-box subroutines, making it impractical. We perform an experimental analysis of the practical semi-streaming DFS algorithms. Our analysis ranges from real graphs to random graphs (uniform and power-law). We also present several heuristics to improve the state-of-the-art algorithms and study their impact. Our heuristics improve the state of the art by $40-90\%$, achieving the optimal single pass in almost $40-50\%$ of cases (improved from zero). On random graphs, they give improvements of $30-90\%$, again requiring the optimal single pass even for very small values of $k$. Overall, our heuristics significantly improved the relatively complex state-of-the-art algorithm, requiring merely two passes in the worst case for random graphs. Additionally, our heuristics made the relatively simpler algorithm practically usable even for very small space bounds, whereas it was impractical earlier.
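For reference, the classical in-memory DFS that these streaming algorithms emulate under space constraints runs in $O(m+n)$ time. A minimal iterative Python sketch (adjacency lists, names hypothetical):

```python
def dfs_tree(adj, root):
    """Iterative DFS; returns the parent pointers of a DFS tree.
    An explicit iterator stack mirrors the recursion, so vertices are
    finished in proper DFS order (mark-on-push would be incorrect)."""
    parent = {root: None}
    stack = [(root, iter(adj[root]))]
    while stack:
        u, it = stack[-1]
        for w in it:
            if w not in parent:
                parent[w] = u
                stack.append((w, iter(adj[w])))
                break
        else:
            stack.pop()   # all neighbors of u explored
    return parent

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(dfs_tree(adj, 0))   # {0: None, 1: 0, 2: 1, 3: 2}
```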
The associahedron is the graph $\mathcal{G}_N$ that has as nodes all triangulations of a convex $N$-gon, and an edge between any two triangulations that differ in a flip operation, which consists of removing an edge shared by two triangles and replacing it by the other diagonal of the resulting 4-gon. In this paper, we consider a large collection of induced subgraphs of $\mathcal{G}_N$ obtained by Ramsey-type colorability properties. Specifically, coloring the points of the $N$-gon red and blue alternately, we consider only colorful triangulations, namely triangulations in which every triangle has points of both colors, i.e., monochromatic triangles are forbidden. The resulting induced subgraph of $\mathcal{G}_N$ on colorful triangulations is denoted by $\mathcal{F}_N$. We prove that $\mathcal{F}_N$ has a Hamilton cycle for all $N\geq 8$, resolving a problem raised by Sagan: all colorful triangulations on $N$ points can be listed so that any two cyclically consecutive triangulations differ in a flip. In fact, we prove that for an arbitrary fixed coloring pattern of the $N$ points with at least 10 changes of color, the resulting subgraph of $\mathcal{G}_N$ on colorful triangulations (for that coloring pattern) admits a Hamilton cycle. We also provide an efficient algorithm for computing a Hamilton path in $\mathcal{F}_N$ that runs in time $\mathcal{O}(1)$ on average per generated node. This algorithm is based on a new and algorithmic construction of a tree rotation Gray code for listing all $n$-vertex $k$-ary trees that runs in time $\mathcal{O}(k)$ on average per generated tree.
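The flip operation and the colorfulness constraint are easy to state in code. The Python sketch below (a hypothetical minimal representation of triangulations as sets of triangles, not the paper's Gray-code algorithm) illustrates both:

```python
def flip(triangles, d):
    """Flip diagonal d = frozenset({a, c}): replace the two triangles
    sharing d by the other two triangles of their quadrilateral."""
    t1, t2 = [t for t in triangles if d <= t]
    b, = t1 - d                       # the two apexes of the 4-gon
    e, = t2 - d
    a, c = d
    return (triangles - {t1, t2}) | {frozenset({a, b, e}),
                                     frozenset({c, b, e})}

def is_colorful(triangles, color):
    """No monochromatic triangle (color maps each vertex to 0 or 1)."""
    return all(len({color[v] for v in t}) == 2 for t in triangles)

# Square 0-1-2-3 triangulated with the diagonal {0, 2}.
T = {frozenset({0, 1, 2}), frozenset({0, 2, 3})}
print(flip(T, frozenset({0, 2})))             # triangles {0,1,3}, {1,2,3}
print(is_colorful(T, {v: v % 2 for v in range(4)}))  # True (alternating)
```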