We introduce a new class of algorithms for finding a short vector in lattices defined by codes of co-dimension $k$ over $\mathbb{Z}_P^d$, where $P$ is prime. The co-dimension $1$ case is solved by exploiting the packing properties of the projections mod $P$ of an initial set of non-lattice vectors onto a single dual codeword. The technical tools we introduce are sorting of the projections followed by single-step pairwise Euclidean reduction of the projections, resulting in monotonic convergence of the positive-valued projections to zero. The length of the vectors grows by a geometric factor in each iteration. For fixed $P$ and $d$, and large enough user-defined input sets, we show that it is possible to minimize the number of iterations, and thus the overall length expansion factor, and thereby obtain a short lattice vector. This yields a novel approach for controlling the output length, which resolves an open problem posed by Noah Stephens-Davidowitz (the possibility of an approximation scheme for the shortest-vector problem (SVP) that does not reduce to near-exact SVP). With our approach, one may obtain short vectors even when the lattice dimension is quite large, e.g., $8000$. For fixed $P$, the algorithm yields shorter vectors for larger $d$. We additionally present a number of extensions and generalizations of our fundamental co-dimension $1$ method: a method for obtaining many different lattice vectors by multiplying the dual codeword by an integer and then reducing mod $P$; a co-dimension $k$ generalization; a large-input-set generalization; and finally a ``block'' generalization, in which pairwise (Euclidean) reduction is replaced by $k$-party (non-Euclidean) reduction. The $k$-block generalization of our algorithm constitutes a class of polynomial-time algorithms indexed by $k\geq 2$ which yield successively improved approximations to the shortest-vector problem.
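To make the co-dimension $1$ mechanics concrete, here is a minimal NumPy sketch of one sort-then-reduce iteration as we read the description above; all identifiers (`codim1_step`, `dual_codeword`) are our own, and the authors' selection and bookkeeping may well differ.

```python
import numpy as np

def codim1_step(vectors, dual_codeword, P):
    """One sort-then-pair iteration of the co-dimension-1 scheme (our sketch).

    vectors: (m, d) integer array of candidates; a vector v lies in the
    lattice iff <v, dual_codeword> == 0 (mod P).
    """
    proj = (vectors @ dual_codeword) % P
    order = np.argsort(proj)
    vectors, proj = vectors[order], proj[order]
    found = vectors[proj == 0]   # zero projection => lattice vector
    live = vectors[proj != 0]
    # Single-step pairwise Euclidean reduction: adjacent differences in
    # sorted order replace two projections by their (small) gap, so the
    # positive projections decrease monotonically across iterations.
    reduced = live[1:] - live[:-1]
    return reduced, found
```

Each output vector is a difference of two inputs, so norms at most double per iteration (the geometric length-expansion factor); a large initial set packs the residues in $[0,P)$ densely, so few iterations, and hence little expansion, should be needed before a projection reaches zero.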
In 1991, Brenier proved a theorem that generalizes the polar decomposition of square matrices -- factored as PSD $\times$ unitary -- to any vector field $F:\mathbb{R}^d\rightarrow \mathbb{R}^d$. The theorem, known as the polar factorization theorem, states that any field $F$ can be recovered as the composition of the gradient of a convex function $u$ with a measure-preserving map $M$, namely $F=\nabla u \circ M$. We propose a practical implementation of this far-reaching theoretical result, and explore possible uses within machine learning. The theorem is closely related to optimal transport (OT) theory, and we borrow from recent advances in the field of neural optimal transport to parameterize the potential $u$ as an input convex neural network. The map $M$ can either be evaluated pointwise using $u^*$, the convex conjugate of $u$, through the identity $M=\nabla u^* \circ F$, or learned as an auxiliary network. Because $M$ is, in general, not injective, we consider the additional task of estimating the ill-posed inverse map that can approximate the pre-image measure $M^{-1}$ using a stochastic generator. We illustrate possible applications of \citeauthor{Brenier1991PolarFA}'s polar factorization to non-convex optimization problems, as well as sampling of densities that are not log-concave.
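The pointwise evaluation $M=\nabla u^* \circ F$ can be demonstrated without any learned network. In the sketch below, a hand-picked smooth convex $u$ stands in for the input convex neural network, and $\nabla u^*(z)$ is computed by numerically solving the defining maximization; everything here is an illustrative assumption, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative convex potential; in the paper u would be a trained
# input convex neural network (this toy u is our assumption).
def u(y):
    return 0.25 * np.sum(y ** 4) + 0.5 * np.sum(y ** 2)

grad_u = lambda y: y ** 3 + y  # gradient of the toy u above

def grad_u_conjugate(z):
    """grad u*(z) = argmax_y <z, y> - u(y), computed numerically."""
    res = minimize(lambda y: u(y) - z @ y, np.zeros_like(z))
    return res.x

def M(F, x):
    """Pointwise evaluation of the measure-preserving factor M = grad u* o F."""
    return grad_u_conjugate(F(x))

F = lambda x: np.array([x[1] ** 3, -x[0]])  # an arbitrary vector field
x0 = np.array([0.7, -0.3])
m = M(F, x0)
print(np.allclose(grad_u(m), F(x0), atol=1e-4))  # checks F = grad u o M at x0
```

The final check only verifies the conjugacy identity $\nabla u(\nabla u^*(z)) = z$ at one point; recovering Brenier's factorization itself requires $u$ to be the OT potential associated with $F$, which is what the paper learns.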
An \emph{eight-partition} of a finite set of points (respectively, of a continuous mass distribution) in $\mathbb{R}^3$ consists of three planes that divide the space into $8$ octants, such that each open octant contains at most $1/8$ of the points (respectively, of the mass). In 1966, Hadwiger showed that any mass distribution in $\mathbb{R}^3$ admits an eight-partition; moreover, one can prescribe the normal direction of one of the three planes. The analogous result for finite point sets follows by a standard limit argument. We prove the following variant of this result: Any mass distribution (or point set) in $\mathbb{R}^3$ admits an eight-partition for which the intersection of two of the planes is a line with a prescribed direction. Moreover, we present an efficient algorithm for calculating an eight-partition of a set of $n$ points in~$\mathbb{R}^3$ (with prescribed normal direction of one of the planes) in time $O^{*}(n^{5/2})$.
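While the paper's $O^{*}(n^{5/2})$ algorithm is more involved, verifying the eight-partition property for a given triple of planes is a simple count over sign patterns; the following brute-force NumPy check (our own, not the paper's algorithm) makes the definition concrete.

```python
import numpy as np

def is_eight_partition(points, planes):
    """Brute-force check of the eight-partition property.

    points: (n, 3) array; planes: three (normal, offset) pairs, each
    describing the plane {x : <normal, x> = offset}. Every open octant,
    i.e., every strict sign pattern, may contain at most n/8 points.
    """
    n = len(points)
    vals = np.stack([points @ nrm - off for nrm, off in planes], axis=1)
    signs = np.sign(vals)
    strict = np.all(signs != 0, axis=1)   # ignore points lying on a plane
    octant = (signs[strict] > 0).astype(int) @ np.array([1, 2, 4])
    counts = np.bincount(octant, minlength=8)
    return bool(np.all(counts <= n / 8))
```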
We study the complexity of approximating the number of answers to a small query $\varphi$ in a large database $\mathcal{D}$. We establish an exhaustive classification into tractable and intractable cases if $\varphi$ is a conjunctive query with disequalities and negations: $\bullet$ If there is a constant bound on the arity of $\varphi$, and if the randomised Exponential Time Hypothesis (rETH) holds, then the problem has a fixed-parameter tractable randomised approximation scheme (FPTRAS) if and only if the treewidth of $\varphi$ is bounded. $\bullet$ If the arity is unbounded and we allow disequalities only, then the problem has an FPTRAS if and only if the adaptive width of $\varphi$ (a width measure strictly more general than treewidth) is bounded; the lower bound relies on the rETH as well. Additionally, we show that our results cannot be strengthened to achieve a fully polynomial randomised approximation scheme (FPRAS): we observe that, unless $\mathrm{NP} =\mathrm{RP}$, there is no FPRAS even if the treewidth (and the adaptive width) is $1$. However, if there are neither disequalities nor negations, we prove the existence of an FPRAS for queries of bounded fractional hypertreewidth, strictly generalising the recently established FPRAS for conjunctive queries with bounded hypertreewidth due to Arenas, Croquevielle, Jayaram and Riveros (STOC 2021).
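The tractability criterion in the bounded-arity case is boundedness of the treewidth of $\varphi$. As a rough illustration (our own; whether disequality atoms contribute edges depends on the paper's exact definitions), one can bound the treewidth of a query's primal graph with a standard heuristic:

```python
import networkx as nx
from networkx.algorithms import approximation

# Primal graph of a hypothetical query
#   phi = R(x1,x2) AND S(x2,x3) AND T(x3,x4) AND x1 != x4:
# vertices are variables; edges join variables co-occurring in an atom
# (here we let the disequality contribute the edge x1-x4).
G = nx.Graph([("x1", "x2"), ("x2", "x3"), ("x3", "x4"), ("x1", "x4")])
width, _tree = approximation.treewidth_min_fill_in(G)  # heuristic upper bound
print(width)  # 2: a 4-cycle has treewidth 2
```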
We analyze deep Neural Network emulation rates of smooth functions with point singularities in bounded, polytopal domains $\mathrm{D} \subset \mathbb{R}^d$, $d=2,3$. We prove exponential emulation rates in Sobolev spaces in terms of the number of neurons and in terms of the number of nonzero coefficients for Gevrey-regular solution classes defined in terms of weighted Sobolev scales in $\mathrm{D}$, comprising the countably-normed spaces of I.M. Babu\v{s}ka and B.Q. Guo. As an intermediate result, we prove that continuous, piecewise polynomial, high-order (``$p$-version'') finite elements with elementwise polynomial degree $p\in\mathbb{N}$ on arbitrary, regular, simplicial partitions of polyhedral domains $\mathrm{D} \subset \mathbb{R}^d$, $d\geq 2$, can be exactly emulated by neural networks combining ReLU and ReLU$^2$ activations. On shape-regular, simplicial partitions of polytopal domains $\mathrm{D}$, both the number of neurons and the number of nonzero parameters are proportional to the number of degrees of freedom of the finite element space, in particular for the $hp$-Finite Element Method of I.M. Babu\v{s}ka and B.Q. Guo.
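The role of ReLU$^2$ activations in such exact-emulation results can be seen from two standard identities (our illustration; the paper's construction is more involved): $x^2=\mathrm{ReLU}(x)^2+\mathrm{ReLU}(-x)^2$ and, by polarization, $ab=\tfrac12\big((a+b)^2-a^2-b^2\big)$, so a fixed-size ReLU$^2$ layer multiplies exactly and products of coordinates, hence piecewise polynomials, are reproduced without approximation error.

```python
import numpy as np

relu = lambda t: np.maximum(t, 0.0)

def sq(t):
    # Exact square from two ReLU^2 units: x^2 = ReLU(x)^2 + ReLU(-x)^2.
    return relu(t) ** 2 + relu(-t) ** 2

def mul(a, b):
    # Exact product by polarization: ab = ((a+b)^2 - a^2 - b^2) / 2.
    return 0.5 * (sq(a + b) - sq(a) - sq(b))

x = np.linspace(-2.0, 2.0, 9)
print(np.allclose(sq(x), x ** 2), np.allclose(mul(x, 3.0 - x), x * (3.0 - x)))
```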
We design a deterministic subexponential time algorithm that takes as input a multivariate polynomial $f$ computed by a constant-depth circuit over the rational numbers, and outputs a list $L$ of circuits (of unbounded depth and possibly with division gates) that contains all irreducible factors of $f$ computable by constant-depth circuits. This list $L$ might also include circuits that are spurious: they either do not correspond to factors of $f$ or are not even well-defined, e.g., when the input to a division gate is a sub-circuit that computes the identically zero polynomial. The key technical ingredient of our algorithm is a notion of the pseudo-resultant of $f$ and a factor $g$, which serves as a proxy for the resultant of $g$ and $f/g$, with the advantage that the circuit complexity of the pseudo-resultant is comparable to that of the circuit complexity of $f$ and $g$. This notion, which might be of independent interest, together with the recent results of Limaye, Srinivasan and Tavenas, helps us derandomize one key step of multivariate polynomial factorization algorithms: that of deterministically finding a good starting point for Newton iteration for the case when the input polynomial as well as the irreducible factor of interest have small constant-depth circuits.
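The pseudo-resultant itself is a circuit-level construction we do not reproduce here, but the classical object it proxies is easy to inspect: the resultant of a factor $g$ and its cofactor $f/g$ is nonzero precisely when the two share no common root. A small SymPy illustration (our own toy example):

```python
import sympy as sp

x, y = sp.symbols("x y")
g = x**2 + y          # a (hypothetical) factor
h = x**2 + y + 1      # its cofactor f/g
f = sp.expand(g * h)

# Res_x(g, f/g) vanishes exactly where g and f/g share a root; a nonzero
# value certifies the separation that a "good starting point" for Newton
# iteration requires.
print(sp.resultant(g, h, x))  # prints 1 here: the factors stay separated
```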
The Mapper algorithm is a visualization technique in topological data analysis (TDA) that outputs a graph reflecting the structure of a given dataset. However, the Mapper algorithm requires tuning several parameters in order to generate a ``nice'' Mapper graph. This paper focuses on selecting the cover parameter. We present an algorithm that optimizes the cover of a Mapper graph by splitting a cover repeatedly according to a statistical test for normality. Our algorithm is based on $G$-means clustering, which searches for the optimal number of clusters in $k$-means by iteratively applying the Anderson-Darling test. Our splitting procedure employs a Gaussian mixture model to carefully choose the cover according to the distribution of the given data. Experiments on synthetic and real-world datasets demonstrate that our algorithm generates covers so that the Mapper graphs retain the essence of the datasets, while also running quickly.
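A minimal sketch of the splitting idea in the spirit of $G$-means (our reconstruction, not the authors' implementation: the paper chooses splits with a Gaussian mixture model, while for brevity we split at the median):

```python
import numpy as np
from scipy.stats import anderson

def adaptive_cover_1d(filter_values, sig_index=2, max_depth=8):
    """Refine a 1-D cover G-means style: split an interval whenever the
    filter values inside it fail an Anderson-Darling normality test."""
    def split(vals, depth):
        if depth == 0 or len(vals) < 8:
            return [(vals.min(), vals.max())]
        res = anderson(vals, dist="norm")
        if res.statistic < res.critical_values[sig_index]:
            return [(vals.min(), vals.max())]   # plausibly Gaussian: keep
        mid = np.median(vals)
        lo, hi = vals[vals <= mid], vals[vals > mid]
        if len(lo) == 0 or len(hi) == 0:
            return [(vals.min(), vals.max())]
        return split(lo, depth - 1) + split(hi, depth - 1)
    return split(np.sort(np.asarray(filter_values)), max_depth)
```

Here `sig_index=2` selects the 5% significance level among SciPy's tabulated Anderson-Darling critical values; intervals that look Gaussian are kept whole, and the rest are recursively refined.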
It is a longstanding conjecture that every simple drawing of a complete graph on $n \geq 3$ vertices contains a crossing-free Hamiltonian cycle. We strengthen this conjecture to "there exists a crossing-free Hamiltonian path between each pair of vertices" and show that this stronger conjecture holds for several classes of simple drawings, including strongly c-monotone drawings and cylindrical drawings. As a second main contribution, we give an overview of different classes of simple drawings and investigate inclusion relations between them up to weak isomorphism.
For positive integers $d$ and $p$ such that $d \ge p$, we obtain complete asymptotic expansions, for large $d$, of the normalizing constants for the matrix Bingham and matrix Langevin distributions on Stiefel manifolds. The accuracy of each truncated expansion is strictly increasing in $d$; also, for sufficiently large $d$, the accuracy is strictly increasing in $m$, the number of terms in the truncated expansion. We apply these results to obtain the rate of convergence of these asymptotic expansions as both $d, p \to \infty$. Using values of $d$ and $p$ arising in various data sets, we illustrate the rate of convergence of the truncated approximations as $d$ or $m$ increases. These results extend our recent work on asymptotic expansions for the normalizing constants of the high-dimensional Bingham distributions.
It is well known that any graph admits a crossing-free straight-line drawing in $\mathbb{R}^3$ and that any planar graph admits the same even in $\mathbb{R}^2$. For a graph $G$ and $d \in \{2,3\}$, let $\rho^1_d(G)$ denote the smallest number of lines in $\mathbb{R}^d$ whose union contains a crossing-free straight-line drawing of $G$. For $d=2$, $G$ must be planar. Similarly, let $\rho^2_3(G)$ denote the smallest number of planes in $\mathbb{R}^3$ whose union contains a crossing-free straight-line drawing of $G$. We investigate the complexity of computing these three parameters and obtain the following hardness and algorithmic results. - For $d\in\{2,3\}$, we prove that deciding whether $\rho^1_d(G)\le k$ for a given graph $G$ and integer $k$ is ${\exists\mathbb{R}}$-complete. - Since $\mathrm{NP}\subseteq{\exists\mathbb{R}}$, deciding $\rho^1_d(G)\le k$ is NP-hard for $d\in\{2,3\}$. On the positive side, we show that the problem is fixed-parameter tractable with respect to $k$. - Since ${\exists\mathbb{R}}\subseteq\mathrm{PSPACE}$, both $\rho^1_2(G)$ and $\rho^1_3(G)$ are computable in polynomial space. On the negative side, we show that drawings that are optimal with respect to $\rho^1_2$ or $\rho^1_3$ sometimes require irrational coordinates. - We prove that deciding whether $\rho^2_3(G)\le k$ is NP-hard for any fixed $k \ge 2$. Hence, the problem is not fixed-parameter tractable with respect to $k$ unless $\mathrm{P}=\mathrm{NP}$.
Let $(P,E)$ be a $(d+1)$-uniform geometric hypergraph, where $P$ is an $n$-point set in general position in $\mathbb{R}^d$ and $E\subseteq {P\choose d+1}$ is a collection of $\epsilon{n\choose d+1}$ $d$-dimensional simplices with vertices in $P$, for $0<\epsilon\leq 1$. We show that there is a point $x\in {\mathbb R}^d$ that pierces $\displaystyle \Omega\left(\epsilon^{(d^4+d)(d+1)+\delta}{n\choose d+1}\right)$ simplices in $E$, for any fixed $\delta>0$. This is a dramatic improvement, in all dimensions $d\geq 3$, over the previous lower bounds of the general form $\displaystyle \epsilon^{(cd)^{d+1}}n^{d+1}$, which date back to the seminal 1991 work of Alon, B\'{a}r\'{a}ny, F\"{u}redi and Kleitman. As a result, any $n$-point set in general position in $\mathbb{R}^d$ admits only $\displaystyle O\left(n^{d-\frac{1}{d(d-1)^4+d(d-1)}+\delta}\right)$ halving hyperplanes, for any $\delta>0$, which is a significant improvement over the previously best known bound $\displaystyle O\left(n^{d-\frac{1}{(2d)^{d}}}\right)$ in all dimensions $d\geq 5$. An essential ingredient of our proof is the following semi-algebraic Tur\'an-type result of independent interest: Let $(V_1,\ldots,V_k,E)$ be a hypergraph of bounded semi-algebraic description complexity in ${\mathbb R}^d$ that satisfies $|E|\geq \varepsilon |V_1|\cdot\ldots\cdot |V_k|$ for some $\varepsilon>0$. Then there exist subsets $W_i\subseteq V_i$ that satisfy $W_1\times W_2\times\ldots\times W_k\subseteq E$ and $|W_1|\cdot\ldots\cdot|W_k|=\Omega\left(\varepsilon^{d(k-1)+1}|V_1|\cdot |V_2|\cdot\ldots\cdot|V_k|\right)$.