We prove that for any integers $\alpha, \beta > 1$, the existential fragment of the first-order theory of the structure $\langle \mathbb{Z}; 0,1,<, +, \alpha^{\mathbb{N}}, \beta^{\mathbb{N}}\rangle$ is decidable (where $\alpha^{\mathbb{N}}$ is the set of positive integer powers of $\alpha$, and likewise for $\beta^{\mathbb{N}}$). On the other hand, we show by way of hardness that decidability of the existential fragment of the theory of $\langle \mathbb{N}; 0,1, <, +, x\mapsto \alpha^x, x \mapsto \beta^x\rangle$ for any multiplicatively independent $\alpha,\beta > 1$ would lead to mathematical breakthroughs regarding base-$\alpha$ and base-$\beta$ expansions of certain transcendental numbers.
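As a concrete illustration (our example, not one drawn from the paper), the following existential sentence of this structure asks whether some power of $\alpha$ exceeds some power of $\beta$ by exactly one:
\[
\exists x\, \exists y\; \bigl( x \in \alpha^{\mathbb{N}} \,\wedge\, y \in \beta^{\mathbb{N}} \,\wedge\, x = y + 1 \bigr).
\]
For $\alpha=3$ and $\beta=2$ the sentence is true (witnessed by $9 = 8 + 1$), and decidability of the existential fragment means that all such questions can be answered algorithmically.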
We provide a perfect sampling algorithm for the hard-sphere model on subsets of $\mathbb{R}^d$ with expected running time linear in the volume under the assumption of strong spatial mixing. A large number of perfect and approximate sampling algorithms have been devised for the hard-sphere model; our perfect sampling algorithm is efficient for a range of parameters for which only efficient approximate samplers were previously known, and it is faster than these approximate approaches. Our methods also extend to the more general setting of Gibbs point processes interacting via finite-range, repulsive potentials.
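For contrast, here is a minimal exact (perfect) sampler for the hard-sphere model on a box, by dominated rejection from a Poisson point process: propose a Poisson configuration and accept only if no two centres are closer than the exclusion distance. This is a baseline sketch, not the linear-time algorithm of the paper (its expected running time grows exponentially with the volume); the box side L, intensity lam, and radius r are assumed parameters chosen for illustration.
\begin{verbatim}
import numpy as np

def perfect_hard_sphere(L=1.0, lam=5.0, r=0.05, d=2, seed=None):
    """Exact sampler by rejection: draw a Poisson point process on the
    box [0, L]^d and accept only if all pairwise distances are >= 2r."""
    rng = np.random.default_rng(seed)
    while True:
        n = rng.poisson(lam * L**d)
        pts = rng.uniform(0.0, L, size=(n, d))
        diff = pts[:, None, :] - pts[None, :, :]
        dist2 = (diff**2).sum(axis=-1)
        np.fill_diagonal(dist2, np.inf)
        if n < 2 or dist2.min() >= (2 * r) ** 2:
            return pts  # an exact draw from the hard-sphere measure

print(len(perfect_hard_sphere(seed=0)))
\end{verbatim}
Accepted configurations follow the hard-sphere law exactly, because conditioning a Poisson process on the hard-core event is precisely what rejection implements.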
We show that every Borel graph $G$ of subexponential growth has a Borel proper edge-coloring with $\Delta(G) + 1$ colors. We deduce this from a stronger result, namely that an $n$-vertex (finite) graph $G$ of subexponential growth can be properly edge-colored using $\Delta(G) + 1$ colors by an $O(\log^\ast n)$-round deterministic distributed algorithm in the $\mathsf{LOCAL}$ model, where the implied constants in the $O(\cdot)$ notation are determined by a bound on the growth rate of $G$.
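For orientation, a classical baseline (not the paper's result): sequential greedy edge coloring assigns each edge the least color absent from the at most $2\Delta - 2$ edges touching its endpoints, and hence uses at most $2\Delta - 1$ colors; the theorem above says that, for graphs of subexponential growth, the optimal $\Delta + 1$ colors can be reached even under Borel or distributed constraints. A sketch of the greedy baseline:
\begin{verbatim}
from collections import defaultdict

def greedy_edge_coloring(edges):
    """Proper edge coloring with at most 2*Delta - 1 colors:
    each edge gets the least color unused at either endpoint."""
    used = defaultdict(set)   # vertex -> colors on incident edges
    coloring = {}
    for u, v in edges:
        c = 0
        while c in used[u] or c in used[v]:
            c += 1
        coloring[(u, v)] = c
        used[u].add(c)
        used[v].add(c)
    return coloring

# Example: a triangle (Delta = 2) is colored with 3 colors.
print(greedy_edge_coloring([(0, 1), (1, 2), (0, 2)]))
\end{verbatim}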
Suppose that $k\le n$ agents are placed in a simple, connected, and undirected graph $G=(V,E)$ with $n$ nodes and $m$ edges. The goal of the dispersion problem is to move these $k$ agents to mutually distinct nodes. Agents can communicate only when they are at the same node, and no other communication means, such as whiteboards, are available. We assume that the agents operate synchronously. We consider two scenarios: when all agents are initially located at a single node (rooted setting) and when they are initially distributed over one or more nodes (general setting). Kshemkalyani and Sharma presented a dispersion algorithm for the general setting, which uses $O(m_k)$ time and $O(\log(k + \Delta))$ bits of memory per agent [OPODIS 2021], where $m_k$ is the maximum number of edges in any induced subgraph of $G$ with $k$ nodes, and $\Delta$ is the maximum degree of $G$. This algorithm is currently the fastest in the literature, as no $o(m_k)$-time algorithm has been discovered, even for the rooted setting. In this paper, we present significantly faster algorithms for both the rooted and the general settings. First, we present an algorithm for the rooted setting that solves the dispersion problem in $O(k\log \min(k,\Delta))=O(k\log k)$ time using $O(\log (k+\Delta))$ bits of memory per agent. Next, we propose an algorithm for the general setting that achieves dispersion in $O(k \log k \cdot \log \min(k,\Delta))=O(k \log^2 k)$ time using $O(\log (k+\Delta))$ bits. Finally, for the rooted setting, we give a time-optimal (i.e.,~$O(k)$-time) algorithm with $O(\Delta+\log k)$ bits of space per agent. All algorithms presented in this paper work only in the synchronous setting, while several algorithms in the literature, including the one given by Kshemkalyani and Sharma at OPODIS 2021, work in the asynchronous setting.
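To make the rooted setting concrete, here is a centralized simulation (our illustrative sketch, not any of the paper's algorithms) of the folklore DFS-based strategy: all $k$ agents move together along a depth-first traversal from the root, and one agent settles at each newly discovered node, so $k$ distinct nodes end up holding exactly one agent each.
\begin{verbatim}
def dfs_dispersion(adj, root, k):
    """All k agents walk a DFS from the root; one agent settles at
    each newly visited node. Returns node -> settled agent id."""
    settled = {}
    stack, seen = [root], {root}
    agent = 0
    while stack and agent < k:
        v = stack.pop()
        settled[v] = agent   # one agent settles here
        agent += 1
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return settled

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(dfs_dispersion(adj, root=0, k=3))  # three distinct occupied nodes
\end{verbatim}
Since $G$ is connected and $k \le n$, the traversal always finds enough distinct nodes; the point of the paper is to beat the traversal time of such DFS-based schemes.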
For the problem of maximizing a monotone, submodular function with respect to a cardinality constraint $k$ on a ground set of size $n$, we provide an algorithm that achieves the state-of-the-art in both its empirical performance and its theoretical properties, in terms of adaptive complexity, query complexity, and approximation ratio; that is, it obtains, with high probability, query complexity of $O(n)$ in expectation, adaptivity of $O(\log(n))$, and approximation ratio of nearly $1-1/e$. The main algorithm is assembled from two components which may be of independent interest. The first component of our algorithm, LINEARSEQ, is useful as a preprocessing algorithm to improve the query complexity of many algorithms. Moreover, a variant of LINEARSEQ is shown to have adaptive complexity of $O( \log (n / k) )$ which is smaller than that of any previous algorithm in the literature. The second component is a parallelizable thresholding procedure THRESHOLDSEQ for adding elements with gain above a constant threshold. Finally, we demonstrate that our main algorithm empirically outperforms, in terms of runtime, adaptive rounds, total queries, and objective values, the previous state-of-the-art algorithm FAST in a comprehensive evaluation with six submodular objective functions.
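As a point of reference (the classical baseline against which the ratio is measured, not LINEARSEQ or THRESHOLDSEQ themselves), the standard greedy algorithm attains the optimal $1-1/e$ ratio with $O(nk)$ queries and $k$ adaptive rounds; the algorithms above push adaptivity down to $O(\log n)$ while keeping queries linear. A sketch with a coverage objective:
\begin{verbatim}
def greedy_max(ground, f, k):
    """Classical greedy for monotone submodular f under |S| <= k:
    (1 - 1/e)-approximation, O(n*k) oracle queries, k adaptive rounds."""
    S = set()
    for _ in range(k):
        best = max((e for e in ground if e not in S),
                   key=lambda e: f(S | {e}) - f(S), default=None)
        if best is None:
            break
        S.add(best)
    return S

# Example: coverage functions are monotone and submodular.
sets = {1: {1, 2, 3}, 2: {3, 4}, 3: {4, 5, 6}, 4: {1, 6}}
f = lambda S: len(set().union(*(sets[e] for e in S))) if S else 0
print(greedy_max(sets.keys(), f, k=2))  # {1, 3}: covers all 6 elements
\end{verbatim}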
We give a comprehensive description of Wasserstein gradient flows of maximum mean discrepancy (MMD) functionals $\mathcal F_\nu := \text{MMD}_K^2(\cdot, \nu)$ towards given target measures $\nu$ on the real line, where we focus on the negative distance kernel $K(x,y) := -|x-y|$. In one dimension, the Wasserstein-2 space can be isometrically embedded into the cone $\mathcal C(0,1) \subset L_2(0,1)$ of quantile functions, which leads to a characterization of Wasserstein gradient flows via the solution of an associated Cauchy problem on $L_2(0,1)$. Based on the construction of an appropriate counterpart of $\mathcal F_\nu$ on $L_2(0,1)$ and its subdifferential, we provide a solution of the Cauchy problem. For discrete target measures $\nu$, this results in a piecewise linear solution formula. We prove invariance and smoothing properties of the flow on subsets of $\mathcal C(0,1)$. For certain $\mathcal F_\nu$-flows this implies that initial point measures instantly become absolutely continuous, and stay so over time. Finally, we illustrate the behavior of the flow by various numerical examples using an implicit Euler scheme and demonstrate differences from the explicit Euler scheme, which is easier to compute but comes with limited convergence guarantees.
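To illustrate the explicit Euler scheme mentioned at the end, here is a minimal particle sketch in 1D (our simplification, not the paper's quantile-space construction; step size, particle counts, and the time scaling by $n$ are assumptions of this sketch). For empirical measures and $K(x,y) = -|x-y|$, the gradient of $\text{MMD}_K^2$ in a particle position reduces to sums of signs:
\begin{verbatim}
import numpy as np

def mmd_grad(x, y):
    """Gradient of MMD_K^2(mu_x, nu_y) in the particles x, for the
    negative distance kernel K(a,b) = -|a-b| and uniform weights."""
    n, m = len(x), len(y)
    g_self = -(2 / n**2) * np.sign(x[:, None] - x[None, :]).sum(axis=1)
    g_attr = (2 / (n * m)) * np.sign(x[:, None] - y[None, :]).sum(axis=1)
    return g_self + g_attr

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 0.0, 50)       # initial particles
y = rng.normal(2.0, 1.0, 50)         # discrete target nu
tau = 0.05                           # step size (arbitrary choice)
for _ in range(500):                 # explicit Euler steps
    x = x - tau * len(x) * mmd_grad(x, y)  # factor n: time scaling for
                                           # n uniform particles
print(x.mean(), x.std())             # statistics drift toward those of nu
\end{verbatim}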
We consider the problem of linearizing a pseudo-Boolean function $f : \{0,1\}^n \to \mathbb{R}$ by means of $k$ Boolean functions. Such a linearization yields an integer linear programming formulation with only $k$ auxiliary variables. This motivates the definition of the linearization complexity of $f$ as the minimum such $k$. Our theoretical contributions are the proof that random polynomials almost surely have a high linearization complexity and characterizations of its value in the cases where we do or do not restrict the set of admissible Boolean functions. The practical relevance is shown by devising and evaluating integer linear programming models of two such linearizations for the low auto-correlation binary sequences problem. Still, many problems around this new concept remain open.
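A textbook instance with $k=1$ (our illustration, standard in the integer programming literature): the quadratic pseudo-Boolean function $f(x) = c\, x_1 x_2$ is linearized by the single Boolean function $y = x_1 \wedge x_2$, enforced through the linear constraints
\[
y \le x_1, \qquad y \le x_2, \qquad y \ge x_1 + x_2 - 1, \qquad y \ge 0,
\]
which force $y = x_1 x_2$ at every point of $\{0,1\}^2$, so that $f$ becomes the linear function $c\,y$.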
Given a pointed metric space $(X,\mathsf{dist}, w)$ on $n$ points, its Gromov approximating tree is a 0-hyperbolic pseudo-metric space $(X,\mathsf{dist}_T)$ such that $\mathsf{dist}(x,w)=\mathsf{dist}_T(x,w)$ and $\mathsf{dist}(x, y)-2 \delta \log_2 n \leq \mathsf{dist}_T (x, y) \leq \mathsf{dist}(x, y)$ for all $x, y \in X$, where $\delta$ is the Gromov hyperbolicity of $X$. On the other hand, the all pairs bottleneck paths (APBP) problem asks, given an undirected graph with capacities on its edges, to find the maximum path capacity between each pair of vertices. In this note, we prove:
$\bullet$ Computing Gromov's approximating tree for a metric space with $n+1$ points from its matrix of distances reduces to solving the APBP problem on a connected graph with $n$ vertices.
$\bullet$ There is an explicit algorithm that computes Gromov's approximating tree for a graph from its adjacency matrix in quadratic time.
$\bullet$ Solving the APBP problem on a weighted graph with $n$ vertices reduces to finding Gromov's approximating tree for a metric space with $n+1$ points from its distance matrix.
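For concreteness, a standard cubic-time baseline for APBP is the max-min variant of Floyd-Warshall (included only to fix notation; the point of the reductions above is precisely to inherit faster APBP algorithms):
\begin{verbatim}
def all_pairs_bottleneck(n, edges):
    """cap[u][v] = maximum over u-v paths of the minimum edge capacity,
    via a max-min variant of Floyd-Warshall in O(n^3) time."""
    NEG = float("-inf")
    cap = [[NEG] * n for _ in range(n)]
    for u in range(n):
        cap[u][u] = float("inf")
    for u, v, c in edges:              # undirected capacities
        cap[u][v] = max(cap[u][v], c)
        cap[v][u] = max(cap[v][u], c)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                cap[i][j] = max(cap[i][j], min(cap[i][k], cap[k][j]))
    return cap

# Example: the bottleneck of the path 0-1-2 is min(5, 3) = 3.
print(all_pairs_bottleneck(3, [(0, 1, 5), (1, 2, 3)])[0][2])
\end{verbatim}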
We study the problem of searching for a target at some unknown location in $\mathbb{R}^d$ when additional information regarding the position of the target is available in the form of predictions. In our setting, predictions come as approximate distances to the target: for each point $p\in \mathbb{R}^d$ that the searcher visits, we obtain a value $\lambda(p)$ such that $|p\mathbf{t}|\le \lambda(p) \le c\cdot |p\mathbf{t}|$, where $c\ge 1$ is a fixed constant, $\mathbf{t}$ is the position of the target, and $|p\mathbf{t}|$ is the Euclidean distance of $p$ to $\mathbf{t}$. The cost of the search is the length of the path followed by the searcher. Our main positive result is a strategy that achieves a $(12c)^{d+1}$-competitive ratio, even when the constant $c$ is unknown. We also give a lower bound of roughly $(c/16)^{d-1}$ on the competitive ratio of any search strategy in $\mathbb{R}^d$.
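Each visited point thus confines the target to an annulus: rearranging $|p\mathbf{t}|\le \lambda(p) \le c\cdot |p\mathbf{t}|$ gives
\[
\frac{\lambda(p)}{c} \;\le\; |p\mathbf{t}| \;\le\; \lambda(p),
\]
so a single prediction determines the distance to $\mathbf{t}$ only up to the factor $c$, while the searcher still pays for every unit of path length.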
The large-matrix limit laws of the rescaled largest eigenvalue of the orthogonal, unitary and symplectic $n$-dimensional Gaussian ensembles -- and of the corresponding Laguerre ensembles (Wishart distributions) for various regimes of the parameter $\alpha$ (degrees of freedom $p$) -- are known to be the Tracy-Widom distributions $F_\beta$ ($\beta=1,2,4$). We will establish (paying particular attention to large, or small, ratios $p/n$) that, with careful choices of the rescaling constants and the expansion parameter $h$, the limit laws embed into asymptotic expansions in powers of $h$, where $h \asymp n^{-2/3}$ resp. $h \asymp (n\,\wedge\,p)^{-2/3}$. We find explicit analytic expressions of the first few expansion terms as linear combinations, with rational polynomial coefficients, of higher order derivatives of the limit law $F_\beta$. With a proper parametrization, the expansions in the Gaussian cases can be understood, for given $n$, as the limit $p\to\infty$ of the Laguerre cases. Whereas the results for $\beta=2$ are presented with proof, the discussion of the cases $\beta=1,4$ is based on some hypotheses, focusing on the algebraic aspects of actually computing the polynomial coefficients. For the purposes of illustration and validation, the various results are checked against simulation data with large sample sizes.
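A validation-style experiment in the spirit of such simulation checks (our sketch for $\beta=2$; the matrix dimension and sample size are arbitrary choices): sample GUE matrices, rescale the largest eigenvalue, and compare the resulting statistic with $F_2$, the convergence rate being governed by $h \asymp n^{-2/3}$.
\begin{verbatim}
import numpy as np

def gue_rescaled_max(n, rng):
    """Largest eigenvalue of an n x n GUE matrix (real N(0,1) diagonal,
    off-diagonal complex variance 1), centred at 2*sqrt(n) and scaled
    by n^(1/6), so that it converges in law to Tracy-Widom F_2."""
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    H = (A + A.conj().T) / 2
    lam_max = np.linalg.eigvalsh(H)[-1]
    return n ** (1 / 6) * (lam_max - 2 * np.sqrt(n))

rng = np.random.default_rng(1)
s = np.array([gue_rescaled_max(100, rng) for _ in range(2000)])
print(s.mean(), s.std())  # compare with F_2: mean ~ -1.771, sd ~ 0.902
\end{verbatim}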
The bad science matrix problem consists in finding, among all matrices $A \in \mathbb{R}^{n \times n}$ with rows having unit $\ell^2$ norm, one that maximizes $\beta(A) = \frac{1}{2^n} \sum_{x \in \{-1, 1\}^n} \|Ax\|_\infty$. Our main contribution is an explicit construction of an $n \times n$ matrix $A$ showing that $\beta(A) \geq \sqrt{\log_2(n+1)}$, which is only 18% smaller than the asymptotic rate. We prove that every entry of any optimal matrix is a square root of a rational number, and we find provably optimal matrices for $n \leq 4$.
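For small $n$ the objective is easy to evaluate exactly (a brute-force sketch; the row normalization is applied explicitly so that arbitrary input matrices are admissible):
\begin{verbatim}
import itertools
import numpy as np

def beta(A):
    """beta(A) = average over x in {-1,1}^n of ||A x||_inf,
    computed exactly by enumerating all 2^n sign vectors."""
    A = np.asarray(A, dtype=float)
    A /= np.linalg.norm(A, axis=1, keepdims=True)  # unit l2 rows
    n = A.shape[1]
    total = sum(np.abs(A @ np.array(x)).max()
                for x in itertools.product((-1.0, 1.0), repeat=n))
    return total / 2 ** n

# Example: the identity matrix gives beta = 1 for every n.
print(beta(np.eye(3)))  # 1.0
\end{verbatim}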