In the distributional Twenty Questions game, Bob chooses a number $x$ from $1$ to $n$ according to a distribution $\mu$, and Alice (who knows $\mu$) attempts to identify $x$ using Yes/No questions, which Bob answers truthfully. Her goal is to minimize the expected number of questions. The optimal strategy for the Twenty Questions game corresponds to a Huffman code for $\mu$, yet this strategy could potentially use all $2^n$ possible questions. Dagan et al. constructed a set of $1.25^{n+o(n)}$ questions which suffices to construct an optimal strategy for all $\mu$, and showed that this number is optimal (up to sub-exponential factors) for infinitely many $n$. We determine the optimal size of such a set of questions for all $n$ (up to sub-exponential factors), answering an open question of Dagan et al. In addition, we generalize the results of Dagan et al. to the $d$-ary setting, obtaining similar results with $1.25$ replaced by $1 + (d-1)/d^{d/(d-1)}$.
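To illustrate the Huffman-code correspondence mentioned above, here is a minimal sketch (illustrative only, not the paper's construction; the function name and interface are ours) that computes the optimal expected number of questions for a given $\mu$ via Huffman's algorithm.

```python
# Minimal sketch: the optimal expected number of questions for a distribution mu
# equals the expected codeword length of a Huffman code for mu.
import heapq

def expected_questions(mu):
    """mu: list of probabilities for x = 1..n; returns E[number of questions]."""
    heap = [(p, i) for i, p in enumerate(mu)]
    heapq.heapify(heap)
    total, counter = 0.0, len(mu)
    while len(heap) > 1:
        p1, _ = heapq.heappop(heap)
        p2, _ = heapq.heappop(heap)
        total += p1 + p2              # every merge adds one level above the merged leaves
        heapq.heappush(heap, (p1 + p2, counter))
        counter += 1
    return total

print(expected_questions([0.5, 0.25, 0.125, 0.125]))  # -> 1.75
```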
Let $S$ be a set of $n$ sites in the plane, such that every site $s \in S$ has an associated radius $r_s > 0$. Let $\mathcal{D}(S)$ be the disk intersection graph defined by $S$, i.e., the graph with vertex set $S$ and an edge between two distinct sites $s, t \in S$ if and only if the disks with centers $s$, $t$ and radii $r_s$, $r_t$ intersect. Our goal is to design data structures that maintain the connectivity structure of $\mathcal{D}(S)$ as sites are inserted into or deleted from $S$.
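As a static point of reference (and not the dynamic data structure developed in this work), the edge set of $\mathcal{D}(S)$ can be computed by testing the standard intersection condition for every pair: two disks intersect exactly when the distance between their centers is at most the sum of their radii.

```python
# Quadratic baseline sketch; the point of the dynamic data structures is to
# avoid recomputing this from scratch after each insertion or deletion.
from math import dist

def disk_graph_edges(sites):
    """sites: list of ((x, y), r). Returns the edges of the disk intersection graph."""
    edges = []
    for i in range(len(sites)):
        for j in range(i + 1, len(sites)):
            (c1, r1), (c2, r2) = sites[i], sites[j]
            if dist(c1, c2) <= r1 + r2:   # disks intersect iff center distance <= r1 + r2
                edges.append((i, j))
    return edges

print(disk_graph_edges([((0, 0), 1.0), ((1.5, 0), 1.0), ((5, 0), 1.0)]))  # -> [(0, 1)]
```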
Let $G$ be an intersection graph of $n$ geometric objects in the plane. We show that a maximum matching in $G$ can be found in $O(\rho^{3\omega/2}n^{\omega/2})$ time with high probability, where $\rho$ is the density of the geometric objects and $\omega>2$ is a constant such that $n \times n$ matrices can be multiplied in $O(n^\omega)$ time. The same result holds for any subgraph of $G$, as long as a geometric representation is at hand. For this, we combine algebraic methods, namely computing the rank of a matrix via Gaussian elimination, with the fact that geometric intersection graphs have small separators. We also show that in many interesting cases, the maximum matching problem in a general geometric intersection graph can be reduced to the case of bounded density. In particular, a maximum matching in the intersection graph of any family of translates of a convex object in the plane can be found in $O(n^{\omega/2})$ time with high probability, and a maximum matching in the intersection graph of a family of planar disks with radii in $[1, \Psi]$ can be found in $O(\Psi^6\log^{11} n + \Psi^{12 \omega} n^{\omega/2})$ time with high probability.
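The algebraic ingredient referred to above can be illustrated by a standard fact, stated here in simplified form and independently of the geometric machinery (names and the choice of field are ours): with high probability, the maximum matching size of a graph equals half the rank of a randomly evaluated Tutte matrix, and that rank can be computed by Gaussian elimination.

```python
# Illustrative sketch, not the paper's full algorithm: matching size via the rank
# of a random Tutte matrix over a large prime field.
import random

P = (1 << 31) - 1  # a large prime modulus

def rank_mod_p(mat, p=P):
    """Rank of a square matrix over GF(p) via Gaussian elimination."""
    mat = [row[:] for row in mat]
    n = len(mat)
    rank, row = 0, 0
    for col in range(n):
        pivot = next((r for r in range(row, n) if mat[r][col] % p), None)
        if pivot is None:
            continue
        mat[row], mat[pivot] = mat[pivot], mat[row]
        inv = pow(mat[row][col], p - 2, p)        # modular inverse of the pivot
        for r in range(row + 1, n):
            factor = (mat[r][col] * inv) % p
            for c in range(col, n):
                mat[r][c] = (mat[r][c] - factor * mat[row][c]) % p
        rank += 1
        row += 1
    return rank

def matching_size(n, edges):
    """Maximum matching size of the graph on vertices 0..n-1 (with high probability)."""
    T = [[0] * n for _ in range(n)]
    for u, v in edges:
        x = random.randrange(1, P)
        T[u][v], T[v][u] = x, (-x) % P            # skew-symmetric random substitution
    return rank_mod_p(T) // 2

print(matching_size(4, [(0, 1), (1, 2), (2, 3)]))  # path on 4 vertices -> 2
```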
In this paper, the construction of $C^{1}$ cubic quasi-interpolants on a three-direction mesh of $\mathbb{R}^{2}$ is addressed. The quasi-interpolating splines are defined by directly setting their Bernstein-B\'{e}zier coefficients relative to each triangle from point and gradient values in order to reproduce the polynomials of the highest possible degree. Moreover, additional global properties are required. Finally, we provide some numerical tests confirming the approximation properties.
We study the complexity of a fundamental algorithm for fairly allocating indivisible items, the round-robin algorithm. For $n$ agents and $m$ items, we show that the algorithm can be implemented in time $O(nm\log(m/n))$ in the worst case. If the agents' preferences are uniformly random, we establish an improved (expected) running time of $O(nm + m\log m)$. On the other hand, assuming comparison queries between items, we prove that $\Omega(nm + m\log m)$ queries are necessary to implement the algorithm, even when randomization is allowed. We also derive bounds in noise models where the answers to queries are incorrect with some probability. Our proofs involve novel applications of tools from multi-armed bandits and information theory, as well as posets and linear extensions.
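For concreteness, here is a minimal sketch of the naive round-robin procedure itself (illustrative names; not the optimized implementations studied in the paper): agents take turns picking a most preferred remaining item, and each pick scans all remaining items, which is exactly the cost the faster implementations aim to beat.

```python
# Naive round-robin allocation: O(m) work per pick, O(m^2) overall.
def round_robin(values):
    """values[i][j]: agent i's value for item j. Returns the agents' bundles."""
    n, m = len(values), len(values[0])
    remaining = set(range(m))
    bundles = [[] for _ in range(n)]
    turn = 0
    while remaining:
        agent = turn % n
        best = max(remaining, key=lambda j: values[agent][j])  # most preferred remaining item
        bundles[agent].append(best)
        remaining.remove(best)
        turn += 1
    return bundles

print(round_robin([[3, 1, 2], [2, 3, 1]]))  # -> [[0, 2], [1]]
```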
In this paper, the key objects of interest are the sequential covariance matrices $\mathbf{S}_{n,t}$ and their largest eigenvalues. Here, the matrix $\mathbf{S}_{n,t}$ is computed as the empirical covariance associated with the observations $\{\mathbf{x}_1,\ldots,\mathbf{x}_{\lfloor nt \rfloor}\}$, for $t\in [0,1]$. The observations $\mathbf{x}_1,\ldots,\mathbf{x}_n$ are assumed to be i.i.d. $p$-dimensional vectors with zero mean and a covariance matrix that is a fixed-rank perturbation of the identity matrix. Treating $\{\mathbf{S}_{n,t}\}_{t \in [0,1]}$ as a matrix-valued stochastic process indexed by $t$, we study the behavior of the largest eigenvalues of $\mathbf{S}_{n,t}$, as $t$ varies, with $n$ and $p$ increasing simultaneously so that $p/n \to y \in (0,1)$. As a key contribution of this work, we establish the weak convergence of the stochastic process corresponding to the sample spiked eigenvalues, provided their population counterparts exceed the critical phase-transition threshold. Our analysis of the limiting process is comprehensive and reveals, in general, non-Gaussian limiting processes. As an application, we consider a class of change-point problems, where the interest is in detecting structural breaks in the covariance caused by a change in magnitude of the spiked eigenvalues. For this purpose, we propose two different maximal statistics corresponding to centered spiked eigenvalues of the sequential covariances. We show the existence of limiting null distributions for these statistics, and prove consistency of the tests under fixed alternatives. Moreover, we compare the behavior of the proposed tests through a simulation study.
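As a purely illustrative computation (the setup, normalization, and names below are assumptions of this sketch, not the paper's statistics), the largest eigenvalue of the sequential covariance can be traced along a grid of values of $t$ as follows.

```python
# Largest eigenvalue of S_{n,t} along a grid of t, for data with a rank-one spiked
# covariance; S_{n,t} is normalized by n here, which may differ from the paper's convention.
import numpy as np

def top_eigenvalue_path(X, grid):
    """X: (n, p) array of zero-mean observations; grid: values of t in (0, 1]."""
    n, p = X.shape
    path = []
    for t in grid:
        k = int(np.floor(n * t))
        if k == 0:
            path.append(0.0)
            continue
        S = X[:k].T @ X[:k] / n                  # sequential covariance up to observation k
        path.append(np.linalg.eigvalsh(S)[-1])   # largest eigenvalue
    return np.array(path)

rng = np.random.default_rng(0)
n, p = 400, 100
spike = np.ones(p); spike[0] = 5.0               # identity plus a single spiked direction
X = rng.standard_normal((n, p)) * np.sqrt(spike)
print(top_eigenvalue_path(X, np.linspace(0.1, 1.0, 10)))
```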
We study a fundamental problem in computational geometry, the planar two-center problem. In this problem, the input is a set $S$ of $n$ points in the plane, and the goal is to find two smallest congruent disks whose union contains all points of $S$. A longstanding open problem has been to obtain an $O(n\log n)$-time algorithm for planar two-center, matching the $\Omega(n\log n)$ lower bound given by Eppstein [SODA'97]. Researchers have devoted considerable effort to this problem over the decades. The previous best algorithm, given by Wang [SoCG'20], solves the problem in $O(n\log^2 n)$ time. In this paper, we present an $O(n\log n)$-time (deterministic) algorithm for planar two-center, which completely resolves this open problem.
A minimal perfect hash function (MPHF) maps a set of n keys to {1, ..., n} without collisions. Such functions find widespread application, e.g., in bioinformatics and databases. In this paper we revisit PTHash - a construction technique particularly designed for fast queries. PTHash distributes the input keys into small buckets and, for each bucket, searches for a hash function seed that places its keys in the output domain without collisions. The collection of all seeds is then stored in a compressed way. Since buckets processed earlier are easier to place (more output slots are still free), buckets are considered in non-increasing order of size. Additionally, PTHash heuristically produces an imbalanced distribution of bucket sizes by distributing 60% of the keys into 30% of the buckets. Our main contribution is to characterize, up to lower order terms, an optimal distribution of expected bucket sizes. We arrive at a simple, closed-form solution which improves construction throughput for space-efficient configurations in practice. Our second contribution is a novel encoding scheme for the seeds. We split the keys into partitions. Within each partition, we run the bucket distribution and search step. We then store the seeds in an interleaved way, consecutively placing the seeds for the i-th buckets from all partitions. The seeds for the i-th bucket of each partition follow the same statistical distribution, which allows us to tune a compressor for each bucket. Hence, we call our technique PHOBIC - Perfect Hashing with Optimized Bucket sizes and Interleaved Coding. Compared to PTHash, PHOBIC is 0.17 bits/key more space efficient for the same query time and construction throughput. We also contribute a GPU implementation to further accelerate MPHF construction. For a configuration with fast queries, PHOBIC-GPU can construct a perfect hash function at 2.17 bits/key in 28 ns per key, which can be queried in 37 ns on the CPU.
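The search step described above can be sketched as follows (a simplified, hypothetical rendering: the hash function, bucket assignment, and encoding are placeholders, not the ones used by PTHash or PHOBIC).

```python
# For each bucket, in non-increasing order of size, find a seed that places all of
# its keys into still-free output slots without collisions; seeds are returned in
# processing order (largest bucket first).
def find_seeds(buckets, n, hash_fn):
    """buckets: list of lists of keys; hash_fn(key, seed) -> int. Returns one seed per bucket."""
    taken = [False] * n
    seeds = []
    for bucket in sorted(buckets, key=len, reverse=True):
        seed = 0
        while True:
            slots = [hash_fn(k, seed) % n for k in bucket]
            if len(set(slots)) == len(slots) and not any(taken[s] for s in slots):
                break                      # collision-free placement found
            seed += 1                      # otherwise try the next seed
        for s in slots:
            taken[s] = True
        seeds.append(seed)
    return seeds

print(find_seeds([["a", "b"], ["c"]], 3, lambda k, s: hash((k, s))))
```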
A dominating set of a graph $G=(V,E)$ is a subset of vertices $S\subseteq V$ such that every vertex $v\in V\setminus S$ has at least one neighbor in $S$. The corresponding optimization problem is known to be NP-hard. The best known polynomial-time approximation algorithm for the problem separates the solution process into two stages: first, a fast greedy algorithm obtains an initial dominating set; then, an iterative procedure reduces (purifies) this dominating set. The purification stage has turned out to be practically efficient. Here we further strengthen the purification stage by presenting four new purification algorithms. All four purification procedures outperform the earlier purification procedure. The algorithms were tested on over 1300 benchmark problem instances. The obtained solutions were about 7 times smaller than the known upper bounds. Remarkably, for the 500 benchmark instances for which the optimum is known, optimal solutions were obtained for 46.33\% of the tested instances, whereas the average error on the remaining instances was about 1.01.
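For illustration only, the two-stage scheme can be sketched as a standard greedy construction followed by a single, simple purification pass that drops redundant vertices; the four purification procedures introduced in the paper are more elaborate than this.

```python
# Stage 1: greedily add the vertex covering the most undominated vertices.
# Stage 2 (purification): remove a vertex if the rest still dominates its closed neighborhood.
def greedy_then_purify(adj):
    """adj: dict vertex -> set of neighbors. Returns a dominating set."""
    vertices = set(adj)
    undominated = set(vertices)
    D = set()
    while undominated:
        v = max(vertices - D, key=lambda u: len(({u} | adj[u]) & undominated))
        D.add(v)
        undominated -= {v} | adj[v]
    for v in sorted(D, key=lambda u: len(adj[u])):
        # v is redundant if every vertex in its closed neighborhood is still dominated by D - {v}
        if all(({u} | adj[u]) & (D - {v}) for u in ({v} | adj[v])):
            D.remove(v)
    return D

adj = {1: {2, 3}, 2: {1}, 3: {1, 4}, 4: {3}}
print(greedy_then_purify(adj))   # a dominating set such as {1, 3}
```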
Let $S$ be a set of $n$ points in general position in $\mathbb{R}^d$. The order-$k$ Voronoi diagram of $S$, $V_k(S)$, is a subdivision of $\mathbb{R}^d$ into cells whose points have the same $k$ nearest points of $S$. Sibson, in his seminal paper from 1980 (A vector identity for the Dirichlet tessellation), gives a formula to express a point $Q$ of $S$ as a convex combination of other points of $S$ by using ratios of volumes of the intersection of cells of $V_2(S)$ and the cell of $Q$ in $V_1(S)$. The natural neighbour interpolation method is based on Sibson's formula. We generalize his result to express $Q$ as a convex combination of other points of $S$ by using ratios of volumes from Voronoi diagrams of any given order.
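In one common formulation (reconstructed here from the description above; the notation is ours), writing $\nu_Q$ for the cell of $Q$ in $V_1(S)$ and $\tau_{\{P,Q\}}$ for the cell of the pair $\{P,Q\}$ in $V_2(S)$, Sibson's identity reads
\[
Q \;=\; \sum_{P \in S \setminus \{Q\}} \frac{\operatorname{vol}\bigl(\tau_{\{P,Q\}} \cap \nu_Q\bigr)}{\operatorname{vol}(\nu_Q)}\, P ,
\]
and the coefficients are exactly the natural neighbour coordinates of $Q$ with respect to $S \setminus \{Q\}$, since $\tau_{\{P,Q\}} \cap \nu_Q$ is the region whose nearest site is $Q$ and second-nearest site is $P$.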
We propose an exact algorithm for the Graph Burning Problem ($\texttt{GBP}$), an NP-hard optimization problem that models the spread of influence on social networks. Given a graph $G$ with vertex set $V$, the objective is to find a sequence of $k$ vertices in $V$, namely, $v_1, v_2, \dots, v_k$, such that $k$ is minimum and $\bigcup_{i = 1}^{k} \{u\! \in\! V\! : d(u, v_i) \leq k - i\} = V$, where $d(u,v)$ denotes the distance between $u$ and $v$. We formulate the problem as a set covering integer programming model and design a row generation algorithm for the $\texttt{GBP}$. Our method exploits the fact that a very small number of covering constraints is often sufficient for solving the integer model, allowing the corresponding rows to be generated on demand. To date, the most efficient exact algorithm for the $\texttt{GBP}$, denoted here by $\texttt{GDCA}$, is able to obtain optimal solutions for graphs with up to 14,000 vertices within two hours of execution. In comparison, our algorithm finds provably optimal solutions approximately 236 times faster, on average, than $\texttt{GDCA}$. For larger graphs, memory space becomes a limiting factor for $\texttt{GDCA}$. Our algorithm, however, solves real-world instances with almost 200,000 vertices in less than 35 seconds, increasing the size of graphs for which optimal solutions are known by a factor of 14.
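A minimal sketch of the lazy, row-generation idea (an illustrative model with hypothetical names, not the exact formulation or implementation from the paper) is the decision version for a fixed $k$: solve a small model, and whenever the resulting schedule leaves some vertex unburned, add that vertex's covering constraint and re-solve; the burning number is then found by searching over $k$. Since a generated row keeps its vertex covered in all later solutions, at most $|V|$ rows are ever added.

```python
# Requires: pip install pulp networkx
import pulp
import networkx as nx

def burnable_in_k(G, k):
    """Decide whether G can be burned in k rounds; returns a schedule [(vertex, round)] or None."""
    dist = dict(nx.all_pairs_shortest_path_length(G))
    prob = pulp.LpProblem("graph_burning", pulp.LpMinimize)
    # x[v, i] = 1 iff vertex v is the source ignited in round i; it then burns
    # everything within distance k - i by round k.
    x = {(v, i): pulp.LpVariable(f"x_{v}_{i}", cat="Binary")
         for v in G for i in range(1, k + 1)}
    prob += pulp.lpSum(x.values())                       # any objective works for feasibility
    for i in range(1, k + 1):
        prob += pulp.lpSum(x[v, i] for v in G) <= 1      # at most one source per round
    while True:
        status = prob.solve(pulp.PULP_CBC_CMD(msg=False))
        if pulp.LpStatus[status] != "Optimal":
            return None                                  # G cannot be burned in k rounds
        sources = [(v, i) for (v, i), var in x.items() if var.value() > 0.5]
        burned = {u for u in G
                  if any(dist[v].get(u, float("inf")) <= k - i for v, i in sources)}
        missing = set(G) - burned
        if not missing:
            return sources                               # feasible burning schedule
        u = missing.pop()                                # add one violated covering row for u
        prob += pulp.lpSum(var for (v, i), var in x.items()
                           if dist[v].get(u, float("inf")) <= k - i) >= 1

print(burnable_in_k(nx.path_graph(5), 3))  # the burning number of P_5 is 3, so a schedule is returned
```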