Given a set $P$ of $n$ points and a set $S$ of $m$ disks in the plane, the disk coverage problem asks for a smallest subset of disks that together cover all points of $P$. The problem is NP-hard. In this paper, we consider a line-separable unit-disk version of the problem where all disks have the same radius and their centers are separated from the points of $P$ by a line $\ell$. We present an $m^{2/3}n^{2/3}2^{O(\log^*(m+n))} + O((n+m)\log (n+m))$ time algorithm for the problem. This improves the previously best result of $O(nm+ n\log n)$ time. Our techniques also solve the line-constrained version of the problem, where centers of all disks of $S$ are located on a line $\ell$ while points of $P$ can be anywhere in the plane. Our algorithm runs in $O(m\sqrt{n} + (n+m)\log(n+m))$ time, which improves the previously best result of $O(nm\log(m+n))$ time. In addition, our results lead to an algorithm of $n^{10/3}2^{O(\log^*n)}$ time for a half-plane coverage problem (given $n$ half-planes and $n$ points, find a smallest subset of half-planes covering all points); this improves the previously best algorithm of $O(n^4\log n)$ time. Further, if all half-planes are lower ones, our algorithm runs in $n^{4/3}2^{O(\log^*n)}$ time while the previously best algorithm takes $O(n^2\log n)$ time.
Posterior sampling, i.e., exponential mechanism to sample from the posterior distribution, provides $\varepsilon$-pure differential privacy (DP) guarantees and does not suffer from potentially unbounded privacy breach introduced by $(\varepsilon,\delta)$-approximate DP. In practice, however, one needs to apply approximate sampling methods such as Markov chain Monte Carlo (MCMC), thus re-introducing the unappealing $\delta$-approximation error into the privacy guarantees. To bridge this gap, we propose the Approximate SAample Perturbation (abbr. ASAP) algorithm which perturbs an MCMC sample with noise proportional to its Wasserstein-infinity ($W_\infty$) distance from a reference distribution that satisfies pure DP or pure Gaussian DP (i.e., $\delta=0$). We then leverage a Metropolis-Hastings algorithm to generate the sample and prove that the algorithm converges in W$_\infty$ distance. We show that by combining our new techniques with a careful localization step, we obtain the first nearly linear-time algorithm that achieves the optimal rates in the DP-ERM problem with strongly convex and smooth losses.
A cubic hypermatrix of order $d$ can be considered as a structure matrix of a tensor with covariant order $r$ and contra-variant order $s=d-r$. Corresponding to this matrix expression of the hypermatrix, an eigenvector $x$ with respect to an eigenvalue $\lambda$ is proposed, called the universal eigenvector and eigenvalue of the hypermatrix. According to the action of tensors, if $x$ is decomposable, it is called a universal hyper-(UH-)eigenvector. Particularly, if all decomposed components are the same, $x$ is called a universal diagonal hyper (UDH-)eigenvector, which covers most of existing definitions of eigenvalue/eigenvector of hypermatrices. Using Semi-tensor product (STP) of matrices, the properties of universal eigenvalues/eigenvectors are investigated. Algorithms are developed to calculate universal eigenvalues/eigenvectors for hypermatrices. Particular efforts have been put on UDH- eigenvalues/eigenvectors, because they cover most of the existing eigenvalues/eigenvectors for hypermatrices. Some numerical examples are presented to illustrate that the proposed technique is universal and efficient.
Constructing a similarity graph from a set $X$ of data points in $\mathbb{R}^d$ is the first step of many modern clustering algorithms. However, typical constructions of a similarity graph have high time complexity, and a quadratic space dependency with respect to $|X|$. We address this limitation and present a new algorithmic framework that constructs a sparse approximation of the fully connected similarity graph while preserving its cluster structure. Our presented algorithm is based on the kernel density estimation problem, and is applicable for arbitrary kernel functions. We compare our designed algorithm with the well-known implementations from the scikit-learn library and the FAISS library, and find that our method significantly outperforms the implementation from both libraries on a variety of datasets.
We consider the distributionally robust optimization (DRO) problem with spectral risk-based uncertainty set and $f$-divergence penalty. This formulation includes common risk-sensitive learning objectives such as regularized condition value-at-risk (CVaR) and average top-$k$ loss. We present Prospect, a stochastic gradient-based algorithm that only requires tuning a single learning rate hyperparameter, and prove that it enjoys linear convergence for smooth regularized losses. This contrasts with previous algorithms that either require tuning multiple hyperparameters or potentially fail to converge due to biased gradient estimates or inadequate regularization. Empirically, we show that Prospect can converge 2-3$\times$ faster than baselines such as stochastic gradient and stochastic saddle-point methods on distribution shift and fairness benchmarks spanning tabular, vision, and language domains.
A new approach to analyzing intrinsic properties of the Josephus function, $J_{_k}$, is presented in this paper. The linear structure between extreme points of $J_{_k}$ is fully revealed, leading to the design of an efficient algorithm for evaluating $J_{_k}(n)$. Algebraic expressions that describe how recursively compute extreme points, including fixed points, are derived. The existence of consecutive extreme and also fixed points for all $k\geq 2$ is proven as a consequence, which generalizes Knuth result for $k=2$. Moreover, an extensive comparative numerical experiment is conducted to illustrate the performance of the proposed algorithm for evaluating the Josephus function compared to established algorithms. The results show that the proposed scheme is highly effective in computing $J_{_k}(n)$ for large inputs.
We study the classic Text-to-Pattern Hamming Distances problem: given a pattern $P$ of length $m$ and a text $T$ of length $n$, both over a polynomial-size alphabet, compute the Hamming distance between $P$ and $T[i\, .\, . \, i+m-1]$ for every shift $i$, under the standard Word-RAM model with $\Theta(\log n)$-bit words. - We provide an $O(n\sqrt{m})$ time Las Vegas randomized algorithm for this problem, beating the decades-old $O(n \sqrt{m \log m})$ running time [Abrahamson, SICOMP 1987]. We also obtain a deterministic algorithm, with a slightly higher $O(n\sqrt{m}(\log m\log\log m)^{1/4})$ running time. Our randomized algorithm extends to the $k$-bounded setting, with running time $O\big(n+\frac{nk}{\sqrt{m}}\big)$, removing all the extra logarithmic factors from earlier algorithms [Gawrychowski and Uzna\'{n}ski, ICALP 2018; Chan, Golan, Kociumaka, Kopelowitz and Porat, STOC 2020]. - For the $(1+\epsilon)$-approximate version of Text-to-Pattern Hamming Distances, we give an $\tilde{O}(\epsilon^{-0.93}n)$ time Monte Carlo randomized algorithm, beating the previous $\tilde{O}(\epsilon^{-1}n)$ running time [Kopelowitz and Porat, FOCS 2015; Kopelowitz and Porat, SOSA 2018]. Our approximation algorithm exploits a connection with $3$SUM, and uses a combination of Fredman's trick, equality matrix product, and random sampling; in particular, we obtain new results on approximate counting versions of $3$SUM and Exact Triangle, which may be of independent interest. Our exact algorithms use a novel combination of hashing, bit-packed FFT, and recursion; in particular, we obtain a faster algorithm for computing the sumset of two integer sets, in the regime when the universe size is close to quadratic in the number of elements. We also prove a fine-grained equivalence between the exact Text-to-Pattern Hamming Distances problem and a range-restricted, counting version of $3$SUM.
A universal partial cycle (or upcycle) for $\mathcal{A}^n$ is a cyclic sequence that covers each word of length $n$ over the alphabet $\mathcal{A}$ exactly once -- like a De Bruijn cycle, except that we also allow a wildcard symbol $\mathord{\diamond}$ that can represent any letter of $\mathcal{A}$. Chen et al. in 2017 and Goeckner et al. in 2018 showed that the existence and structure of upcycles are highly constrained, unlike those of De Bruijn cycles, which exist for any alphabet size and word length. Moreover, it was not known whether any upcycles existed for $n \ge 5$. We present several examples of upcycles over both binary and non-binary alphabets for $n = 8$. We generalize two graph-theoretic representations of De Bruijn cycles to upcycles. We then introduce novel approaches to constructing new upcycles from old ones. Notably, given any upcycle for an alphabet of size $a$, we show how to construct an upcycle for an alphabet of size $ak$ for any $k \in \mathbb{N}$, so each example generates an infinite family of upcycles. We also define folds and lifts of upcycles, which relate upcycles with differing densities of $\mathord{\diamond}$ characters. In particular, we show that every upcycle lifts to a De Bruijn cycle. Our constructions rely on a different generalization of De Bruijn cycles known as perfect necklaces, and we introduce several new examples of perfect necklaces. We extend the definitions of certain pseudorandomness properties to partial words and determine which are satisfied by all upcycles, then draw a conclusion about linear feedback shift registers. Finally, we prove new nonexistence results based on the word length $n$, alphabet size, and $\mathord{\diamond}$ density.
For a fixed set ${\cal H}$ of graphs, a graph $G$ is ${\cal H}$-subgraph-free if $G$ does not contain any $H \in {\cal H}$ as a (not necessarily induced) subgraph. A recently proposed framework gives a complete classification on ${\cal H}$-subgraph-free graphs (for finite sets ${\cal H}$) for problems that are solvable in polynomial time on graph classes of bounded treewidth, NP-complete on subcubic graphs, and whose NP-hardness is preserved under edge subdivision. While a lot of problems satisfy these conditions, there are also many problems that do not satisfy all three conditions and for which the complexity ${\cal H}$-subgraph-free graphs is unknown. In this paper, we study problems for which only the first two conditions of the framework hold (they are solvable in polynomial time on classes of bounded treewidth and NP-complete on subcubic graphs, but NP-hardness is not preserved under edge subdivision). In particular, we make inroads into the classification of the complexity of four such problems: $k$-Induced Disjoint Paths, $C_5$-Colouring, Hamilton Cycle and Star $3$-Colouring. Although we do not complete the classifications, we show that the boundary between polynomial time and NP-complete differs among our problems and differs from problems that do satisfy all three conditions of the framework. Hence, we exhibit a rich complexity landscape among problems for ${\cal H}$-subgraph-free graph classes.
For an undirected unweighted graph $G=(V,E)$ with $n$ vertices and $m$ edges, let $d(u,v)$ denote the distance from $u\in V$ to $v\in V$ in $G$. An $(\alpha,\beta)$-stretch approximate distance oracle (ADO) for $G$ is a data structure that given $u,v\in V$ returns in constant (or near constant) time a value $\hat d (u,v)$ such that $d(u,v) \le \hat d (u,v) \le \alpha\cdot d(u,v) + \beta$, for some reals $\alpha >1, \beta$. If $\beta = 0$, we say that the ADO has stretch $\alpha$. Thorup and Zwick~\cite{thorup2005approximate} showed that one cannot beat stretch 3 with subquadratic space (in terms of $n$) for general graphs. P\v{a}tra\c{s}cu and Roditty~\cite{patrascu2010distance} showed that one can obtain stretch 2 using $O(m^{1/3}n^{4/3})$ space, and so if $m$ is subquadratic in $n$ then the space usage is also subquadratic. Moreover, P\v{a}tra\c{s}cu and Roditty~\cite{patrascu2010distance} showed that one cannot beat stretch 2 with subquadratic space even for graphs where $m=\tilde{O}(n)$, based on the set-intersection hypothesis. In this paper we explore the conditions for which an ADO can be stored using subquadratic space while supporting a sub-2 stretch. In particular, we show that if the maximum degree in $G$ is $\Delta_G \leq O(n^{1/2-\varepsilon})$ for some $0<\varepsilon \leq 1/2$, then there exists an ADO for $G$ that uses $\tilde{O}(n^{2-\frac {2\varepsilon}{3}})$ space and has a sub-2 stretch. Moreover, we prove a conditional lower bound, based on the set intersection hypothesis, which states that for any positive integer $k \leq \log n$, obtaining a sub-$\frac{k+2}{k}$ stretch for graphs with maximum degree $\Theta(n^{1/k})$ requires quadratic space. Thus, for graphs with maximum degree $\Theta(n^{1/2})$, obtaining a sub-2 stretch requires quadratic space.
Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.