Given a set $P$ of $n$ weighted points and a set $S$ of $m$ disks in the plane, the hitting set problem is to compute a subset $P'$ of points of $P$ such that each disk contains at least one point of $P'$ and the total weight of the points of $P'$ is minimized. The problem is known to be NP-hard. In this paper, we consider a line-constrained version of the problem in which all disks are centered on a line $\ell$. We present an $O((m+n)\log(m+n)+\kappa \log m)$ time algorithm for the problem, where $\kappa$ is the number of pairs of disks that intersect. For the unit-disk case, where all disks have the same radius, the running time can be reduced to $O((n + m)\log(m + n))$. In addition, we solve the problem in $O((m + n)\log(m + n))$ time in the $L_{\infty}$ and $L_1$ metrics, in which a disk is a square and a diamond, respectively. Our techniques can also be used to solve other geometric hitting set problems. For example, given in the plane a set $P$ of $n$ weighted points and a set $S$ of $n$ half-planes, we solve in $O(n^4\log n)$ time the problem of finding a minimum-weight hitting set of $P$ for $S$. This improves the previously best known $O(n^6)$-time algorithm by nearly a quadratic factor.
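To make the problem statement concrete, here is a minimal Python sketch (not the algorithm of this paper) that solves tiny instances of the weighted hitting set problem for disks by brute-force enumeration; the data layout and names are illustrative.

```python
from itertools import combinations

def min_weight_hitting_set(points, weights, disks):
    """Brute-force minimum-weight hitting set for tiny instances.

    points : list of (x, y) tuples
    weights: list of positive weights, one per point
    disks  : list of (cx, cy, r) tuples; centers may lie on a common line y = 0
    """
    def hits_all(subset):
        # every disk must contain at least one chosen point
        return all(
            any((points[i][0] - cx) ** 2 + (points[i][1] - cy) ** 2 <= r ** 2
                for i in subset)
            for (cx, cy, r) in disks
        )

    best, best_w = None, float("inf")
    for size in range(len(points) + 1):
        for subset in combinations(range(len(points)), size):
            w = sum(weights[i] for i in subset)
            if w < best_w and hits_all(subset):
                best, best_w = subset, w
    return best, best_w

# Example: two unit disks centered on the x-axis; the middle point alone hits both.
pts = [(0.0, 0.5), (1.5, 0.0), (3.0, -0.3)]
wts = [2.0, 1.0, 2.0]
dks = [(0.5, 0.0, 1.0), (2.5, 0.0, 1.0)]
print(min_weight_hitting_set(pts, wts, dks))   # ((1,), 1.0)
```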
Randomized subspace approximation with "matrix sketching" is an effective approach for constructing approximate partial singular value decompositions (SVDs) of large matrices. The performance of such techniques has been extensively analyzed, and very precise estimates on the distribution of the residual errors have been derived. However, our understanding of the accuracy of the computed singular vectors (measured in terms of the canonical angles between the spaces spanned by the exact and the computed singular vectors, respectively) remains relatively limited. In this work, we present bounds and estimates for the canonical angles of randomized subspace approximation that can be computed efficiently either a priori or a posteriori. Under moderate oversampling in the randomized SVD, our a priori probabilistic bounds are asymptotically tight and can be computed efficiently, while providing clear insight into the balance between oversampling and power iterations given a fixed budget on the number of matrix-vector multiplications. Numerical experiments demonstrate the empirical effectiveness of these canonical angle bounds and estimates on different matrices under various algorithmic choices for the randomized SVD.
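For readers who want to reproduce the setting, the sketch below runs a basic randomized SVD with oversampling and power iterations and then measures the canonical angles between the exact and computed left singular subspaces via scipy.linalg.subspace_angles; the test matrix and parameter choices are illustrative, and the bounds from this work are not implemented here.

```python
import numpy as np
from scipy.linalg import subspace_angles

def randomized_svd(A, k, p=10, q=1, seed=None):
    """Basic randomized SVD with oversampling p and q power iterations."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, k + p))     # Gaussian test matrix
    Y = A @ Omega
    for _ in range(q):                          # power iterations sharpen spectral decay
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)                      # orthonormal basis of the sketch
    B = Q.T @ A
    Uh, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ Uh[:, :k], s[:k], Vt[:k, :]

# Compare the computed left singular subspace with the exact one via canonical angles.
rng = np.random.default_rng(0)
A = rng.standard_normal((500, 200)) @ np.diag(np.logspace(0, -6, 200)) @ rng.standard_normal((200, 200))
k = 10
U_exact = np.linalg.svd(A, full_matrices=False)[0][:, :k]
U_rand, _, _ = randomized_svd(A, k, p=10, q=1, seed=1)
print(np.rad2deg(subspace_angles(U_exact, U_rand)))   # canonical angles in degrees
```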
Many algorithms that exactly solve hard problems require branching on more or less complex structures in order to make progress. Those who design such algorithms often find themselves carrying out a meticulous analysis of numerous different cases in order to identify these structures and design suitable branching rules, all by hand. This process tends to be error-prone, and the resulting algorithms are often difficult to implement in practice. In this work, we aim to automate a part of this process and focus on the simplicity of the resulting implementation. We showcase our approach on the following problem. For a constant $d$, the $d$-Path Vertex Cover problem ($d$-PVC) is as follows: given an undirected graph and an integer $k$, find a subset of at most $k$ vertices of the graph such that their deletion results in a graph not containing a path on $d$ vertices as a subgraph. We develop a fully automated framework to generate parameterized branching algorithms for the problem and obtain algorithms outperforming those previously known for $3 \le d \le 8$. For example, we show that $5$-PVC can be solved in $O(2.7^k\cdot n^{O(1)})$ time.
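As a point of reference for the kind of algorithm being generated, the following is a minimal, unoptimized $d^k$-branching baseline for $d$-PVC (find a path on $d$ vertices, branch on deleting each of its vertices); it is only an illustration, not one of the automatically generated algorithms from this work.

```python
def find_path(adj, d):
    """Return a list of d vertices forming a path in the graph, or None."""
    def dfs(path, visited):
        if len(path) == d:
            return list(path)
        for w in adj[path[-1]]:
            if w not in visited:
                visited.add(w); path.append(w)
                res = dfs(path, visited)
                if res:
                    return res
                path.pop(); visited.remove(w)
        return None
    for v in adj:
        res = dfs([v], {v})
        if res:
            return res
    return None

def d_pvc(adj, d, k):
    """Decide whether deleting at most k vertices destroys all paths on d vertices."""
    path = find_path(adj, d)
    if path is None:
        return True
    if k == 0:
        return False
    for v in path:   # branch: some vertex of the found path must be deleted
        rest = {u: [w for w in nbrs if w != v] for u, nbrs in adj.items() if u != v}
        if d_pvc(rest, d, k - 1):
            return True
    return False

# Path on 5 vertices: deleting the middle vertex already destroys every path on 3 vertices.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(d_pvc(adj, 3, 0), d_pvc(adj, 3, 1))   # False True
```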
We study the online variant of the Min-Sum Set Cover (MSSC) problem, a generalization of the well-known list update problem. In the MSSC problem, an algorithm has to maintain a time-varying permutation of a list of $n$ elements and serve a sequence of requests $R_1, R_2, \dots, R_t, \dots$. Each $R_t$ is a subset of elements of cardinality at most $r$. For a requested set $R_t$, an online algorithm has to pay a cost equal to the position of the first element of $R_t$ on its list. Then, it may arbitrarily permute its list, paying a cost equal to the number of swapped adjacent element pairs. We present the first constructive deterministic algorithm for this problem whose competitive ratio does not depend on $n$. Our algorithm is $O(r^2)$-competitive, which beats both the existential upper bound of $O(r^4)$ by Bienkowski and Mucha [AAAI '23] and the previous constructive bound of $O(r^{3/2} \cdot \sqrt{n})$ by Fotakis et al. [ICALP '20]. Furthermore, we show that our algorithm attains an asymptotically optimal competitive ratio of $O(r)$ when compared to the best fixed permutation of elements.
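The cost model can be illustrated with a short sketch: serving a request costs the (1-based) position of its first element on the list, and reordering costs the number of swapped adjacent pairs. The move-to-front rule used below is only a naive illustration of the model, not the $O(r^2)$-competitive algorithm of this work.

```python
def serve(perm, request):
    """Return the access cost of a request, the movement cost, and the new list
    after naively moving the first hit element to the front."""
    pos = min(perm.index(e) for e in request)        # first element of the request on the list
    access_cost = pos + 1                            # 1-based position
    hit = perm[pos]
    new_perm = [hit] + perm[:pos] + perm[pos + 1:]   # move it to the front
    move_cost = pos                                  # number of swapped adjacent pairs
    return access_cost, move_cost, new_perm

perm = [1, 2, 3, 4, 5]
total = 0
for request in [{4, 5}, {5}, {2, 3}]:
    a, m, perm = serve(perm, request)
    total += a + m
print(perm, total)   # [2, 5, 4, 1, 3] 23
```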
In this paper we investigate the existence of subexponential parameterized algorithms for three fundamental cycle-hitting problems in geometric graph classes. The considered problems, \textsc{Triangle Hitting} (TH), \textsc{Feedback Vertex Set} (FVS), and \textsc{Odd Cycle Transversal} (OCT), ask for the existence in a graph $G$ of a set $X$ of at most $k$ vertices such that $G-X$ is, respectively, triangle-free, acyclic, or bipartite. Such subexponential parameterized algorithms are known to exist in planar and even $H$-minor-free graphs from bidimensionality theory [Demaine et al., JACM 2005], and there is a recent line of work lifting these results to geometric graph classes consisting of intersection graphs of "fat" objects ([Grigoriev et al., FOCS 2022] and [Lokshtanov et al., SODA 2022]). In this paper we focus on "thin" objects by considering intersection graphs of segments in the plane with $d$ possible slopes ($d$-DIR graphs) and contact graphs of segments in the plane. Assuming the ETH, we rule out the existence of algorithms: - solving TH in time $2^{o(n)}$ in 2-DIR graphs; and - solving TH, FVS, and OCT in time $2^{o(\sqrt{n})}$ in $K_{2,2}$-free contact 2-DIR graphs. These results indicate that additional restrictions are necessary in order to obtain subexponential parameterized algorithms for these problems. In this direction we provide: - a $2^{O(k^{3/4}\cdot \log k)}n^{O(1)}$-time algorithm for FVS in contact segment graphs; - a $2^{O(\sqrt d\cdot t^2 \log t\cdot k^{2/3}\log k)} n^{O(1)}$-time algorithm for TH in $K_{t,t}$-free $d$-DIR graphs; and - a $2^{O(k^{7/9}\log^{3/2}k)} n^{O(1)}$-time algorithm for TH in contact segment graphs.
Consider a mechanism that cannot observe how many players there are directly, but instead must rely on their self-reports to know how many are participating. Suppose the players can create new identities to report to the auctioneer at some cost $c$. The usual mechanism design paradigm is equivalent to implicitly assuming that $c$ is infinite for all players, while the usual Sybil-attacks literature implicitly assumes that it is zero or finite for one player (the attacker) and infinite for everyone else (the 'honest' players). The false-name-proofness literature largely assumes the cost to be zero. We consider a model with variable costs that unifies these disparate streams. A paradigmatic normal-form game can be extended into a Sybil game by taking the action space to be the product of the feasible set of identities to create and the original action space, so that each player chooses how many players to present as in the game together with the actions those identities play in the original normal-form game. A mechanism is (dominant) false-name-proof if it is (dominant) incentive-compatible for all players to self-report as at most one identity. We study mechanisms proposed in the literature motivated by settings where anonymity and self-identification are the norms, and show conditions under which they are not Sybil-proof. We characterize a class of dominant Sybil-proof mechanisms for reward sharing and show that they achieve the efficiency upper bound. We consider the extension in which agents can credibly commit to the strategy of their Sybils and show how this can break mechanisms that would otherwise be false-name-proof.
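A toy numeric example of the phenomenon, assuming an equal-split reward-sharing rule (illustrative only, not one of the mechanisms characterized in this work): when the identity-creation cost $c$ is small, reporting an extra identity strictly increases a player's utility, so equal splitting is not Sybil-proof.

```python
def equal_split_utility(my_identities, other_identities, reward, c):
    """Utility of a player who reports `my_identities` identities when the reward
    is split equally among all reported identities and each extra identity costs c."""
    total = my_identities + other_identities
    return reward * my_identities / total - c * (my_identities - 1)

reward, c = 6.0, 0.5
honest = equal_split_utility(1, 1, reward, c)   # 6/2           = 3.0
sybil  = equal_split_utility(2, 1, reward, c)   # 6*2/3 - 0.5   = 3.5
print(honest, sybil)   # creating a second identity is profitable here, so the rule is not Sybil-proof
```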
In this paper, we study the maximum clique problem on hyperbolic random graphs. The hyperbolic random graph is a mathematical model for analyzing scale-free networks, as it effectively explains their power-law degree distribution. We propose a simple algorithm for finding a maximum clique in a hyperbolic random graph. We first analyze the running time of our algorithm theoretically: we can compute a maximum clique on a hyperbolic random graph $G$ in $O(m + n^{4.5(1-\alpha)})$ expected time if a geometric representation is given, or in $O(m + n^{6(1-\alpha)})$ expected time if a geometric representation is not given, where $n$ and $m$ denote the numbers of vertices and edges of $G$, respectively, and $\alpha$ denotes a parameter controlling the power-law exponent of the degree distribution of $G$. We also implemented and evaluated our algorithm empirically. Our algorithm outperforms the previous algorithm [BFK18] both theoretically and in practice. Beyond hyperbolic random graphs, we also ran experiments on real-world networks; for most instances, we efficiently obtain large cliques close to the optimal solutions.
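For experimentation, one can sample the standard hyperbolic random graph model directly and inspect clique sizes; the sketch below (using networkx for maximal-clique enumeration) illustrates the model only, and is not the algorithm analyzed in this paper.

```python
import math, random
import networkx as nx

def hyperbolic_random_graph(n, alpha, R, seed=0):
    """Sample the standard model: radial density ~ sinh(alpha*r) on [0, R],
    uniform angles, and an edge iff the hyperbolic distance is at most R."""
    rnd = random.Random(seed)
    pts = []
    for _ in range(n):
        u = rnd.random()
        r = math.acosh(1 + (math.cosh(alpha * R) - 1) * u) / alpha   # inverse-CDF sampling
        pts.append((r, rnd.uniform(0, 2 * math.pi)))
    G = nx.Graph()
    G.add_nodes_from(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            r1, t1 = pts[i]; r2, t2 = pts[j]
            cosh_d = (math.cosh(r1) * math.cosh(r2)
                      - math.sinh(r1) * math.sinh(r2) * math.cos(t1 - t2))
            if math.acosh(max(cosh_d, 1.0)) <= R:
                G.add_edge(i, j)
    return G

n = 200
G = hyperbolic_random_graph(n, alpha=0.75, R=2 * math.log(n))   # power-law exponent 2*alpha + 1
best = max(nx.find_cliques(G), key=len)   # exact maximum clique by enumerating maximal cliques
print(len(best), best)
```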
We consider geometric problems on planar $n^2$-point sets in the congested clique model. Initially, each node in the $n$-clique network holds a batch of $n$ distinct points in the Euclidean plane given by $O(\log n)$-bit coordinates. In each round, each node can send a distinct $O(\log n)$-bit message to each other node in the clique and perform unlimited local computations. We show that the convex hull of the input $n^2$-point set can be constructed in $O(\min\{ h,\log n\})$ rounds, where $h$ is the size of the hull, on the congested clique. We also show that a triangulation of the input $n^2$-point set can be constructed in $O(\log^2n)$ rounds on the congested clique. Finally, we demonstrate that the Voronoi diagram of $n^2$ points with $O(\log n)$-bit coordinates drawn uniformly at random from a unit square can be computed within the square with high probability in $O(1)$ rounds on the congested clique.
We consider the problem of approximating a $d \times d$ covariance matrix $M$ by a rank-$k$ matrix under $(\varepsilon,\delta)$-differential privacy. We present and analyze a complex variant of the Gaussian mechanism and show that the Frobenius norm of the difference between the matrix output by this mechanism and the best rank-$k$ approximation to $M$ is bounded by roughly $\tilde{O}(\sqrt{kd})$, whenever there is an appropriately large gap between the $k$-th and the $(k+1)$-st eigenvalues of $M$. This improves on previous work that requires the gap between every pair of the top $k$ eigenvalues of $M$ to be at least $\sqrt{d}$ for a similar bound. Our analysis leverages the fact that the eigenvalues of complex matrix Brownian motion repel more strongly than in the real case, and uses Dyson's stochastic differential equations governing the evolution of its eigenvalues to show that the eigenvalues of the matrix $M$ perturbed by complex Gaussian noise have large gaps with high probability. Our results contribute to the analysis of low-rank approximations under average-case perturbations and to an understanding of eigenvalue gaps for random matrices, which may be of independent interest.
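As a rough illustration of the mechanism's structure (not its exact calibration or privacy analysis), the sketch below perturbs a symmetric covariance matrix with Hermitian complex Gaussian noise scaled as in the standard Gaussian mechanism and projects onto the top-$k$ eigenspace of the noisy matrix; the sensitivity value is an assumption supplied by the caller, and this should not be used as a drop-in differentially private implementation.

```python
import numpy as np

def dp_rank_k_approx_complex(M, k, eps, delta, sensitivity, seed=None):
    """Sketch: add Hermitian complex Gaussian noise to M, then project M onto the
    top-k eigenspace of the noisy matrix to obtain a rank-<=k approximation."""
    rng = np.random.default_rng(seed)
    d = M.shape[0]
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps   # standard Gaussian-mechanism scale
    Z = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    H = (Z + Z.conj().T) / 2.0                      # Hermitian complex Gaussian noise
    noisy = M.astype(complex) + sigma * H
    vals, vecs = np.linalg.eigh(noisy)              # eigenvalues in ascending order
    top = vecs[:, -k:]                              # top-k eigenvectors of the noisy matrix
    P = top @ top.conj().T                          # projector onto that subspace
    return P @ M @ P

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5))
M = A @ A.T                                         # rank-5 covariance-like matrix
approx = dp_rank_k_approx_complex(M, k=5, eps=1.0, delta=1e-5, sensitivity=1.0, seed=1)
print(np.linalg.norm(M - approx, "fro"))
```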
A general, {\em rectangular} kernel matrix may be defined as $K_{ij} = \kappa(x_i,y_j)$, where $\kappa(x,y)$ is a kernel function and where $X=\{x_i\}_{i=1}^m$ and $Y=\{y_j\}_{j=1}^n$ are two sets of points. In this paper, we seek a low-rank approximation to a kernel matrix where the sets of points $X$ and $Y$ are large and are arbitrarily distributed, for example away from each other, ``intermingled'', or identical. Such rectangular kernel matrices may arise, for example, in Gaussian process regression where $X$ corresponds to the training data and $Y$ corresponds to the test data. In this case, the points are often high-dimensional. Since the point sets are large, we must exploit the fact that the matrix arises from a kernel function and avoid forming the matrix, which rules out most algebraic techniques. In particular, we seek methods that can scale linearly or nearly linearly with respect to the size of the data for a fixed approximation rank. The main idea in this paper is to {\em geometrically} select appropriate subsets of points to construct a low-rank approximation. An analysis in this paper guides how this selection should be performed.
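One concrete instance of such a geometric selection (illustrative, not necessarily the selection rule analyzed in this paper) is a skeleton/cross approximation whose landmark points are chosen by farthest-point sampling, so that only thin slices of the kernel matrix are ever formed; the full matrix is assembled below only to measure the error of the small demo.

```python
import numpy as np

def farthest_point_sample(P, k, seed=0):
    """Greedy geometric selection: repeatedly pick the point farthest from those chosen."""
    rng = np.random.default_rng(seed)
    idx = [int(rng.integers(len(P)))]
    dist = np.linalg.norm(P - P[idx[0]], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))
        idx.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(P - P[nxt], axis=1))
    return np.array(idx)

def kernel(X, Y, length=1.0):
    """Gaussian kernel matrix kappa(x, y) = exp(-||x - y||^2 / (2 * length^2))."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * length ** 2))

def cross_approximation(X, Y, k):
    """Rank-<=k skeleton approximation K(X, Y) ~ K(X, Ys) pinv(K(Xs, Ys)) K(Xs, Y)
    using geometrically selected landmark subsets Xs and Ys."""
    ix, iy = farthest_point_sample(X, k), farthest_point_sample(Y, k)
    C = kernel(X, Y[iy])            # m x k slice
    W = kernel(X[ix], Y[iy])        # k x k core
    R = kernel(X[ix], Y)            # k x n slice
    return C @ np.linalg.pinv(W) @ R

rng = np.random.default_rng(0)
X = rng.standard_normal((800, 3))          # e.g. training points
Y = rng.standard_normal((600, 3)) + 4.0    # test points, well separated from X
K = kernel(X, Y)                           # formed only to measure the error here
K_approx = cross_approximation(X, Y, k=30)
print(np.linalg.norm(K - K_approx) / np.linalg.norm(K))
```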
Quantization is commonly used in Deep Neural Networks (DNNs) to reduce storage and computational complexity by decreasing the arithmetic precision of activations and weights, a.k.a. tensors. Efficient hardware architectures employ linear quantization to enable the deployment of recent DNNs onto embedded systems and mobile devices. However, linear uniform quantization usually cannot reduce the numerical precision below 8 bits without sacrificing model accuracy. This accuracy loss is due to the fact that tensors do not follow uniform distributions. In this paper, we show that a significant number of tensors fit an exponential distribution. We then propose DNA-TEQ, which exponentially quantizes DNN tensors with an adaptive scheme that achieves the best trade-off between numerical precision and accuracy loss. The experimental results show that DNA-TEQ provides a much lower quantization bit-width than previous proposals, resulting in an average compression ratio of 40% over the linear INT8 baseline, with negligible accuracy loss and without retraining the DNNs. In addition, DNA-TEQ performs dot-product operations directly in the exponential domain, which saves 66% of energy consumption on average for a set of widely used DNNs.
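A generic exponential quantizer conveys the core idea: values are represented by a sign and a small integer exponent of a base adapted to the tensor's dynamic range. The sketch below is illustrative only and does not reproduce the exact DNA-TEQ scheme or its search for the optimal base and offset.

```python
import numpy as np

def exp_quantize(x, bits=4, base=None):
    """Quantize values to sign * scale * base**e with an integer exponent e in
    [-(2**bits - 1), 0]; if no base is given, fit it to the tensor's dynamic range."""
    sign = np.sign(x)
    mag = np.abs(x)
    nz = mag[mag > 0]
    levels = 2 ** bits - 1
    scale = nz.max()
    if base is None:
        # adaptive base: spread the exponent levels geometrically over the dynamic range
        base = (scale / nz.min()) ** (1.0 / levels)
    safe = np.where(mag > 0, mag, scale)                 # avoid log(0); zeros restored below
    e = np.clip(np.round(np.log(safe / scale) / np.log(base)), -levels, 0)
    deq = np.where(mag > 0, sign * scale * base ** e, 0.0)
    return deq, base

rng = np.random.default_rng(0)
w = rng.laplace(scale=0.05, size=10000)                  # exponential-ish weight distribution
w_q, b = exp_quantize(w, bits=4)
err = np.linalg.norm(w - w_q) / np.linalg.norm(w)
print(f"base={b:.3f}, relative error={err:.3f}")
```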