In this paper, we consider the \emph{planar two-center problem}: Given a set $S$ of $n$ points in the plane, the goal is to find two smallest congruent disks whose union contains all points of $S$. We present an $O(n\log n)$-time algorithm for the planar two-center problem. This matches the best known lower bound of $\Omega(n\log n)$ and improves upon the previously best known algorithm, which takes $O(n\log^2 n)$ time.
We consider the capacitated clustering problem in general metric spaces where the goal is to identify $k$ clusters and minimize the sum of the radii of the clusters (we call this the Capacitated-$k$-sumRadii problem). We are interested in fixed-parameter tractable (FPT) approximation algorithms where the running time is of the form $f(k) \cdot \text{poly}(n)$, where $f(k)$ can be an exponential function of $k$ and $n$ is the number of points in the input. In the uniform capacity case, Bandyapadhyay et al. recently gave a $4$-approximation algorithm for this problem. Our first result improves this to an FPT $3$-approximation and extends to a constant factor approximation for any $L_p$ norm of the cluster radii. In the general capacities version, Bandyapadhyay et al. gave an FPT $15$-approximation algorithm. We extend their framework to give an FPT $(4 + \sqrt{13})$-approximation algorithm for this problem. Our framework relies on a novel idea of identifying approximations to optimal clusters by carefully pruning points from an initial candidate set of points. This is in contrast to prior results that rely on guessing suitable points and building balls of appropriate radii around them. On the hardness front, we show that assuming the Exponential Time Hypothesis, there is a constant $c > 1$ such that any $c$-approximation algorithm for the non-uniform capacity version of this problem requires running time $2^{\Omega\left(\frac{k}{\mathrm{polylog}(k)}\right)}$.
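For concreteness, the uniform-capacity objective can be written as follows (a sketch of a standard formulation with notation we introduce here: point set $P$, cluster centers $c_i$ chosen from $P$, cluster radii $r_i$, and a common capacity $U$; the paper's exact formulation may differ):
\[
\min_{\substack{P = C_1 \cup \cdots \cup C_k \\ c_1,\ldots,c_k \in P}} \; \sum_{i=1}^{k} \underbrace{\max_{x \in C_i} d(x, c_i)}_{r_i}
\qquad \text{subject to } |C_i| \le U \text{ for all } i,
\]
with the sum replaced by $\bigl(\sum_{i=1}^k r_i^p\bigr)^{1/p}$ in the $L_p$-norm variant.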
Let $\mathbf S \in \mathbb R^{n \times n}$ satisfy $\|\mathbf 1-\mathbf S\|_2\le\epsilon n$, where $\mathbf 1$ is the all ones matrix and $\|\cdot\|_2$ is the spectral norm. It is well-known that there exists such an $\mathbf S$ with just $O(n/\epsilon^2)$ non-zero entries: we can let $\mathbf S$ be the scaled adjacency matrix of a Ramanujan expander graph. We show that such an $\mathbf S$ yields a \emph{universal sparsifier} for any positive semidefinite (PSD) matrix. In particular, for any PSD $\mathbf A \in \mathbb{R}^{n\times n}$ with entries bounded in magnitude by $1$, $\|\mathbf A - \mathbf A\circ\mathbf S\|_2 \le \epsilon n$, where $\circ$ denotes the entrywise (Hadamard) product. Our techniques also give universal sparsifiers for non-PSD matrices. In this case, letting $\mathbf S$ be the scaled adjacency matrix of a Ramanujan graph with $\tilde O(n/\epsilon^4)$ edges, we have $\|\mathbf A - \mathbf A \circ \mathbf S \|_2 \le \epsilon \cdot \max(n,\|\mathbf A\|_1)$, where $\|\mathbf A\|_1$ is the nuclear norm. We show that the above bounds for both PSD and non-PSD matrices are tight up to log factors. Since $\mathbf A \circ \mathbf S$ can be constructed deterministically, our result for PSD matrices derandomizes and improves upon known results for randomized matrix sparsification, which require randomly sampling $O(\frac{n \log n}{\epsilon^2})$ entries. We also leverage our results to give the first deterministic algorithms for several problems related to singular value approximation that run in faster than matrix multiplication time. Finally, if $\mathbf A \in \{-1,0,1\}^{n \times n}$ is PSD, we show that $\mathbf{\tilde A}$ with $\|\mathbf A - \mathbf{\tilde A}\|_2 \le \epsilon n$ can be obtained by deterministically reading $\tilde O(n/\epsilon)$ entries of $\mathbf A$. This improves the $\epsilon$ dependence of our result for general PSD matrices and is near-optimal.
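As a small numerical illustration of the guarantee being claimed (not of the construction: the sketch below uses a random sampling mask with roughly $n/\epsilon^2$ nonzeros in place of a Ramanujan expander, so it is only a randomized stand-in), one can check $\|\mathbf A - \mathbf A\circ\mathbf S\|_2$ against $\epsilon n$ for a bounded-entry PSD matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n, eps = 500, 0.25

# Bounded-entry PSD matrix A = G G^T with unit-norm rows of G, so |A_ij| <= 1.
G = rng.standard_normal((n, 20))
G /= np.linalg.norm(G, axis=1, keepdims=True)
A = G @ G.T

# Random symmetric mask with roughly n/eps^2 nonzeros in expectation;
# E[S] equals the all-ones matrix.
p = min(1.0, 1.0 / (eps**2 * n))
S = (rng.random((n, n)) < p) / p
S = np.triu(S) + np.triu(S, 1).T

err = np.linalg.norm(A - A * S, 2)            # spectral norm of the difference
print(f"error {err:.1f} vs eps*n = {eps * n:.1f}, "
      f"nonzero fraction {np.mean(S > 0):.3f}")
```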
Given a closed simple polygon $P$, we say two points $p,q$ see each other if the segment $pq$ is fully contained in $P$. The art gallery problem seeks a minimum size set $G\subset P$ of guards that sees $P$ completely. The only currently correct algorithm to solve the art gallery problem exactly uses algebraic methods and is attributed to Sharir. As the art gallery problem is $\exists\mathbb{R}$-complete, it seems unlikely that algebraic methods can be avoided without additional assumptions. In this paper, we introduce the notion of vision stability. To describe vision stability, consider an enhanced guard that can see "around the corner" by an angle of $\delta$ or a diminished guard whose vision is "blocked" by reflex vertices by an angle of $\delta$. A polygon $P$ has vision stability $\delta$ if the optimal number of enhanced guards to guard $P$ is the same as the optimal number of diminished guards to guard $P$. We will argue that most relevant polygons are vision stable. We describe a one-shot vision stable algorithm that computes an optimal guard set for vision stable polygons using polynomial time and solving a single integer program. It is guaranteed to find an optimal solution for every vision stable polygon. We implemented an iterative vision stable algorithm and show that its practical performance is slower than, but comparable with, other state-of-the-art algorithms. Our iterative algorithm is inspired by and closely follows the one-shot algorithm. It delays several steps and only computes them when deemed necessary. Given a chord $c$ of a polygon, we denote by $n(c)$ the number of vertices visible from $c$. The chord-width of a polygon is the maximum $n(c)$ over all possible chords $c$. The set of vision stable polygons admits an FPT algorithm when parameterized by the chord-width. Furthermore, the one-shot algorithm runs in FPT time when parameterized by the number of reflex vertices.
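To make the role of the integer program concrete, a much-simplified set-cover-style formulation (our own sketch, with a finite candidate guard set $G$ and finite witness set $W$ as assumed discretizations, not the exact program solved by the one-shot algorithm) is:
\[
\min \sum_{g \in G} x_g
\quad \text{s.t.} \quad \sum_{\substack{g \in G \\ g \text{ sees } w}} x_g \;\ge\; 1 \ \ \forall w \in W,
\qquad x_g \in \{0,1\} \ \ \forall g \in G.
\]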
This work concerns elementwise-transformations of spiked matrices: $Y_n = n^{-1/2} f( \sqrt{n} X_n + Z_n)$. Here, $f$ is a function applied elementwise, $X_n$ is a low-rank signal matrix, and $Z_n$ is white noise. We find that principal component analysis is powerful for recovering signal under highly nonlinear or discontinuous transformations. Specifically, in the high-dimensional setting where $Y_n$ is of size $n \times p$ with $n,p \rightarrow \infty$ and $p/n \rightarrow \gamma > 0$, we uncover a phase transition: for signal-to-noise ratios above a sharp threshold -- depending on $f$, the distribution of elements of $Z_n$, and the limiting aspect ratio $\gamma$ -- the principal components of $Y_n$ (partially) recover those of $X_n$. Below this threshold, the principal components of $Y_n$ are asymptotically orthogonal to the signal. In contrast, in the standard setting where $X_n + n^{-1/2}Z_n$ is observed directly, the analogous phase transition depends only on $\gamma$. A similar phenomenon occurs with $X_n$ square and symmetric and $Z_n$ a generalized Wigner matrix.
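The phase transition is easy to probe numerically; the sketch below is our own illustration (rank-one signal, Gaussian noise, and $f=\operatorname{sign}$ are assumptions made for the example) and reports the overlap between the top right singular vector of $Y_n$ and the signal direction:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 2000, 1000                        # aspect ratio gamma = p/n = 0.5
f = np.sign                              # a discontinuous elementwise transformation

def overlap(theta):
    # Rank-one signal X = theta * u v^T with unit-norm factors; Z is white noise.
    u = rng.standard_normal(n); u /= np.linalg.norm(u)
    v = rng.standard_normal(p); v /= np.linalg.norm(v)
    X = theta * np.outer(u, v)
    Z = rng.standard_normal((n, p))
    Y = f(np.sqrt(n) * X + Z) / np.sqrt(n)
    top_v = np.linalg.svd(Y, full_matrices=False)[2][0]
    return abs(top_v @ v)                # near 0 below threshold, bounded away above

for theta in (0.2, 1.0, 5.0):
    print(f"theta = {theta}: overlap = {overlap(theta):.2f}")
```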
We are given a set $\mathcal{Z}=\{(R_1,s_1),\ldots, (R_n,s_n)\}$, where each $R_i$ is a \emph{range} in $\Re^d$, such as a rectangle or a ball, and $s_i \in [0,1]$ denotes its \emph{selectivity}. The goal is to compute a small-size \emph{discrete data distribution} $\mathcal{D}=\{(q_1,w_1),\ldots, (q_m,w_m)\}$, where $q_j\in \Re^d$ and $w_j\in [0,1]$ for each $1\leq j\leq m$, and $\sum_{1\leq j\leq m}w_j= 1$, such that $\mathcal{D}$ is the most \emph{consistent} with $\mathcal{Z}$, i.e., $\mathrm{err}_p(\mathcal{D},\mathcal{Z})=\frac{1}{n}\sum_{i=1}^n\! \lvert{s_i-\sum_{j=1}^m w_j\cdot 1(q_j\in R_i)}\rvert^p$ is minimized. In a database setting, $\mathcal{Z}$ corresponds to a workload of range queries over some table, together with their observed selectivities (i.e., fraction of tuples returned), and $\mathcal{D}$ can be used as a compact model for approximating the data distribution within the table without accessing the underlying contents. In this paper, we obtain both upper and lower bounds for this problem. In particular, we show that the problem of finding the best data distribution from selectivity queries is $\mathsf{NP}$-complete. On the positive side, we describe a Monte Carlo algorithm that constructs, in time $O((n+\delta^{-d})\delta^{-2}\mathop{\mathrm{polylog}})$, a discrete distribution $\tilde{\mathcal{D}}$ of size $O(\delta^{-2})$, such that $\mathrm{err}_p(\tilde{\mathcal{D}},\mathcal{Z})\leq \min_{\mathcal{D}}\mathrm{err}_p(\mathcal{D},\mathcal{Z})+\delta$ (for $p=1,2,\infty$) where the minimum is taken over all discrete distributions. We also establish conditional lower bounds, which strongly indicate the infeasibility of relative approximations, as well as of removing the exponential dependency on the dimension for additive approximations. This suggests that significant improvements to our algorithm are unlikely.
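For intuition, evaluating the consistency objective for a candidate discrete distribution is direct; the sketch below (restricted to axis-aligned rectangle ranges and finite $p$, assumptions made only for the example) computes $\mathrm{err}_p(\mathcal D,\mathcal Z)$:

```python
import numpy as np

def err_p(points, weights, rects, sels, p=1):
    """err_p(D, Z) for axis-aligned rectangle ranges and finite p.

    points : (m, d) array, support of the discrete distribution D
    weights: (m,) array, nonnegative weights summing to 1
    rects  : (n, 2, d) array, each range given by (lower corner, upper corner)
    sels   : (n,) array, observed selectivities s_i
    """
    total = 0.0
    for (lo, hi), s in zip(rects, sels):
        inside = np.all((points >= lo) & (points <= hi), axis=1)
        total += abs(s - weights[inside].sum()) ** p
    return total / len(rects)

# Tiny example: two points in the unit square, two rectangle queries.
D_pts = np.array([[0.2, 0.2], [0.8, 0.8]]); D_w = np.array([0.5, 0.5])
Z_rects = np.array([[[0.0, 0.0], [0.5, 0.5]], [[0.0, 0.0], [1.0, 1.0]]])
Z_sels = np.array([0.4, 1.0])
print(err_p(D_pts, D_w, Z_rects, Z_sels, p=1))   # (|0.4 - 0.5| + |1.0 - 1.0|) / 2 = 0.05
```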
We prove that a polynomial fraction of the set of $k$-component forests in the $m \times n$ grid graph have equal numbers of vertices in each component, for any constant $k$. This resolves a conjecture of Charikar, Liu, Liu, and Vuong, and establishes the first provably polynomial-time algorithm for (exactly or approximately) sampling balanced grid graph partitions according to the spanning tree distribution, which weights each $k$-partition according to the product, across its $k$ pieces, of the number of spanning trees of each piece. Our result follows from a careful analysis of the probability that a uniformly random spanning tree of the grid can be cut into balanced pieces. Beyond grids, we show that for a broad family of lattice-like graphs, we achieve balance to within a multiplicative $(1 \pm \varepsilon)$ factor for any constant $\varepsilon > 0$ with constant probability, and up to an additive constant with polynomial probability. More generally, we show that, with constant probability, components derived from uniform spanning trees can approximate any given partition of a planar region specified by Jordan curves. These results imply polynomial-time algorithms for sampling approximately balanced tree-weighted partitions for lattice-like graphs. Our results have applications to understanding political districtings, where there is an underlying graph of indivisible geographic units that must be partitioned into $k$ population-balanced connected subgraphs. In this setting, tree-weighted partitions have interesting geometric properties, and this has stimulated significant effort to develop methods to sample them.
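As a brute-force illustration of the quantity being analyzed, the sketch below estimates, for $k=2$, how often a uniform spanning tree of a small grid has an edge whose removal splits the vertices exactly in half (NetworkX's random_spanning_tree is used for the sampling; the grid size and trial count are arbitrary choices for the example):

```python
import networkx as nx

def has_balanced_cut(tree, n_vertices):
    # k = 2 case: is there a tree edge whose removal leaves two equal halves?
    for u, v in list(tree.edges()):
        tree.remove_edge(u, v)
        half = len(nx.node_connected_component(tree, u))
        tree.add_edge(u, v)
        if 2 * half == n_vertices:
            return True
    return False

G = nx.grid_2d_graph(6, 6)                    # 6 x 6 grid graph, 36 vertices
trials = 200
hits = sum(has_balanced_cut(nx.random_spanning_tree(G), G.number_of_nodes())
           for _ in range(trials))            # unweighted => uniform spanning trees
print(f"estimated probability of an exactly balanced 2-split: {hits / trials:.2f}")
```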
In this paper, we study the problem of estimating the normalizing constant $\int e^{-\lambda f(x)}dx$ through queries to the black-box function $f$, where $f$ belongs to a reproducing kernel Hilbert space (RKHS) and $\lambda$ is a problem parameter. We show that to estimate the normalizing constant within a small relative error, the level of difficulty depends on the value of $\lambda$: when $\lambda$ approaches zero, the problem is similar to Bayesian quadrature (BQ), while when $\lambda$ approaches infinity, the problem is similar to Bayesian optimization (BO). More generally, for intermediate values of $\lambda$, the problem interpolates between BQ and BO. We find that this pattern holds true even when the function evaluations are noisy, bringing new aspects to this topic. Our findings are supported by both algorithm-independent lower bounds and algorithmic upper bounds, as well as simulation studies conducted on a variety of benchmark functions.
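A minimal numerical sketch of the quantity being estimated (our own illustration on $[0,1]$, with a simple benchmark $f$ and plain grid quadrature standing in for the algorithms studied in the paper); for a fixed query budget, the relative error tends to grow with $\lambda$ as the integrand concentrates around the minimizer of $f$:

```python
import numpy as np

f = lambda x: (x - 0.3) ** 2                 # smooth benchmark function on [0, 1]

def Z(lam, n_queries):
    # Estimate int_0^1 exp(-lam * f(x)) dx from n_queries evaluations of f
    # on a uniform grid (a simple quadrature-style estimator).
    x = np.linspace(0.0, 1.0, n_queries)
    return np.exp(-lam * f(x)).mean()

for lam in (0.1, 100.0, 1e5):
    Z_ref, Z_hat = Z(lam, 10**6), Z(lam, 50)  # dense grid as ground truth, 50 queries
    print(f"lambda = {lam:>8}: relative error {abs(Z_hat - Z_ref) / Z_ref:.2e}")
```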
We consider the performance of a least-squares regression model, as judged by out-of-sample $R^2$. Shapley values give a fair attribution of the performance of a model to its input features, taking into account interdependencies between features. Evaluating the Shapley values exactly requires solving a number of regression problems that is exponential in the number of features, so a Monte Carlo-type approximation is typically used. We focus on the special case of least-squares regression models, where several tricks can be used to compute and evaluate regression models efficiently. These tricks give a substantial speed-up, allowing many more Monte Carlo samples to be evaluated and thereby achieving better accuracy. We refer to our method as least-squares Shapley performance attribution (LS-SPA), and describe our open-source implementation.
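For orientation, a bare-bones version of the Monte Carlo permutation estimator that LS-SPA accelerates might look as follows (our own sketch, refitting with numpy's lstsq and assuming centered data so the empty-set baseline $R^2$ is zero; it includes none of the computational tricks from the paper):

```python
import numpy as np

def r2_oos(X_tr, y_tr, X_te, y_te, feats):
    # Out-of-sample R^2 of a least-squares fit on a subset of features.
    if not feats:
        return 0.0                              # empty model: predict 0 (centered data)
    beta = np.linalg.lstsq(X_tr[:, feats], y_tr, rcond=None)[0]
    resid = y_te - X_te[:, feats] @ beta
    return 1.0 - resid @ resid / (y_te @ y_te)

def shapley_r2(X_tr, y_tr, X_te, y_te, n_perms=200, seed=0):
    rng = np.random.default_rng(seed)
    d = X_tr.shape[1]
    attr = np.zeros(d)
    for _ in range(n_perms):
        perm = rng.permutation(d)
        prev, chosen = 0.0, []
        for j in perm:
            chosen.append(j)
            cur = r2_oos(X_tr, y_tr, X_te, y_te, chosen)
            attr[j] += cur - prev               # marginal contribution of feature j
            prev = cur
    return attr / n_perms                       # attributions sum to the full-model R^2
```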
While existing work in robust deep learning has focused on small pixel-level $\ell_p$ norm-based perturbations, this may not account for perturbations encountered in several real-world settings. In many such cases, although test data might not be available, broad specifications about the types of perturbations (such as an unknown degree of rotation) may be known. We consider a setup where robustness is expected over an unseen test domain that is not i.i.d. but deviates from the training domain. While this deviation may not be exactly known, its broad characterization is specified a priori, in terms of attributes. We propose an adversarial training approach which learns to generate new samples so as to maximize exposure of the classifier to the attribute space, without having access to the data from the test domain. Our adversarial training solves a min-max optimization problem, with the inner maximization generating adversarial perturbations, and the outer minimization finding model parameters by optimizing the loss on the adversarial perturbations generated by the inner maximization. We demonstrate the applicability of our approach on three types of naturally occurring perturbations -- object-related shifts, geometric transformations, and common image corruptions. Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations. We demonstrate the usefulness of the proposed approach by showing the robustness gains of deep neural networks trained using our adversarial training on MNIST, CIFAR-10, and a new variant of the CLEVR dataset.
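A hedged sketch of such a min-max loop for a single attribute (rotation), with a grid search standing in for the paper's learned adversarial generator and standard torchvision rotation; the model, optimizer, and angle range are placeholders:

```python
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF

def worst_case_rotation(model, x, y, angles=range(-30, 31, 10)):
    # Inner maximization: choose the rotation angle that maximizes the loss.
    model.eval()
    with torch.no_grad():
        losses = [F.cross_entropy(model(TF.rotate(x, float(a))), y) for a in angles]
    best = int(torch.stack(losses).argmax())
    return TF.rotate(x, float(list(angles)[best]))

def adversarial_train_step(model, optimizer, x, y):
    # Outer minimization: gradient step on the loss of the worst-case rotated batch.
    x_adv = worst_case_rotation(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```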
Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc., and 2) the instance-level shift, such as object appearance, size, etc. We build our approach on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, at the image level and the instance level, to reduce the domain discrepancy. The two domain adaptation components are based on $\mathcal{H}$-divergence theory, and are implemented by learning a domain classifier in an adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets, including Cityscapes, KITTI, and SIM10K. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.
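Adversarial domain classifiers of this kind are commonly implemented with a gradient reversal layer; the following is a minimal sketch of that piece only (our own simplified version with assumed feature dimensions, not the paper's full Faster R-CNN integration):

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips and scales gradients in the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class ImageLevelDomainClassifier(nn.Module):
    # Predicts source vs. target domain from backbone feature maps.
    def __init__(self, in_channels=1024, lam=0.1):
        super().__init__()
        self.lam = lam
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 256, kernel_size=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(256, 1))

    def forward(self, features):
        return self.net(GradReverse.apply(features, self.lam))

# Training sketch: add a BCEWithLogitsLoss on this output (label 0 = source,
# 1 = target) to the detection loss; the reversed gradient pushes the backbone
# toward domain-invariant features.
```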