It is well known that the NP-hard Set Cover problem can be approximated within a factor of ln(n), and it was recently proven that this approximation ratio is optimal unless P = NP. This optimality result is the product of many advances in characterizations of NP, in terms of interactive proof systems and probabilistically checkable proofs (PCP), and of improvements to the analyses thereof. As a result, however, it is difficult to extract the development of Set Cover approximation bounds from the broader scope of proof-system analysis. This paper presents a chronological progression of results on lower-bounding the approximation ratio of Set Cover. We analyze a series of proofs of progressively stronger bounds and unify the results under a common terminology and framework to provide an accurate comparison of proof techniques and their results. We also treat many preliminary results as black boxes to better focus our analysis on the core reductions to Set Cover instances. The result is alternative versions of several hardness proofs, beginning with the initial inapproximability results and culminating in a version of the proof that ln(n) is a tight lower bound.
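To make the starting point concrete, the ln(n)-factor algorithm referenced above is the classical greedy algorithm, which repeatedly selects the subset covering the most uncovered elements. A minimal sketch (illustrative names, not taken from the paper):

```python
def greedy_set_cover(universe, subsets):
    """Greedy Set Cover: achieves the ln(n) approximation ratio."""
    uncovered = set(universe)
    cover = []
    while uncovered:
        # Pick the subset covering the most still-uncovered elements.
        best = max(subsets, key=lambda s: len(uncovered & s))
        if not (uncovered & best):
            raise ValueError("subsets do not cover the universe")
        cover.append(best)
        uncovered -= best
    return cover

universe = range(1, 8)
subsets = [{1, 2, 3}, {3, 4, 5}, {5, 6, 7}, {1, 4, 7}]
print(greedy_set_cover(universe, subsets))  # a cover of size 3
```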
Recent approaches in the theoretical analysis of model-based deep learning architectures have studied the convergence of gradient descent in shallow ReLU networks that arise from generative models with sparse hidden layers. Motivated by the success of architectures that impose structured forms of sparsity, we introduce and study a group-sparse autoencoder that accounts for a variety of generative models and uses a group-sparse ReLU activation function to force the non-zero units at a given layer to occur in blocks. For clustering models, inputs that activate the same group of units belong to the same cluster. We then analyze the gradient dynamics of a shallow instance of the proposed autoencoder, trained on data adhering to a group-sparse generative model. In this setting, we prove that the network parameters converge to a neighborhood of the generating matrix. We validate our model through numerical experiments and highlight the superior performance of networks with a group-sparse ReLU over networks with traditional ReLUs, both in sparse coding and in parameter recovery tasks. We also provide real-data experiments to corroborate the simulated results, and emphasize the clustering capabilities of structured sparsity models.
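The abstract does not spell out the activation itself, so the following is only one plausible instantiation of a group-sparse ReLU, sketched under the assumption that it keeps the $k$ groups with the largest post-activation energy and zeroes the rest; the grouping, $k$, and all names are illustrative.

```python
import numpy as np

def group_sparse_relu(z, groups, k=1):
    """Apply ReLU, then retain only the k groups with the largest
    post-activation energy, so non-zero units occur in blocks."""
    a = np.maximum(z, 0.0)                       # standard ReLU
    energies = [np.linalg.norm(a[g]) for g in groups]
    keep = np.argsort(energies)[-k:]             # indices of the k strongest groups
    out = np.zeros_like(a)
    for i in keep:
        out[groups[i]] = a[groups[i]]
    return out

z = np.array([0.5, -0.2, 1.3, 0.1, -0.7, 0.9])
groups = [np.arange(0, 2), np.arange(2, 4), np.arange(4, 6)]
print(group_sparse_relu(z, groups, k=1))  # only one block stays active
```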
We consider the problem of finding tuned regularized parameter estimators for linear models. We start by showing that three known optimal linear estimators belong to a wider class of estimators that can be formulated as solutions to a weighted and constrained minimization problem. The optimal weights, however, are typically unknown in many applications. This raises the question: how should we choose the weights using only the data? We propose using the covariance-fitting SPICE methodology to obtain data-adaptive weights and show that the resulting class of estimators yields tuned versions of known regularized estimators, such as ridge regression, LASSO, and regularized least absolute deviation. These theoretical results unify several important estimators under a common umbrella. The resulting tuned estimators are also shown to be practically relevant by means of a number of numerical examples.
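For reference, here is a minimal sketch of one member of the class of regularized estimators mentioned above, ridge regression with a fixed weight `lam`; the paper's data-adaptive SPICE weighting is what replaces this manual choice and is not reproduced here.

```python
import numpy as np

def ridge(X, y, lam):
    """Ridge regression: argmin ||y - X b||^2 + lam ||b||^2 (closed form)."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
beta = np.array([1.0, 0.0, -2.0, 0.0, 0.5])
y = X @ beta + 0.1 * rng.standard_normal(50)
print(ridge(X, y, lam=1.0))  # shrunken estimate of beta
```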
Hamiltonian cycles in graphs were first studied in the 1850s. Since then, an impressive amount of research has been dedicated to identifying classes of graphs that admit Hamiltonian cycles, and to related questions. The corresponding decision problem, which asks whether a given graph is Hamiltonian (i.\,e.\ admits a Hamiltonian cycle), is one of Karp's famous NP-complete problems. In this paper we study graphs of bounded degree that are \emph{far} from being Hamiltonian, where a graph $G$ on $n$ vertices is \emph{far} from being Hamiltonian if modifying a constant fraction of $n$ edges is necessary to make $G$ Hamiltonian. We give an explicit deterministic construction of a class of graphs of bounded degree that are locally Hamiltonian, but (globally) far from being Hamiltonian. Here, \emph{locally Hamiltonian} means that every subgraph induced by the neighbourhood of a small vertex set appears in some Hamiltonian graph. More precisely, we obtain graphs which differ in $\Theta(n)$ edges from any Hamiltonian graph, but whose non-Hamiltonicity cannot be detected in the neighbourhood of $o(n)$ vertices. Our class of graphs yields a class of hard instances for one-sided-error property testers with linear query complexity. It is known that any property tester (even with two-sided error) requires a linear number of queries to test Hamiltonicity (Yoshida and Ito, 2010); this is proved via a randomised construction of hard instances. In contrast, our construction is deterministic. So far, only very few deterministic constructions of hard instances for property testing are known. We believe that our construction may lead to future insights in graph theory and towards a characterisation of the properties that are testable in the bounded-degree model.
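For intuition about the decision problem mentioned above, here is a brute-force Hamiltonicity check (exponential time, for illustration only; it has no connection to the paper's construction).

```python
from itertools import permutations

def is_hamiltonian(n, edges):
    """Check all vertex orderings for a Hamiltonian cycle (O(n!) time)."""
    adj = {(u, v) for u, v in edges} | {(v, u) for u, v in edges}
    for perm in permutations(range(1, n)):        # fix vertex 0 as the start
        cycle = (0,) + perm
        if all((cycle[i], cycle[(i + 1) % n]) in adj for i in range(n)):
            return True
    return False

print(is_hamiltonian(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # True: C_4
print(is_hamiltonian(4, [(0, 1), (1, 2), (2, 3)]))          # False: a path
```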
Continuous determinantal point processes (DPPs) are a class of repulsive point processes on $\mathbb{R}^d$ with many statistical applications. Although an explicit expression for their density is known, it is too complicated to be used directly for maximum likelihood estimation. In the stationary case, an approximation using Fourier series has been suggested, but it is limited to rectangular observation windows and no theoretical results support it. In this contribution, we investigate a different way to approximate the likelihood by looking at its asymptotic behaviour as the observation window grows towards $\mathbb{R}^d$. This new approximation is not limited to rectangular windows, is faster to compute than the previous one, does not require any tuning parameter, and comes with theoretical justifications. Moreover, it provides an explicit formula for estimating the asymptotic variance of the associated estimator. The performance is assessed in a simulation study on standard parametric models on $\mathbb{R}^d$ and compares favourably to common alternative estimation methods for continuous DPPs.
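As background (this is not the paper's method): for a \emph{finite} L-ensemble DPP the likelihood has the closed form $P(A) = \det(L_A)/\det(L+I)$; the continuous analogue replaces these with operator determinants, which is what makes direct maximum likelihood impractical and motivates the approximation studied above.

```python
import numpy as np

def dpp_log_likelihood(L, A):
    """Exact log-likelihood of observing index set A under a finite
    L-ensemble DPP: log det(L_A) - log det(L + I)."""
    LA = L[np.ix_(A, A)]
    _, logdet_LA = np.linalg.slogdet(LA)
    _, logdet_LI = np.linalg.slogdet(L + np.eye(L.shape[0]))
    return logdet_LA - logdet_LI

L = np.array([[1.0, 0.3], [0.3, 0.8]])
print(dpp_log_likelihood(L, [0]))  # log P({0})
```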
This paper studies the spectral estimation problem of estimating the locations of a fixed number of point sources, given multiple snapshots of Fourier measurements collected by a uniform array of sensors. We prove novel stability bounds for MUSIC and ESPRIT as a function of the noise standard deviation, the number of snapshots, the source amplitudes, and the support. Our most general result is a perturbation bound on the signal space in terms of the minimum singular value of Fourier matrices. When the point sources are located in several separated clumps, we provide an explicit upper bound on the noise-space correlation perturbation error in MUSIC and on the support error in ESPRIT, in terms of a Super-Resolution Factor (SRF). The upper bound for ESPRIT is then compared with a new Cram\'er-Rao lower bound for the clumps model. As a result, we show that the performance of ESPRIT is comparable to that of the optimal unbiased estimator(s) in terms of the dependence on noise, number of snapshots, and SRF. As a byproduct of our analysis, we discover several fundamental differences between the single-snapshot and multi-snapshot problems. Our theory is validated by numerical experiments.
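A minimal sketch of the MUSIC noise-subspace scan for a uniform array, as analyzed above; the steering-vector convention, the frequency grid, and all names are illustrative.

```python
import numpy as np

def music_spectrum(Y, S, grid):
    """MUSIC pseudospectrum from snapshots Y (sensors x snapshots),
    assuming S sources; peaks indicate estimated source locations."""
    M, T = Y.shape
    R = Y @ Y.conj().T / T                        # sample covariance
    _, eigvecs = np.linalg.eigh(R)                # eigenvalues ascending
    U_noise = eigvecs[:, : M - S]                 # noise subspace
    m = np.arange(M)
    spec = np.empty(len(grid))
    for i, f in enumerate(grid):
        a = np.exp(-2j * np.pi * m * f) / np.sqrt(M)   # steering vector
        spec[i] = 1.0 / np.linalg.norm(U_noise.conj().T @ a) ** 2
    return spec

rng = np.random.default_rng(0)
M, T, f_true = 16, 200, np.array([0.10, 0.35])
A = np.exp(-2j * np.pi * np.outer(np.arange(M), f_true))
X = rng.standard_normal((2, T)) + 1j * rng.standard_normal((2, T))
Y = A @ X + 0.05 * (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T)))
grid = np.linspace(0, 1, 2000, endpoint=False)
print("strongest peak near:", grid[np.argmax(music_spectrum(Y, S=2, grid=grid))])
```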
A \emph{general branch-and-bound tree} is a branch-and-bound tree which is allowed to use general disjunctions of the form $\pi^{\top} x \leq \pi_0 \,\vee\, \pi^{\top}x \geq \pi_0 + 1$, where $\pi$ is an integer vector and $\pi_0$ is an integer scalar, to create child nodes. We construct a packing instance, a set covering instance, and a Traveling Salesman Problem instance such that any general branch-and-bound tree that solves these instances must have exponential size. We also verify that an exponential lower bound on the size of general branch-and-bound trees persists when we add Gaussian noise to the coefficients of the cross polytope, thus showing that a polynomial-size "smoothed analysis" upper bound is not possible. The results in this paper can be viewed as the branch-and-bound analog of the seminal paper by Chv\'atal et al. \cite{chvatal1989cutting}, who proved lower bounds for the Chv\'atal-Gomory rank.
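To fix ideas, here is a sketch of a single branching step on a general disjunction; a node is represented only by its list of side constraints, the LP solving is omitted, and choosing $\pi$ well is the entire difficulty. Variable branching is the special case $\pi = e_i$.

```python
import math

def branch_on_disjunction(node_constraints, pi, x_frac):
    """Create the two children given by pi^T x <= pi0  OR  pi^T x >= pi0 + 1,
    where pi0 = floor(pi^T x_frac); assumes pi^T x_frac is fractional, so
    both children exclude the current LP solution x_frac."""
    val = sum(p * x for p, x in zip(pi, x_frac))
    pi0 = math.floor(val)
    left = node_constraints + [(pi, "<=", pi0)]
    right = node_constraints + [(pi, ">=", pi0 + 1)]
    return left, right

left, right = branch_on_disjunction([], pi=[1, 1], x_frac=[0.5, 0.7])
print(left)   # [([1, 1], '<=', 1)]
print(right)  # [([1, 1], '>=', 2)]
```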
For an integer $k \geq 1$ and a graph $G$, let $\mathcal{K}_k(G)$ be the graph whose vertex set is the set of all proper $k$-colorings of $G$, with an edge between two colorings $\alpha$ and~$\beta$ whenever $\beta$ can be obtained from $\alpha$ by a single Kempe change. A theorem of Meyniel from 1978 states that $\mathcal{K}_5(G)$ is connected with diameter $O(5^{|V(G)|})$ for every planar graph $G$. We significantly strengthen this result by showing that there is a positive constant $c$ such that $\mathcal{K}_5(G)$ has diameter $O(|V(G)|^c)$ for every planar graph $G$.
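A minimal sketch of one Kempe change, the elementary move that defines the edges of $\mathcal{K}_k(G)$: swap two colors on the connected component of a vertex in the subgraph induced by those two color classes (which always yields another proper coloring).

```python
from collections import deque

def kempe_change(adj, coloring, v, b):
    """Swap colors a = coloring[v] and b on the Kempe chain containing v."""
    a = coloring[v]
    if a == b:
        return dict(coloring)
    new = dict(coloring)
    component, queue = {v}, deque([v])
    while queue:                                  # BFS within colors {a, b}
        u = queue.popleft()
        for w in adj[u]:
            if w not in component and new[w] in (a, b):
                component.add(w)
                queue.append(w)
    for u in component:
        new[u] = a if new[u] == b else b          # swap the two colors
    return new

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
coloring = {0: 0, 1: 1, 2: 2, 3: 0}
print(kempe_change(adj, coloring, v=3, b=1))      # {0: 0, 1: 1, 2: 2, 3: 1}
```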
We prove upper and lower bounds on the minimal spherical dispersion, improving upon previous estimates obtained by Rote and Tichy [Spherical dispersion with an application to polygonal approximation of curves, Anz. \"Osterreich. Akad. Wiss. Math.-Natur. Kl. 132 (1995), 3--10]. In particular, we show that the inverse $N(\varepsilon,d)$ of the minimal spherical dispersion is, for fixed $\varepsilon>0$, linear in the dimension $d$ of the ambient space. We also derive upper and lower bounds on the expected dispersion of points chosen independently and uniformly at random from the Euclidean unit sphere. In terms of the corresponding inverse $\widetilde{N}(\varepsilon,d)$, our bounds are optimal with respect to the dependence on $\varepsilon$.
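For the random-point setting above, the following is a crude Monte Carlo proxy (illustration only, and an assumption-laden one): sample points uniformly on the sphere via normalized Gaussians and scan random cap centers for the largest empty cap, measured here by its angular radius.

```python
import numpy as np

rng = np.random.default_rng(1)

def uniform_sphere(n, d):
    """n points uniform on S^{d-1}, via normalized Gaussian vectors."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

points = uniform_sphere(200, d=3)
centers = uniform_sphere(20000, d=3)              # candidate cap centers
# For each center, the largest empty cap around it extends to the nearest
# sample point; the dispersion proxy is the worst case over centers.
cos_nearest = (centers @ points.T).max(axis=1)
print("empty-cap angular radius ~", np.arccos(cos_nearest).max())
```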
In this paper, we prove a local limit theorem for the chi-square distribution with $r > 0$ degrees of freedom and noncentrality parameter $\lambda \geq 0$. We use it to develop refined normal approximations for the survival function. Our maximal errors are of order $r^{-2}$, significantly smaller than the maximal error bounds of order $r^{-1/2}$ recently found by Horgan & Murphy (2013) and Seri (2015). Our results allow us to drastically reduce the number of observations required to obtain negligible errors in the energy detection problem: from $250$, as recommended in the seminal work of Urkowitz (1967), to only $8$ with our new approximations. We also obtain upper bounds on several probability metrics between the central and noncentral chi-square distributions and the standard normal distribution, and we obtain an approximation for the median that improves the lower bound previously obtained by Robert (1990).
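As a baseline for comparison (the refined corrections achieving the $r^{-2}$ error above are not reproduced here), a first-order normal approximation to the noncentral chi-square survival function, using its exact mean $r + \lambda$ and variance $2(r + 2\lambda)$:

```python
import numpy as np
from scipy.stats import ncx2, norm

def sf_normal_approx(x, r, lam):
    """First-order normal approximation to the noncentral chi-square
    survival function, matching the exact mean and variance."""
    mu, sigma = r + lam, np.sqrt(2 * (r + 2 * lam))
    return norm.sf((x - mu) / sigma)

r, lam, x = 8.0, 5.0, 20.0
print("exact :", ncx2.sf(x, r, lam))
print("approx:", sf_normal_approx(x, r, lam))
```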
Across research disciplines, cluster randomized trials (CRTs) are commonly implemented to evaluate interventions delivered to groups of participants, such as communities and clinics. Despite advances in the design and analysis of CRTs, several challenges remain. First, there are many possible ways to specify the intervention effect (e.g., at the individual level or at the cluster level). Second, the theoretical and practical performance of common methods for CRT analysis remains poorly understood. Here, we use causal models to formally define an array of causal effects as summary measures of counterfactual outcomes. Next, we provide a comprehensive overview of well-known CRT estimators, including the t-test and generalized estimating equations (GEE), as well as lesser-known methods, including augmented GEE and targeted maximum likelihood estimation (TMLE). In finite-sample simulations, we illustrate the performance of these estimators and the importance of effect specification, especially when cluster size varies. Finally, our application to data from the Preterm Birth Initiative (PTBi) study demonstrates the real-world importance of selecting an analytic approach that corresponds to the research question. Given its flexibility to estimate a variety of effects and its ability to adjust adaptively for covariates, gaining precision while maintaining Type-I error control, we conclude that TMLE is a promising tool for CRT analysis.
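A minimal sketch of one of the estimators surveyed above, GEE with an exchangeable working correlation, fitted to simulated CRT data with varying cluster sizes; all data-generating numbers are arbitrary placeholders.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for c in range(30):                               # 30 clusters, varying sizes
    treat = c % 2                                 # cluster-level treatment
    cluster_effect = rng.normal(0, 0.3)
    p = 1 / (1 + np.exp(-(-0.5 + 0.4 * treat + cluster_effect)))
    for _ in range(rng.integers(10, 50)):
        rows.append({"cluster": c, "treat": treat, "y": rng.binomial(1, p)})
df = pd.DataFrame(rows)

# GEE accounts for within-cluster correlation via a working covariance model.
model = smf.gee("y ~ treat", groups="cluster", data=df,
                family=sm.families.Binomial(),
                cov_struct=sm.cov_struct.Exchangeable())
print(model.fit().summary().tables[1])
```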