The Super-SAT or SSAT problem was introduced by Dinur et al.(2002,2003) to prove the NP-hardness of approximation of two popular lattice problems - Shortest Vector Problem(SVP) and Closest Vector Problem(CVP). They conjectured that SSAT is NP-hard to approximate to within a factor of $n^c$ ($c>0$ is constant), where $n$ is the size of the SSAT instance. In this paper we prove this conjecture assuming the Projection Games Conjecture(PGC), given by Moshkovitz (2012). This implies hardness of approximation of SVP and CVP within polynomial factors, assuming PGC. We also reduce SSAT to the Nearest Codeword Problem(NCP) and Learning Halfspace Problem(LHP), as considered by Arora et al.(1997). This proves that both these problems are NP-hard to approximate within a factor of $N^{c'/\log\log n}$($c'>0$ is constant) where $N$ is the size of the instances of the respective problems. Assuming PGC these problems are proved to be NP-hard to approximate within polynomial factors.
We revisit the satisfiability problem for two-variable logic, denoted by SAT(FO2), which is known to be NEXP-complete. The upper bound is usually derived from its well known Exponential Size Model (ESM) property. Whether it can be determinized efficiently is still an open question. In this paper we present a different approach by reducing it to a novel graph-theoretic problem that we call Conditional Independent Set (CIS). We show that CIS is NP-complete and present two simple algorithms for it with run time O(1.4423^n) and O(1.6181^n), where n is the number of vertices in the graph. We also show that unless the "Strong Exponential Time Hypothesis" (SETH) fails, there is no algorithm for CIS with run time O(1.4141^n). We show that without the equality predicate SAT(FO2) is in fact equivalent to CIS in succinct representation. This yields two algorithms for SAT(FO2) without the equality predicate with run time O(1.4423^{2^n}) and O(1.6181^{2^n}), where n is the number of predicates. To the best of our knowledge, these are the first exact algorithms for an NEXP-complete decidable logic with run time significantly lower than O(2^{2^n}). We also identify a few lower complexity fragments of FO2 which correspond to the tractable fragments of CIS. Similar to CIS, unless SETH fails, there is no algorithm for SAT(FO2) with run time O(1.4141^{2^n}). For the fragment with the equality predicate, we present a linear time many-one reduction to the fragment without the equality predicate. The reduction yields equi-satisfiable formulas with a small constant blow-up in the number of predicates. Finally, we also perform some small experiments which show that our approach is indeed more promising than the existing method (based on the ESM property). The experiments also show that although theoretically it has the worse run time, the second algorithm in general performs better than the first one.
We present a new data structure to approximate accurately and efficiently a polynomial $f$ of degree $d$ given as a list of coefficients. Its properties allow us to improve the state-of-the-art bounds on the bit complexity for the problems of root isolation and approximate multipoint evaluation. This data structure also leads to a new geometric criterion to detect ill-conditioned polynomials, implying notably that the standard condition number of the zeros of a polynomial is at least exponential in the number of roots of modulus less than $1/2$ or greater than $2$.Given a polynomial $f$ of degree $d$ with $\|f\|_1 \leq 2^\tau$ for $\tau \geq 1$, isolating all its complex roots or evaluating it at $d$ points can be done with a quasi-linear number of arithmetic operations. However, considering the bit complexity, the state-of-the-art algorithms require at least $d^{3/2}$ bit operations even for well-conditioned polynomials and when the accuracy required is low. Given a positive integer $m$, we can compute our new data structure and evaluate $f$ at $d$ points in the unit disk with an absolute error less than $2^{-m}$ in $\widetilde O(d(\tau+m))$ bit operations, where $\widetilde O(\cdot)$ means that we omit logarithmic factors. We also show that if $\kappa$ is the absolute condition number of the zeros of $f$, then we can isolate all the roots of $f$ in $\widetilde O(d(\tau + \log \kappa))$ bit operations. Moreover, our algorithms are simple to implement. For approximating the complex roots of a polynomial, we implemented a small prototype in \verb|Python/NumPy| that is an order of magnitude faster than the state-of-the-art solver \verb/MPSolve/ for high degree polynomials with random coefficients.
We study the Maximum Independent Set (MIS) problem under the notion of stability introduced by Bilu and Linial (2010): a weighted instance of MIS is $\gamma$-stable if it has a unique optimal solution that remains the unique optimum under multiplicative perturbations of the weights by a factor of at most $\gamma\geq 1$. The goal then is to efficiently recover the unique optimal solution. In this work, we solve stable instances of MIS on several graphs classes: we solve $\widetilde{O}(\Delta/\sqrt{\log \Delta})$-stable instances on graphs of maximum degree $\Delta$, $(k - 1)$-stable instances on $k$-colorable graphs and $(1 + \varepsilon)$-stable instances on planar graphs. For general graphs, we present a strong lower bound showing that there are no efficient algorithms for $O(n^{\frac{1}{2} - \varepsilon})$-stable instances of MIS, assuming the planted clique conjecture. We also give an algorithm for $(\varepsilon n)$-stable instances. As a by-product of our techniques, we give algorithms and lower bounds for stable instances of Node Multiway Cut. Furthermore, we prove a general result showing that the integrality gap of convex relaxations of several maximization problems reduces dramatically on stable instances. Moreover, we initiate the study of certified algorithms, a notion recently introduced by Makarychev and Makarychev (2018), which is a class of $\gamma$-approximation algorithms that satisfy one crucial property: the solution returned is optimal for a perturbation of the original instance. We obtain $\Delta$-certified algorithms for MIS on graphs of maximum degree $\Delta$, and $(1+\varepsilon)$-certified algorithms on planar graphs. Finally, we analyze the algorithm of Berman and Furer (1994) and prove that it is a $\left(\frac{\Delta + 1}{3} + \varepsilon\right)$-certified algorithm for MIS on graphs of maximum degree $\Delta$ where all weights are equal to 1.
It is well-known that an algorithm exists which approximates the NP-complete problem of Set Cover within a factor of ln(n), and it was recently proven that this approximation ratio is optimal unless P = NP. This optimality result is the product of many advances in characterizations of NP, in terms of interactive proof systems and probabilistically checkable proofs (PCP), and improvements to the analyses thereof. However, as a result, it is difficult to extract the development of Set Cover approximation bounds from the greater scope of proof system analysis. This paper attempts to present a chronological progression of results on lower-bounding the approximation ratio of Set Cover. We analyze a series of proofs of progressively better bounds and unify the results under similar terminologies and frameworks to provide an accurate comparison of proof techniques and their results. We also treat many preliminary results as black-boxes to better focus our analysis on the core reductions to Set Cover instances. The result is alternative versions of several hardness proofs, beginning with initial inapproximability results and culminating in a version of the proof that ln(n) is a tight lower bound.
Given a simple graph $G$ and an integer $k$, the goal of $k$-Clique problem is to decide if $G$ contains a complete subgraph of size $k$. We say an algorithm approximates $k$-Clique within a factor $g(k)$ if it can find a clique of size at least $k / g(k)$ when $G$ is guaranteed to have a $k$-clique. Recently, it was shown that approximating $k$-Clique within a constant factor is W[1]-hard [Lin21]. We study the approximation of $k$-Clique under the Exponential Time Hypothesis (ETH). The reduction of [Lin21] already implies an $n^{\Omega(\sqrt[6]{\log k})}$-time lower bound under ETH. We improve this lower bound to $n^{\Omega(\log k)}$. Using the gap-amplification technique by expander graphs, we also prove that there is no $k^{o(1)}$ factor FPT-approximation algorithm for $k$-Clique under ETH. We also suggest a new way to prove the Parameterized Inapproximability Hypothesis (PIH) under ETH. We show that if there is no $n^{O(\frac{k}{\log k})}$ algorithm to approximate $k$-Clique within a constant factor, then PIH is true.
We consider the twin problems of estimating the effective rank and the Schatten norms $\|{\bf A}\|_{s}$ of a rectangular $p\times q$ matrix ${\bf A}$ from noisy observations. When $s$ is an even integer, we introduce a polynomial-time estimator of $\|{\bf A}\|_s$ that achieves the minimax rate $(pq)^{1/4}$. Interestingly, this optimal rate does not depend on the underlying rank of the matrix. When $s$ is not an even integer, the optimal rate is much slower. A simple thresholding estimator of the singular values achieves the rate $(q\wedge p)(pq)^{1/4}$, which turns out to be optimal up to a logarithmic multiplicative term. The tight minimax rate is achieved by a more involved polynomial approximation method. This allows us to build estimators for a class of effective rank indices. As a byproduct, we also characterize the minimax rate for estimating the sequence of singular values of a matrix.
The estimation of information measures of continuous distributions based on samples is a fundamental problem in statistics and machine learning. In this paper, we analyze estimates of differential entropy in $K$-dimensional Euclidean space, computed from a finite number of samples, when the probability density function belongs to a predetermined convex family $\mathcal{P}$. First, estimating differential entropy to any accuracy is shown to be infeasible if the differential entropy of densities in $\mathcal{P}$ is unbounded, clearly showing the necessity of additional assumptions. Subsequently, we investigate sufficient conditions that enable confidence bounds for the estimation of differential entropy. In particular, we provide confidence bounds for simple histogram based estimation of differential entropy from a fixed number of samples, assuming that the probability density function is Lipschitz continuous with known Lipschitz constant and known, bounded support. Our focus is on differential entropy, but we provide examples that show that similar results hold for mutual information and relative entropy as well.
This paper studies distributed binary test of statistical independence under communication (information bits) constraints. While testing independence is very relevant in various applications, distributed independence test is particularly useful for event detection in sensor networks where data correlation often occurs among observations of devices in the presence of a signal of interest. By focusing on the case of two devices because of their tractability, we begin by investigating conditions on Type I error probability restrictions under which the minimum Type II error admits an exponential behavior with the sample size. Then, we study the finite sample-size regime of this problem. We derive new upper and lower bounds for the gap between the minimum Type II error and its exponential approximation under different setups, including restrictions imposed on the vanishing Type I error probability. Our theoretical results shed light on the sample-size regimes at which approximations of the Type II error probability via error exponents became informative enough in the sense of predicting well the actual error probability. We finally discuss an application of our results where the gap is evaluated numerically, and we show that exponential approximations are not only tractable but also a valuable proxy for the Type II probability of error in the finite-length regime.
In order to avoid the curse of dimensionality, frequently encountered in Big Data analysis, there was a vast development in the field of linear and nonlinear dimension reduction techniques in recent years. These techniques (sometimes referred to as manifold learning) assume that the scattered input data is lying on a lower dimensional manifold, thus the high dimensionality problem can be overcome by learning the lower dimensionality behavior. However, in real life applications, data is often very noisy. In this work, we propose a method to approximate $\mathcal{M}$ a $d$-dimensional $C^{m+1}$ smooth submanifold of $\mathbb{R}^n$ ($d \ll n$) based upon noisy scattered data points (i.e., a data cloud). We assume that the data points are located "near" the lower dimensional manifold and suggest a non-linear moving least-squares projection on an approximating $d$-dimensional manifold. Under some mild assumptions, the resulting approximant is shown to be infinitely smooth and of high approximation order (i.e., $O(h^{m+1})$, where $h$ is the fill distance and $m$ is the degree of the local polynomial approximation). The method presented here assumes no analytic knowledge of the approximated manifold and the approximation algorithm is linear in the large dimension $n$. Furthermore, the approximating manifold can serve as a framework to perform operations directly on the high dimensional data in a computationally efficient manner. This way, the preparatory step of dimension reduction, which induces distortions to the data, can be avoided altogether.
Many resource allocation problems in the cloud can be described as a basic Virtual Network Embedding Problem (VNEP): finding mappings of request graphs (describing the workloads) onto a substrate graph (describing the physical infrastructure). In the offline setting, the two natural objectives are profit maximization, i.e., embedding a maximal number of request graphs subject to the resource constraints, and cost minimization, i.e., embedding all requests at minimal overall cost. The VNEP can be seen as a generalization of classic routing and call admission problems, in which requests are arbitrary graphs whose communication endpoints are not fixed. Due to its applications, the problem has been studied intensively in the networking community. However, the underlying algorithmic problem is hardly understood. This paper presents the first fixed-parameter tractable approximation algorithms for the VNEP. Our algorithms are based on randomized rounding. Due to the flexible mapping options and the arbitrary request graph topologies, we show that a novel linear program formulation is required. Only using this novel formulation the computation of convex combinations of valid mappings is enabled, as the formulation needs to account for the structure of the request graphs. Accordingly, to capture the structure of request graphs, we introduce the graph-theoretic notion of extraction orders and extraction width and show that our algorithms have exponential runtime in the request graphs' maximal width. Hence, for request graphs of fixed extraction width, we obtain the first polynomial-time approximations. Studying the new notion of extraction orders we show that (i) computing extraction orders of minimal width is NP-hard and (ii) that computing decomposable LP solutions is in general NP-hard, even when restricting request graphs to planar ones.