We consider the problem of approximating a $d \times d$ covariance matrix $M$ with a rank-$k$ matrix under $(\varepsilon,\delta)$-differential privacy. We present and analyze a complex variant of the Gaussian mechanism and show that the Frobenius norm of the difference between the matrix output by this mechanism and the best rank-$k$ approximation to $M$ is bounded by roughly $\tilde{O}(\sqrt{kd})$, whenever there is an appropriately large gap between the $k$'th and the $k+1$'th eigenvalues of $M$. This improves on previous work that requires that the gap between every pair of top-$k$ eigenvalues of $M$ is at least $\sqrt{d}$ for a similar bound. Our analysis leverages the fact that the eigenvalues of complex matrix Brownian motion repel more than in the real case, and uses Dyson's stochastic differential equations governing the evolution of its eigenvalues to show that the eigenvalues of the matrix $M$ perturbed by complex Gaussian noise have large gaps with high probability. Our results contribute to the analysis of low-rank approximations under average-case perturbations and to an understanding of eigenvalue gaps for random matrices, which may be of independent interest.
The {\em discrepancy} of a matrix $M \in \mathbb{R}^{d \times n}$ is given by $\mathrm{DISC}(M) := \min_{\boldsymbol{x} \in \{-1,1\}^n} \|M\boldsymbol{x}\|_\infty$. An outstanding conjecture, attributed to Koml\'os, stipulates that $\mathrm{DISC}(M) = O(1)$, whenever $M$ is a Koml\'os matrix, that is, whenever every column of $M$ lies within the unit sphere. Our main result asserts that $\mathrm{DISC}(M + R/\sqrt{d}) = O(d^{-1/2})$ holds asymptotically almost surely, whenever $M \in \mathbb{R}^{d \times n}$ is Koml\'os, $R \in \mathbb{R}^{d \times n}$ is a Rademacher random matrix, $d = \omega(1)$, and $n = \tilde \omega(d^{5/4})$. We conjecture that $n = \omega(d \log d)$ suffices for the same assertion to hold. The factor $d^{-1/2}$ normalising $R$ is essentially best possible.
We consider both the classical and quantum variations of $X$-secure, $E$-eavesdropped and $T$-colluding symmetric private information retrieval (SPIR). This is the first work to study SPIR with $X$-security in classical or quantum variations. We first develop a scheme for classical $X$-secure, $E$-eavesdropped and $T$-colluding SPIR (XSETSPIR) based on a modified version of cross subspace alignment (CSA), which achieves a rate of $R= 1 - \frac{X+\max(T,E)}{N}$. The modified scheme achieves the same rate as the scheme used for $X$-secure PIR with the extra benefit of symmetric privacy. Next, we extend this scheme to its quantum counterpart based on the $N$-sum box abstraction. This is the first work to consider the presence of eavesdroppers in quantum private information retrieval (QPIR). In the quantum variation, the eavesdroppers have better access to information over the quantum channel compared to the classical channel due to the over-the-air decodability. To that end, we develop another scheme specialized to combat eavesdroppers over quantum channels. The scheme proposed for $X$-secure, $E$-eavesdropped and $T$-colluding quantum SPIR (XSETQSPIR) in this work maintains the super-dense coding gain from the shared entanglement between the databases, i.e., achieves a rate of $R_Q = \min\left\{ 1, 2\left(1-\frac{X+\max(T,E)}{N}\right)\right\}$.
Given a matrix $M\in \mathbb{R}^{m\times n}$, the low rank matrix completion problem asks us to find a rank-$k$ approximation of $M$ as $UV^\top$ for $U\in \mathbb{R}^{m\times k}$ and $V\in \mathbb{R}^{n\times k}$ by only observing a few entries specified by a set of entries $\Omega\subseteq [m]\times [n]$. In particular, we examine an approach that is widely used in practice -- the alternating minimization framework. Jain, Netrapalli and Sanghavi~\cite{jns13} showed that if $M$ has incoherent rows and columns, then alternating minimization provably recovers the matrix $M$ by observing a nearly linear in $n$ number of entries. While the sample complexity has been subsequently improved~\cite{glz17}, alternating minimization steps are required to be computed exactly. This hinders the development of more efficient algorithms and fails to depict the practical implementation of alternating minimization, where the updates are usually performed approximately in favor of efficiency. In this paper, we take a major step towards a more efficient and error-robust alternating minimization framework. To this end, we develop an analytical framework for alternating minimization that can tolerate moderate amount of errors caused by approximate updates. Moreover, our algorithm runs in time $\widetilde O(|\Omega| k)$, which is nearly linear in the time to verify the solution while preserving the sample complexity. This improves upon all prior known alternating minimization approaches which require $\widetilde O(|\Omega| k^2)$ time.
We consider an inverse problem $\mathbf{y}= f(\mathbf{Ax})$, where $\mathbf{x}\in\mathbb{R}^n$ is the signal of interest, $\mathbf{A}$ is the sensing matrix, $f$ is a nonlinear function and $\mathbf{y} \in \mathbb{R}^m$ is the measurement vector. In many applications, we have some level of freedom to design the sensing matrix $\mathbf{A}$, and in such circumstances we could optimize $\mathbf{A}$ to achieve better reconstruction performance. As a first step towards optimal design, it is important to understand the impact of the sensing matrix on the difficulty of recovering $\mathbf{x}$ from $\mathbf{y}$. In this paper, we study the performance of one of the most successful recovery methods, i.e., the expectation propagation (EP) algorithm. We define a notion of spikiness for the spectrum of $\bmmathbfA}$ and show the importance of this measure for the performance of EP. We show that whether a spikier spectrum can hurt or help the recovery performance depends on $f$. Based on our framework, we are able to show that, in phase-retrieval problems, matrices with spikier spectrums are better for EP, while in 1-bit compressed sensing problems, less spiky spectrums lead to better performance. Our results unify and substantially generalize existing results that compare Gaussian and orthogonal matrices, and provide a platform towards designing optimal sensing systems.
A $\mu$-constrained Boolean Max-CSP$(\psi)$ instance is a Boolean Max-CSP instance on predicate $\psi:\{0,1\}^r \to \{0,1\}$ where the objective is to find a labeling of relative weight exactly $\mu$ that maximizes the fraction of satisfied constraints. In this work, we study the approximability of constrained Boolean Max-CSPs via SDP hierarchies by relating the integrality gap of Max-CSP $(\psi)$ to its $\mu$-dependent approximation curve. Formally, assuming the Small-Set Expansion Hypothesis, we show that it is NP-hard to approximate $\mu$-constrained instances of Max-CSP($\psi$) up to factor ${\sf Gap}_{\ell,\mu}(\psi)/\log(1/\mu)^2$ (ignoring factors depending on $r$) for any $\ell \geq \ell(\mu,r)$. Here, ${\sf Gap}_{\ell,\mu}(\psi)$ is the optimal integrality gap of $\ell$-round Lasserre relaxation for $\mu$-constrained Max-CSP($\psi$) instances. Our results are derived by combining the framework of Raghavendra [STOC 2008] along with more recent advances in rounding Lasserre relaxations and reductions from the Small-Set Expansion (SSE) problem. A crucial component of our reduction is a novel way of composing generic bias-dependent dictatorship tests with SSE, which could be of independent interest.
Let $\mathcal{T}$ be a set of $n$ flat (planar) semi-algebraic regions in $\mathbb{R}^3$ of constant complexity (e.g., triangles, disks), which we call plates. We wish to preprocess $\mathcal{T}$ into a data structure so that for a query object $\gamma$, which is also a plate, we can quickly answer various intersection queries, such as detecting whether $\gamma$ intersects any plate of $\mathcal{T}$, reporting all the plates intersected by $\gamma$, or counting them. We also consider two simpler cases of this general setting: (i) the input objects are plates and the query objects are constant-degree parametrized algebraic arcs in $\mathbb{R}^3$ (arcs, for short), or (ii) the input objects are arcs and the query objects are plates in $\mathbb{R}^3$. Besides being interesting in their own right, the data structures for these two special cases form the building blocks for handling the general case. By combining the polynomial-partitioning technique with additional tools from real algebraic geometry, we present many different data structures for intersection queries, which also provide trade-offs between their size and query time. For example, if $\mathcal{T}$ is a set of plates and the query objects are algebraic arcs, we obtain a data structure that uses $O^*(n^{4/3})$ storage (where the $O^*(\cdot)$ notation hides subpolynomial factors) and answers an arc-intersection query in $O^*(n^{2/3})$ time. This result is significant since the exponents do not depend on the specific shape of the input and query objects. For a parameter $s\in [n^{4/3}, n^{t_Q}]$ where $t_Q\ge 3$ is the number of real parameters needed to specify a query arc, the query time can be decreased to $O^*((n/s^{1/t_Q})^{\tfrac{2}{3}(1-1/t_Q)})$ by increasing the storage to $O^*(s)$.
Given $n$ observations from two balanced classes, consider the task of labeling an additional $m$ inputs that are known to all belong to \emph{one} of the two classes. Special cases of this problem are well-known: with complete knowledge of class distributions ($n=\infty$) the problem is solved optimally by the likelihood-ratio test; when $m=1$ it corresponds to binary classification; and when $m\approx n$ it is equivalent to two-sample testing. The intermediate settings occur in the field of likelihood-free inference, where labeled samples are obtained by running forward simulations and the unlabeled sample is collected experimentally. In recent work it was discovered that there is a fundamental trade-off between $m$ and $n$: increasing the data sample $m$ reduces the amount $n$ of training/simulation data needed. In this work we (a) introduce a generalization where unlabeled samples come from a mixture of the two classes -- a case often encountered in practice; (b) study the minimax sample complexity for non-parametric classes of densities under \textit{maximum mean discrepancy} (MMD) separation; and (c) investigate the empirical performance of kernels parameterized by neural networks on two tasks: detection of the Higgs boson and detection of planted DDPM generated images amidst CIFAR-10 images. For both problems we confirm the existence of the theoretically predicted asymmetric $m$ vs $n$ trade-off.
We analyze union-find using potential functions motivated by continuous algorithms, and give alternate proofs of the $O(\log\log{n})$, $O(\log^{*}n)$, $O(\log^{**}n)$, and $O(\alpha(n))$ amortized cost upper bounds. The proof of the $O(\log\log{n})$ amortized bound goes as follows. Let each node's potential be the square root of its size, i.e., the size of the subtree rooted from it. The overall potential increase is $O(n)$ because the node sizes increase geometrically along any tree path. When compressing a path, each node on the path satisfies that either its potential decreases by $\Omega(1)$, or its child's size along the path is less than the square root of its size: this can happen at most $O(\log\log{n})$ times along any tree path.
We give an algorithm that takes as input an $n$-vertex graph $G$ and an integer $k$, runs in time $2^{O(k^2)} n^{O(1)}$, and outputs a tree decomposition of $G$ of width at most $k$, if such a decomposition exists. This resolves the long-standing open problem of whether there is a $2^{o(k^3)} n^{O(1)}$ time algorithm for treewidth. In particular, our algorithm is the first improvement on the dependency on $k$ in algorithms for treewidth since the $2^{O(k^3)} n^{O(1)}$ time algorithm given by Bodlaender and Kloks [ICALP 1991] and Lagergren and Arnborg [ICALP 1991]. We also give an algorithm that given an $n$-vertex graph $G$, an integer $k$, and a rational $\varepsilon \in (0,1)$, in time $k^{O(k/\varepsilon)} n^{O(1)}$ either outputs a tree decomposition of $G$ of width at most $(1+\varepsilon)k$ or determines that the treewidth of $G$ is larger than $k$. Prior to our work, no approximation algorithms for treewidth with approximation ratio less than $2$, other than the exact algorithms, were known. Both of our algorithms work in polynomial space.
Chain-of-Though (CoT) prompting has shown promising performance in various reasoning tasks. Recently, Self-Consistency \citep{wang2023selfconsistency} proposes to sample a diverse set of reasoning chains which may lead to different answers while the answer that receives the most votes is selected. In this paper, we propose a novel method to use backward reasoning in verifying candidate answers. We mask a token in the question by ${\bf x}$ and ask the LLM to predict the masked token when a candidate answer is provided by \textit{a simple template}, i.e., ``\textit{\textbf{If we know the answer of the above question is \{a candidate answer\}, what is the value of unknown variable ${\bf x}$?}}'' Intuitively, the LLM is expected to predict the masked token successfully if the provided candidate answer is correct. We further propose FOBAR to combine forward and backward reasoning for estimating the probability of candidate answers. We conduct extensive experiments on six data sets and three LLMs. Experimental results demonstrate that FOBAR achieves state-of-the-art performance on various reasoning benchmarks.