We prove new lower bounds for statistical estimation tasks under the constraint of $(\varepsilon, \delta)$-differential privacy. First, we provide tight lower bounds for private covariance estimation of Gaussian distributions. We show that estimating the covariance matrix in Frobenius norm requires $\Omega(d^2)$ samples, and in spectral norm requires $\Omega(d^{3/2})$ samples, both matching upper bounds up to logarithmic factors. We prove these bounds via our main technical contribution, a broad generalization of the fingerprinting method to exponential families. Additionally, using the private Assouad method of Acharya, Sun, and Zhang, we show a tight $\Omega(d/(\alpha^2 \varepsilon))$ lower bound for estimating the mean of a distribution with bounded covariance to $\alpha$-error in $\ell_2$-distance. Previously known lower bounds for all these problems were either polynomially weaker or held under the stricter condition of $(\varepsilon,0)$-differential privacy.
We study the relationship between adversarial robustness and differential privacy in high-dimensional algorithmic statistics. We give the first black-box reduction from privacy to robustness which can produce private estimators with optimal tradeoffs among sample complexity, accuracy, and privacy for a wide range of fundamental high-dimensional parameter estimation problems, including mean and covariance estimation. We show that this reduction can be implemented in polynomial time in some important special cases. In particular, using nearly-optimal polynomial-time robust estimators for the mean and covariance of high-dimensional Gaussians which are based on the Sum-of-Squares method, we design the first polynomial-time private estimators for these problems with nearly-optimal sample-accuracy-privacy tradeoffs. Our algorithms are also robust to a constant fraction of adversarially-corrupted samples.
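To make the robustness-to-privacy recipe concrete, here is a minimal Python sketch of its simplest instance: stabilize the estimate (here by the crude step of projecting samples into a ball of assumed radius $R$), then add Gaussian noise calibrated to the resulting sensitivity. All parameter choices are hypothetical, and clipping merely stands in for the far more general robust estimators the reduction accepts; this is not the paper's algorithm.

```python
import numpy as np

def private_clipped_mean(X, R, eps, delta, rng=None):
    """Gaussian-mechanism mean estimate after projecting each sample into
    the ball of radius R.  Clipping is the crudest stability step, standing
    in for a genuine robust estimator; R, eps, delta are assumed given."""
    rng = rng or np.random.default_rng()
    n, d = X.shape
    norms = np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    Xc = X * np.minimum(1.0, R / norms)         # rows now lie in the R-ball
    sens = 2.0 * R / n                          # l2 sensitivity of the clipped mean
    sigma = sens * np.sqrt(2.0 * np.log(1.25 / delta)) / eps  # classical Gaussian
    return Xc.mean(axis=0) + rng.normal(0.0, sigma, size=d)   # mechanism (eps <= 1)

# Usage: 1000 samples in R^20 whose true mean lies inside the clipping ball.
rng = np.random.default_rng(0)
X = rng.normal(loc=1.0, scale=1.0, size=(1000, 20))
mu_hat = private_clipped_mean(X, R=10.0, eps=1.0, delta=1e-6, rng=rng)
```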
In a mixed generalized linear model, the objective is to learn multiple signals from unlabeled observations: each sample comes from exactly one signal, but it is not known which one. We consider the prototypical problem of estimating two statistically independent signals in a mixed generalized linear model with Gaussian covariates. Spectral methods are a popular class of estimators which output the top two eigenvectors of a suitable data-dependent matrix. However, despite their wide applicability, their design is still obtained via heuristic considerations, and the number of samples $n$ needed to guarantee recovery is super-linear in the signal dimension $d$. In this paper, we develop exact asymptotics for spectral methods in the challenging proportional regime in which $n, d$ grow large and their ratio converges to a finite constant. This allows us to optimize the design of the spectral method and to combine it with a simple linear estimator, in order to minimize the estimation error. Our characterization exploits a mix of tools from random matrix theory, free probability, and the theory of approximate message passing algorithms. Numerical simulations for mixed linear regression and phase retrieval demonstrate the advantage enabled by our analysis over existing designs of spectral methods.
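The generic spectral pipeline described above is easy to state in code. The sketch below uses the common default preprocessing $T(y) = y^2$ for mixed linear regression; this is an assumed heuristic choice rather than the optimized design that the exact asymptotics deliver.

```python
import numpy as np

def spectral_estimate(X, y, T=lambda y: y**2):
    """Generic spectral estimator for a mixed GLM: top-2 eigenvectors of
    D = (1/n) sum_i T(y_i) x_i x_i^T.  T is the tunable preprocessing
    function whose optimal choice is what the exact asymptotics enable."""
    n, d = X.shape
    D = (X * T(y)[:, None]).T @ X / n
    vals, vecs = np.linalg.eigh(D)              # ascending eigenvalues
    return vecs[:, -2:]                         # two leading eigenvectors

# Mixed linear regression: each response is generated by one of two signals.
rng = np.random.default_rng(1)
n, d = 4000, 500
b1, b2 = np.zeros(d), np.zeros(d)
b1[0], b2[1] = 1.0, 1.0                         # two orthogonal unit signals
X = rng.standard_normal((n, d))
pick = rng.integers(0, 2, size=n)               # latent label per sample
y = np.where(pick == 0, X @ b1, X @ b2) + 0.1 * rng.standard_normal(n)
V = spectral_estimate(X, y)                     # estimates span{b1, b2}
```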
We present new methods for assessing the privacy guarantees of an algorithm with regard to R\'enyi Differential Privacy. To the best of our knowledge, this work is the first to address this problem in a black-box scenario, where only algorithmic outputs are available. To quantify privacy leakage, we devise a new estimator for the R\'enyi divergence of a pair of output distributions. This estimator is transformed into a statistical lower bound that is proven to hold for large samples with high probability. Our method is applicable to a broad class of algorithms, including many well-known examples from the privacy literature. We demonstrate the effectiveness of our approach through experiments encompassing algorithms and privacy-enhancing methods that have not been considered in related works.
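As a point of reference for the black-box setting, the following is a minimal histogram plug-in estimator of the R\'enyi divergence between two sampled output distributions. It is a hypothetical baseline, not the refined estimator or the high-probability lower bound developed in the paper.

```python
import numpy as np

def renyi_divergence_plugin(out_p, out_q, alpha, bins=40):
    """Histogram plug-in estimate of the order-alpha Renyi divergence
    D_alpha(P||Q) = log( sum_k p_k^alpha q_k^(1-alpha) ) / (alpha - 1)
    between two output distributions, from samples only."""
    edges = np.histogram_bin_edges(np.concatenate([out_p, out_q]), bins=bins)
    p, _ = np.histogram(out_p, bins=edges)
    q, _ = np.histogram(out_q, bins=edges)
    # Add-one smoothing keeps every bin positive so the plug-in stays finite.
    p = (p + 1.0) / (p.sum() + bins)
    q = (q + 1.0) / (q.sum() + bins)
    return np.log(np.sum(p**alpha * q**(1.0 - alpha))) / (alpha - 1.0)

# Black-box audit sketch: outputs of a Gaussian mechanism on two adjacent
# inputs whose query values differ by 1 (sensitivity 1, sigma = 1).
rng = np.random.default_rng(0)
out_p = rng.normal(0.0, 1.0, 200_000)   # mechanism run on dataset D
out_q = rng.normal(1.0, 1.0, 200_000)   # mechanism run on neighbour D'
print(renyi_divergence_plugin(out_p, out_q, alpha=2.0))
# Exact Renyi-DP value here is alpha / (2 sigma^2) = 1.0.
```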
We investigate the problem of recovering a partially observed high-rank matrix whose columns obey a nonlinear structure, such as lying in a union of subspaces, belonging to an algebraic variety, or being grouped in clusters. The recovery problem is formulated as the rank minimization of a nonlinear feature map applied to the original matrix, which is then further approximated by a constrained non-convex optimization problem involving the Grassmann manifold. We propose two sets of algorithms, one arising from Riemannian optimization and the other as an alternating minimization scheme, both of which include first- and second-order variants. Both sets of algorithms have theoretical guarantees. In particular, for the alternating minimization, we establish global convergence and worst-case complexity bounds. Additionally, using the Kurdyka-Lojasiewicz property, we show that the alternating minimization converges to a unique limit point. We provide extensive numerical results for the recovery of unions of subspaces and clustering under entry sampling and dense Gaussian sampling. Our methods are competitive with existing approaches; in particular, the Riemannian second-order methods achieve high recovery accuracy.
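The central idea, that a nonlinear feature map sends structured high-rank columns to a low-rank matrix, can be checked in a few lines. The sketch below uses a degree-2 monomial lift and columns from a union of two subspaces; both choices are illustrative assumptions, as the paper handles general feature maps and structures.

```python
import numpy as np
from itertools import combinations_with_replacement

def monomial_features(M, degree=2):
    """Degree-2 monomial feature map applied column-wise: each column x is
    replaced by all products x_i * x_j with i <= j.  One concrete choice of
    nonlinear lift; the paper works with general feature maps."""
    d, n = M.shape
    pairs = list(combinations_with_replacement(range(d), degree))
    return np.array([[np.prod(M[list(p), j]) for j in range(n)] for p in pairs])

# Columns drawn from a union of two 2-dim subspaces of R^6: the matrix has
# rank 4, but its lift lives in a 6-dim subspace of the 21-dim lifted space,
# which is exactly the low-rank structure the rank minimization exploits.
rng = np.random.default_rng(4)
U1, U2 = rng.standard_normal((6, 2)), rng.standard_normal((6, 2))
cols = np.hstack([U1 @ rng.standard_normal((2, 40)),
                  U2 @ rng.standard_normal((2, 40))])
lifted = monomial_features(cols)                        # shape (21, 80)
print(np.linalg.matrix_rank(cols), np.linalg.matrix_rank(lifted))  # 4, 6
```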
We show that there is no $2^{o(k^2)} n^{O(1)}$ time algorithm for Independent Set on $n$-vertex graphs with rank-width $k$, unless the Exponential Time Hypothesis (ETH) fails. Our lower bound matches the $2^{O(k^2)} n^{O(1)}$ time algorithm given by Bui-Xuan, Telle, and Vatshelle [Discret. Appl. Math., 2010] and it answers the open question of Bergougnoux and Kant\'{e} [SIAM J. Discret. Math., 2021]. We also show that the known $2^{O(k^2)} n^{O(1)}$ time algorithms for Weighted Dominating Set, Maximum Induced Matching and Feedback Vertex Set parameterized by rank-width $k$ are optimal assuming ETH. Our results are the first tight ETH lower bounds parameterized by rank-width that do not follow directly from lower bounds for $n$-vertex graphs.
Quantum algorithms for optimization problems are of general interest. Despite recent progress in classical lower bounds for nonconvex optimization under different settings and quantum lower bounds for convex optimization, quantum lower bounds for nonconvex optimization are still widely open. In this paper, we conduct a systematic study of quantum query lower bounds for finding $\epsilon$-approximate stationary points of nonconvex functions, and we consider the following two important settings: 1) having access to $p$-th order derivatives; or 2) having access to stochastic gradients. The classical query lower bound is $\Omega\big(\epsilon^{-\frac{1+p}{p}}\big)$ in the first setting, and $\Omega(\epsilon^{-4})$ in the second (or $\Omega(\epsilon^{-3})$ if the stochastic gradient function is mean-squared smooth). We extend all these classical lower bounds to the quantum setting. They match the corresponding classical algorithmic results, demonstrating that there is no quantum speedup for finding $\epsilon$-stationary points of nonconvex functions with $p$-th order derivative inputs or stochastic gradient inputs, with or without the mean-squared smoothness assumption. Technically, our quantum lower bounds are obtained by showing that the sequential nature of the classical hard instances in all these settings carries over to quantum queries, preventing any quantum speedup beyond revealing information about the stationary points sequentially.
We investigate an anisotropic weakly over-penalised symmetric interior penalty method for the Stokes equation. The method is a discontinuous Galerkin method that is simple and similar to the Crouzeix--Raviart finite element method. The main contribution of this paper is a new proof for the consistency term, which allows us to obtain an anisotropic consistency error estimate. The idea of the proof is to use the relation between the Raviart--Thomas finite element space and the discontinuous space. In many papers, inf-sup stable discontinuous Galerkin schemes have been analysed on shape-regular mesh partitions. Our result shows that the Stokes pair treated in this paper satisfies the inf-sup condition on anisotropic meshes. Furthermore, we show an error estimate in an energy norm on anisotropic meshes. In numerical experiments, we compare the results obtained on standard and anisotropic mesh partitions. The effectiveness of using anisotropic meshes can be confirmed for problems with boundary layers.
Popular iterative algorithms such as boosting methods and coordinate descent on linear models converge to the maximum $\ell_1$-margin classifier, a.k.a. the sparse hard-margin SVM, in high-dimensional regimes where the data is linearly separable. Previous works consistently show that many estimators relying on the $\ell_1$-norm achieve improved statistical rates for hard-sparse ground truths. We show that, surprisingly, this adaptivity does not apply to the maximum $\ell_1$-margin classifier in a standard discriminative setting. In particular, for the noiseless setting, we prove tight upper and lower bounds for the prediction error that match the existing rates of order $\frac{\|w^*\|_1^{2/3}}{n^{1/3}}$ for general ground truths $w^*$. To complete the picture, we show that when interpolating noisy observations, the error vanishes at a rate of order $\frac{1}{\sqrt{\log(d/n)}}$. We are therefore the first to show benign overfitting for the maximum $\ell_1$-margin classifier.
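For concreteness, the maximum $\ell_1$-margin classifier studied above can be computed directly as a linear program: maximize the margin over the $\ell_1$-ball, with $w$ split into positive and negative parts. The sketch below uses scipy's LP solver on synthetic separable data; it illustrates the estimator itself, not the paper's error analysis.

```python
import numpy as np
from scipy.optimize import linprog

def max_l1_margin(X, y):
    """Maximum l1-margin classifier: argmax_{||w||_1 <= 1} min_i y_i <x_i, w>,
    solved as an LP with the substitution w = u - v, u, v >= 0."""
    n, d = X.shape
    # Variables z = [u (d), v (d), t]; maximize t  <=>  minimize -t.
    c = np.zeros(2 * d + 1)
    c[-1] = -1.0
    # Margin constraints: t - y_i <x_i, u - v> <= 0 for every sample i.
    S = y[:, None] * X
    A_margin = np.hstack([-S, S, np.ones((n, 1))])
    # l1-ball constraint: sum(u) + sum(v) <= 1.
    A_l1 = np.concatenate([np.ones(2 * d), [0.0]])[None, :]
    A_ub = np.vstack([A_margin, A_l1])
    b_ub = np.concatenate([np.zeros(n), [1.0]])
    bounds = [(0, None)] * (2 * d) + [(None, None)]   # t is free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:d] - res.x[d:2 * d]                 # recover w = u - v

# Separable data (n < d) generated by a hard-sparse ground truth.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 200))
w_star = np.zeros(200); w_star[:3] = 1.0
y = np.sign(X @ w_star)
w_hat = max_l1_margin(X, y)
```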
An optimal error estimate depending only polynomially on $\varepsilon^{-1}$ is established for the temporal semi-discrete scheme of the Cahn-Hilliard equation based on the scalar auxiliary variable (SAV) formulation. The key to our analysis is to convert the structure of the SAV time-stepping scheme back to a form compatible with the original form of the Cahn-Hilliard equation, which makes it feasible to use spectral estimates to handle the nonlinear term. Based on this transformation of the SAV numerical scheme, the optimal error estimate for the temporal semi-discrete scheme, which depends only on a low polynomial order of $\varepsilon^{-1}$ rather than an exponential order, is derived using mathematical induction, spectral arguments, and the superconvergence properties of some nonlinear terms. Numerical examples are provided to illustrate the discrete energy decay property and to validate our theoretical convergence analysis.
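For readers unfamiliar with the SAV formulation, below is a minimal sketch of one first-order SAV step for a 1D Cahn-Hilliard equation with periodic boundary conditions, discretized in space by FFT. All parameter values are hypothetical, and this is the standard first-order scheme rather than the specific scheme analysed in the paper.

```python
import numpy as np

# Hypothetical setup: double-well potential F(phi) = (phi^2 - 1)^2 / 4,
# auxiliary variable r = sqrt(E1(phi) + C0) with E1 the bulk energy.
N, L = 256, 2 * np.pi
dx = L / N
x = dx * np.arange(N)
k = 2 * np.pi * np.fft.fftfreq(N, d=dx)        # Fourier wavenumbers
eps, dt, C0 = 0.1, 1e-3, 1.0

def E1(phi):
    """Bulk part of the energy, the integral of F(phi)."""
    return dx * np.sum(0.25 * (phi**2 - 1)**2)

def sav_step(phi, r):
    """One first-order SAV step; returns (phi_next, r_next)."""
    b = (phi**3 - phi) / np.sqrt(E1(phi) + C0)   # F'(phi) / sqrt(E1 + C0)
    # Linear solve: (I + dt*eps^2*Lap^2)^{-1} is diagonal in Fourier space.
    inv_A = 1.0 / (1.0 + dt * eps**2 * k**4)
    p = np.fft.ifft(inv_A * np.fft.fft(phi)).real              # A^{-1} phi
    q = dt * np.fft.ifft(inv_A * (-k**2) * np.fft.fft(b)).real # dt A^{-1} Lap b
    # r_next enters linearly, so it solves a scalar equation in closed form.
    inner = lambda u, v: dx * np.sum(u * v)
    r_next = (r + 0.5 * inner(b, p - phi)) / (1.0 - 0.5 * inner(b, q))
    return p + r_next * q, r_next

phi = 0.05 * np.cos(x)                          # small initial perturbation
r = np.sqrt(E1(phi) + C0)
for _ in range(1000):
    phi, r = sav_step(phi, r)
    # Modified energy (eps^2/2)*||grad phi||^2 + r^2 decays at every step.
```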
We propose a new wavelet-based method for density estimation when the data are size-biased. More specifically, we consider a power of the density of interest, where this power exceeds 1/2. Warped wavelet bases are employed, where warping is attained by some continuous cumulative distribution function. A special case is the conventional orthonormal wavelet estimation, where the warping distribution is the standard continuous uniform. We show that both linear and nonlinear wavelet estimators are consistent, with optimal and/or near-optimal rates. Monte Carlo simulations are performed to compare four special settings that are easy to interpret in practice. An application to a real dataset on fatal traffic accidents involving alcohol illustrates the method. We observe that warped bases provide more flexible and superior estimates for both simulated and real data. Moreover, we find that estimating the power of a density (for instance, its square root) further improves the results.
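A minimal sketch of the warped-wavelet idea for size-biased data follows, using the Haar basis at a fixed resolution (where the linear estimator reduces to a weighted histogram of the warped samples) and inverse weights to undo a length bias. The Gamma/exponential choices are hypothetical, and the sketch omits the powered-density and nonlinear-thresholding refinements of the paper.

```python
import numpy as np
from scipy import stats

def warped_haar_density(x_obs, w, G, g, level=5):
    """Linear warped-wavelet density estimate from size-biased data.
    Warping by the cdf G maps data to [0, 1], where a level-j Haar
    expansion is a 2^j-bin histogram; inverse weights 1/w undo the
    size bias.  Back-transform: f(x) = f_U(G(x)) * g(x) with g = G'."""
    u = G(x_obs)                              # warped samples in [0, 1)
    wts = 1.0 / w(x_obs)
    wts = wts / wts.sum()                     # normalized debiasing weights
    m = 2 ** level
    idx = np.minimum((u * m).astype(int), m - 1)
    coef = np.bincount(idx, weights=wts, minlength=m) * m  # density on [0, 1]
    def f_hat(x):
        j = np.minimum((G(x) * m).astype(int), m - 1)
        return coef[j] * g(x)
    return f_hat

# Length bias w(x) = x: Gamma(3) draws are exactly length-biased Gamma(2)
# draws, so the estimator should recover the Gamma(2) density.
rng = np.random.default_rng(3)
x_biased = stats.gamma.rvs(a=3, size=5000, random_state=rng)
f_hat = warped_haar_density(x_biased, w=lambda x: x,
                            G=stats.expon.cdf, g=stats.expon.pdf, level=5)
grid = np.linspace(0.1, 10.0, 200)
est, truth = f_hat(grid), stats.gamma.pdf(grid, a=2)   # compare pointwise
```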