We prove new lower bounds for statistical estimation tasks under the constraint of $(\varepsilon, \delta)$-differential privacy. First, we provide tight lower bounds for private covariance estimation of Gaussian distributions. We show that estimating the covariance matrix in Frobenius norm requires $\Omega(d^2)$ samples, and in spectral norm requires $\Omega(d^{3/2})$ samples, both matching upper bounds up to logarithmic factors. The latter bound verifies the existence of a conjectured statistical gap between the private and the non-private sample complexities for spectral estimation of Gaussian covariances. We prove these bounds via our main technical contribution, a broad generalization of the fingerprinting method to exponential families. Additionally, using the private Assouad method of Acharya, Sun, and Zhang, we show a tight $\Omega(d/(\alpha^2 \varepsilon))$ lower bound for estimating the mean of a distribution with bounded covariance to $\alpha$-error in $\ell_2$-distance. Previously known lower bounds for all these problems were either polynomially weaker or held under the stricter condition of $(\varepsilon,0)$-differential privacy.
We consider a standard two-source model for uniform common randomness (UCR) generation, in which Alice and Bob observe independent and identically distributed (i.i.d.) samples of a correlated finite source and in which Alice is allowed to send information to Bob over an arbitrary single-user channel. We study the \(\boldsymbol{\epsilon}\)-UCR capacity for the proposed model, defined as the maximum common randomness rate one can achieve such that the probability that Alice and Bob do not agree on a common uniform or nearly uniform random variable does not exceed \(\boldsymbol{\epsilon}\). We establish a lower and an upper bound on the \(\boldsymbol{\epsilon}\)-UCR capacity using the bounds on the \(\boldsymbol{\epsilon}\)-transmission capacity proved by Verd\'u and Han for arbitrary point-to-point channels.
We investigate random processes for generating task-dependency graphs of order $n$ with $m$ edges and a specified number of initial vertices and terminal vertices. To do so, we consider two random processes that can be combined to accomplish this task. In the $(x, y)$ edge-removal process, we start with a maximally connected task-dependency graph and remove edges uniformly at random as long as they do not cause the number of initial vertices to exceed $x$ or the number of terminal vertices to exceed $y$. In the $(x, y)$ edge-addition process, we start with an empty task-dependency graph and add edges uniformly at random as long as they do not cause the number of initial vertices to be less than $x$ or the number of terminal vertices to be less than $y$, halting once there are exactly $x$ initial vertices and $y$ terminal vertices. For both processes, we determine the values of $x$ and $y$ for which the resulting task-dependency graph is guaranteed to have exactly $x$ initial vertices and $y$ terminal vertices, and we also find the extremal values for the number of edges in the resulting task-dependency graphs as a function of $x$, $y$, and the number of vertices. Furthermore, we asymptotically bound the expected number of edges in the resulting task-dependency graphs. Finally, we define a random process using only edge-addition and edge-removal, and we show that with high probability this random process generates an $(x, y)$ task-dependency graph of order $n$ with $m$ edges.
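As a concrete illustration, the following minimal Python sketch simulates the $(x, y)$ edge-addition process, under the assumption (ours, for concreteness) that task-dependency graphs are DAGs on ordered vertices with edges oriented from lower to higher index; the function and variable names are hypothetical.

```python
import random

def edge_addition_process(n, x, y, seed=None):
    """Sketch of the (x, y) edge-addition process: start from the empty
    graph on vertices 0..n-1 (edges oriented low -> high index) and scan
    candidate edges in uniformly random order, skipping any edge that
    would make the number of initial vertices (in-degree 0) drop below x
    or the number of terminal vertices (out-degree 0) drop below y."""
    rng = random.Random(seed)
    candidates = [(i, j) for i in range(n) for j in range(i + 1, n)]
    rng.shuffle(candidates)
    indeg, outdeg = [0] * n, [0] * n
    edges = []
    num_initial = num_terminal = n  # in the empty graph, every vertex is both
    for i, j in candidates:
        if num_initial == x and num_terminal == y:
            break  # halting rule: exactly x initial and y terminal vertices
        loses_initial = indeg[j] == 0    # j would stop being initial
        loses_terminal = outdeg[i] == 0  # i would stop being terminal
        if (loses_initial and num_initial - 1 < x) or \
           (loses_terminal and num_terminal - 1 < y):
            continue  # adding (i, j) would violate the side constraint
        edges.append((i, j))
        indeg[j] += 1
        outdeg[i] += 1
        num_initial -= loses_initial
        num_terminal -= loses_terminal
    return edges
```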
A random algebraic graph is defined by a group $G$, equipped with the uniform distribution, and a connection $\sigma:G\longrightarrow[0,1]$ with expectation $p$ satisfying $\sigma(g)=\sigma(g^{-1}).$ The random graph $\mathsf{RAG}(n,G,p,\sigma)$ with vertex set $[n]$ is formed as follows. First, $n$ independent group elements $x_1,\ldots,x_n$ are sampled uniformly from $G.$ Then, vertices $i,j$ are connected with probability $\sigma(x_ix_j^{-1}).$ This model captures random geometric graphs over the sphere and the hypercube, certain regimes of the stochastic block model, and random subgraphs of Cayley graphs. The main question of interest to the current paper is: when is a random algebraic graph statistically and/or computationally distinguishable from $\mathsf{G}(n,p)$? Our results fall into two categories. 1) Geometric. We focus on the case $G =\{\pm1\}^d$ and use Fourier-analytic tools. For hard threshold connections, we match [LMSY22b] for $p = \omega(1/n)$ and for $1/(r\sqrt{d})$-Lipschitz connections we extend the results of [LR21b] when $d = \Omega(n\log n)$ to the non-monotone setting. We study other connections such as indicators of interval unions and low-degree polynomials. 2) Algebraic. We provide evidence for an exponential statistical-computational gap. Consider any finite group $G$ and let $A\subseteq G$ be a set of elements formed by including each set of the form $\{g, g^{-1}\}$ independently with probability $1/2.$ Let $\Gamma_n(G,A)$ be the distribution of random graphs formed by taking a uniformly random induced subgraph of size $n$ of the Cayley graph $\Gamma(G,A).$ Then, $\Gamma_n(G,A)$ and $\mathsf{G}(n,1/2)$ are statistically indistinguishable with high probability over $A$ if and only if $\log|G|\gtrsim n.$ However, low-degree polynomial tests fail to distinguish $\Gamma_n(G,A)$ and $\mathsf{G}(n,1/2)$ with high probability over $A$ when $\log |G|=\log^{\Omega(1)}n.$
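For intuition, here is a small Python sketch (names ours) of the sampling procedure for $\mathsf{RAG}(n,G,p,\sigma)$ in the geometric case $G = \{\pm1\}^d$ with a hard threshold connection; on the hypercube every element is its own inverse, so $x_ix_j^{-1}$ is the coordinatewise product and $\sigma(g)=\sigma(g^{-1})$ holds automatically.

```python
import numpy as np

def sample_rag_hypercube(n, d, tau, rng=None):
    """Sketch of RAG(n, G, p, sigma) for G = {+-1}^d with the hard
    threshold connection sigma(g) = 1[sum_k g_k >= tau].  The sum of
    coordinates of x_i * x_j^{-1} equals the inner product <x_i, x_j>."""
    rng = np.random.default_rng(rng)
    x = rng.choice([-1, 1], size=(n, d))  # n uniform latent vectors in {+-1}^d
    gram = x @ x.T                        # pairwise inner products
    adj = (gram >= tau).astype(int)
    np.fill_diagonal(adj, 0)              # no self-loops
    return adj
```

Taking `tau = 0` gives expectation roughly $p = 1/2$ (up to a parity effect for even $d$), while larger thresholds of order $\sqrt{d}$ give sparser graphs.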
Phase estimation, due to Kitaev [arXiv'95], is one of the most fundamental subroutines in quantum computing. In the basic scenario, one is given black-box access to a unitary $U$, and an eigenstate $\lvert \psi \rangle$ of $U$ with unknown eigenvalue $e^{i\theta}$, and the task is to estimate the eigenphase $\theta$ within $\pm\delta$, with high probability. We measure the cost of an algorithm by the number of applications of $U$ and $U^{-1}$. We tightly characterize the cost of several variants of phase estimation where we are no longer given an arbitrary eigenstate, but are required to estimate the maximum eigenphase of $U$, aided by advice in the form of states (or a unitary preparing those states) which are promised to have at least a certain overlap $\gamma$ with the top eigenspace. We give algorithms and matching lower bounds (up to logarithmic factors) for all ranges of parameters. We show that a small number of copies of the advice state (or of an advice-preparing unitary) are not significantly better than having no advice at all. We also show that having lots of advice (applications of the advice-preparing unitary) does not significantly reduce cost, and neither does knowledge of the eigenbasis of $U$. As an immediate consequence we obtain a lower bound on the complexity of the Unitary recurrence time problem, matching an upper bound of She and Yuen~[ITCS'23] and resolving one of their open questions. Lastly, we show that a phase-estimation algorithm with precision $\delta$ and error probability $\epsilon$ has cost $\Omega\left(\frac{1}{\delta}\log\frac{1}{\epsilon}\right)$, matching an easy upper bound. This contrasts with some other scenarios in quantum computing (e.g., search) where error-reduction costs only a factor $O(\sqrt{\log(1/\epsilon)})$. Our lower bound technique uses a variant of the polynomial method with trigonometric polynomials.
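To make the basic scenario concrete, the following Python sketch (a classical simulation; all names ours) computes the exact outcome distribution of textbook phase estimation applied to an exact eigenstate, then repeats the measurement to sharpen the estimate. A circular mean stands in here for the median used in the standard error-reduction argument.

```python
import numpy as np

def qpe_outcome_distribution(theta, t):
    """Exact distribution of the t-qubit control register in textbook
    phase estimation on an eigenstate with eigenvalue e^{i theta}.
    This uses 2^t - 1 applications of U (the controlled powers) and
    localizes theta to within ~2*pi/2^t with constant probability."""
    T = 2 ** t
    k = np.arange(T)
    amps = np.array([np.exp(1j * k * (theta - 2 * np.pi * m / T)).sum()
                     for m in range(T)]) / T
    p = np.abs(amps) ** 2
    return p / p.sum()  # renormalize away floating-point error

def estimate_phase(theta, t, shots, rng=None):
    """Repeats the basic routine `shots` times; O(log(1/eps)) repetitions
    drive the error probability down to eps, as in the easy upper bound."""
    rng = np.random.default_rng(rng)
    ms = rng.choice(2 ** t, size=shots, p=qpe_outcome_distribution(theta, t))
    angles = 2 * np.pi * ms / 2 ** t
    return float(np.angle(np.mean(np.exp(1j * angles))) % (2 * np.pi))
```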
A class of stochastic Besov spaces $B^p L^2(\Omega;\dot H^\alpha(\mathcal{O}))$, $1\le p\le\infty$ and $\alpha\in[-2,2]$, is introduced to characterize the regularity of the noise in the semilinear stochastic heat equation \begin{equation*} {\rm d} u -\Delta u {\rm d} t =f(u) {\rm d} t + {\rm d} W(t) , \end{equation*} under the following conditions for some $\alpha\in(0,1]$: $$ \Big\| \int_0^te^{-(t-s)A}{\rm d} W(s) \Big\|_{L^2(\Omega;L^2(\mathcal{O}))} \le C t^{\frac{\alpha}{2}} \quad\mbox{and}\quad \Big\| \int_0^te^{-(t-s)A}{\rm d} W(s) \Big\|_{B^\infty L^2(\Omega;\dot H^\alpha(\mathcal{O}))}\le C. $$ The conditions above are shown to be satisfied by both trace-class noises (with $\alpha=1$) and one-dimensional space-time white noises (with $\alpha=\frac12$). The latter would fail to satisfy the conditions with $\alpha=\frac12$ if the stochastic Besov norm $\|\cdot\|_{B^\infty L^2(\Omega;\dot H^\alpha(\mathcal{O}))}$ is replaced by the classical Sobolev norm $\|\cdot\|_{L^2(\Omega;\dot H^\alpha(\mathcal{O}))}$, and this often causes a reduction of the convergence order in the numerical analysis of the semilinear stochastic heat equation. In this article, a modified exponential Euler method, combined with a spectral method for spatial discretization, is proved to converge with order $\alpha$ in both time and space for possibly nonsmooth initial data in $L^4(\Omega;\dot{H}^{\beta}(\mathcal{O}))$ with $\beta>-1$, by utilizing the real interpolation properties of the stochastic Besov spaces and a class of locally refined stepsizes to resolve the singularity of the solution at $t=0$.
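For orientation, a minimal Python sketch of an exponential Euler/spectral Galerkin scheme for the one-dimensional equation with space-time white noise follows. It uses uniform stepsizes and a generic variant of the scheme rather than the modified method and locally refined stepsizes analyzed in the article, and all names are ours.

```python
import numpy as np

def exp_euler_spectral_heat(f, u0_hat, N, T, steps, rng=None):
    """Sketch of exponential Euler with N-mode spectral Galerkin for
    du = (u_xx + f(u)) dt + dW on (0, pi), Dirichlet boundary conditions,
    space-time white noise.  u0_hat: initial sine coefficients (length N);
    f: pointwise (vectorized) nonlinearity."""
    rng = np.random.default_rng(rng)
    k = np.arange(1, N + 1)
    lam = k.astype(float) ** 2                        # eigenvalues of A
    xs = np.linspace(0, np.pi, 2 * N + 1)[1:-1]       # interior grid points
    E = np.sqrt(2 / np.pi) * np.sin(np.outer(xs, k))  # sine basis on grid
    quad_w = np.pi / (2 * N)                          # quadrature weight
    tau = T / steps
    decay = np.exp(-lam * tau)
    # per mode, the stochastic convolution over one step is Gaussian with
    # variance (1 - e^{-2 lam tau}) / (2 lam); sample it exactly
    noise_sd = np.sqrt((1 - np.exp(-2 * lam * tau)) / (2 * lam))
    u_hat = np.asarray(u0_hat, dtype=float).copy()
    for _ in range(steps):
        u_phys = E @ u_hat                       # back to physical space
        f_hat = quad_w * (E.T @ f(u_phys))       # Galerkin projection of f(u)
        u_hat = decay * u_hat + (1 - decay) / lam * f_hat \
                + noise_sd * rng.standard_normal(N)
    return u_hat
```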
We study multi-item profit maximization when there is an underlying distribution over buyers' values. In practice, a full description of the distribution is typically unavailable, so we study the setting where the mechanism designer only has samples from the distribution. If the designer uses the samples to optimize over a complex mechanism class -- such as the set of all multi-item, multi-buyer mechanisms -- a mechanism may have high average profit over the samples but low expected profit. This raises the central question of this paper: how many samples are sufficient to ensure that a mechanism's average profit is close to its expected profit? To answer this question, we uncover structure shared by many pricing, auction, and lottery mechanisms: for any set of buyers' values, profit is piecewise linear in the mechanism's parameters. Using this structure, we prove new bounds for mechanism classes not yet studied in the sample-based mechanism design literature and match or improve over the best-known guarantees for many classes.
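A toy single-buyer, single-item instance of this structure (names ours): for a fixed value $v$, the profit of a posted price $p$ is $p \cdot \mathbb{1}[v \ge p]$, which is piecewise linear in $p$, so the average profit over samples is piecewise linear with one breakpoint per sample and is maximized at a sampled value.

```python
import numpy as np

def average_profit_posted_price(price, value_samples):
    """For a fixed buyer value v, profit p * 1[v >= p] is piecewise linear
    in p (linear below v, zero above), so the sample average is piecewise
    linear with one breakpoint per sample -- a toy case of the structure
    the paper exploits for general mechanism classes."""
    v = np.asarray(value_samples, dtype=float)
    return float(price * np.mean(v >= price))

def empirically_optimal_price(value_samples):
    """Average profit is linear between consecutive breakpoints, so its
    maximum is attained at one of the sampled values."""
    return max(set(value_samples),
               key=lambda p: average_profit_posted_price(p, value_samples))
```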
Gun violence is a major problem in contemporary American society, with tens of thousands injured each year. However, relatively little is known about its effects on family members and how those effects vary across subpopulations. To study these questions and, more generally, to address a gap in the causal inference literature, we present a framework for the study of effect modification or heterogeneous treatment effects in difference-in-differences designs. We implement a new matching technique, which combines profile matching and risk set matching, to (i) preserve the time alignment of covariates, exposure, and outcomes, avoiding pitfalls of other common approaches for difference-in-differences, and (ii) explicitly control biases due to imbalances in observed covariates in subgroups discovered from the data. Our case study shows significant and persistent effects of nonfatal firearm injuries on several health outcomes for those injured and on the mental health of their family members. Sensitivity analyses reveal that these results are moderately robust to unmeasured confounding bias. Finally, while the effects for those injured are modified largely by the severity of the injury and its documented intent, for families, effects are strongest for those whose relative's injury is documented as resulting from an assault, self-harm, or law enforcement intervention.
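For readers unfamiliar with the design, a toy two-group, two-period difference-in-differences contrast is sketched below (in Python, with hypothetical names); the paper's contribution layers profile matching and risk set matching on top of this basic estimate rather than replacing it.

```python
import numpy as np

def did_estimate(y_treat_pre, y_treat_post, y_ctrl_pre, y_ctrl_post):
    """Toy two-group, two-period difference-in-differences estimate:
    (post - pre change among the exposed) minus (post - pre change
    among matched comparisons)."""
    return (np.mean(y_treat_post) - np.mean(y_treat_pre)) \
         - (np.mean(y_ctrl_post) - np.mean(y_ctrl_pre))
```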
We give a simple characterization of which functions can be computed deterministically by anonymous processes in dynamic networks, depending on the number of leaders in the network. In addition, we provide efficient distributed algorithms for computing all such functions assuming minimal or no knowledge about the network. Each of our algorithms comes in two versions: one that terminates with the correct output and a faster one that stabilizes on the correct output without explicit termination. Notably, these are the first deterministic algorithms whose running times scale linearly with both the number of processes and a parameter of the network which we call "dynamic disconnectivity" (meaning that our dynamic networks do not necessarily have to be connected at all times). We also provide matching lower bounds, showing that all our algorithms are asymptotically optimal for any fixed number of leaders. While most of the existing literature on anonymous dynamic networks relies on classical mass-distribution techniques, our work makes use of a recently introduced combinatorial structure called "history tree", also developing its theory in new directions. Among other contributions, our results make definitive progress on two popular fundamental problems for anonymous dynamic networks: leaderless Average Consensus (i.e., computing the mean value of input numbers distributed among the processes) and multi-leader Counting (i.e., determining the exact number of processes in the network). In fact, our approach unifies and improves upon several independent lines of research on anonymous networks, including Nedic et al., IEEE Trans. Automat. Contr. 2009; Olshevsky, SIAM J. Control Optim. 2017; Kowalski-Mosteiro, ICALP 2019, SPAA 2021; Di Luna-Viglietta, FOCS 2022.
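As a point of contrast, here is a minimal Python sketch (names ours) of the classical mass-distribution approach to leaderless Average Consensus that the history-tree techniques improve upon: in each round, every process averages with its current neighbors using Metropolis weights, which preserves the global mean on any sequence of undirected communication graphs.

```python
import numpy as np

def metropolis_averaging(inputs, dynamic_graphs):
    """Classical mass-distribution sketch of leaderless Average Consensus.
    dynamic_graphs: one (simple) edge set per round.  Metropolis weights
    give a symmetric, doubly stochastic update, so the mean is invariant."""
    x = np.asarray(inputs, dtype=float)
    n = len(x)
    for edges in dynamic_graphs:
        deg = [0] * n
        for i, j in edges:
            deg[i] += 1
            deg[j] += 1
        new_x = x.copy()
        for i, j in edges:
            w = 1.0 / (1 + max(deg[i], deg[j]))  # Metropolis weight
            new_x[i] += w * (x[j] - x[i])
            new_x[j] += w * (x[i] - x[j])
        x = new_x
    return x
```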
This paper investigates multi-antenna covert communications assisted by a simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS). In particular, to conceal the existence of covert communications between a multi-antenna transmitter and a single-antenna receiver from a warden, a friendly full-duplex receiver with two antennas is leveraged: one antenna receives the transmitted signals, while the other transmits jamming signals with varying power to confuse the warden. Considering the worst case, a closed-form expression for the minimum detection error probability (DEP) at the warden is derived and used in a covertness constraint to guarantee system performance. We then formulate an optimization problem that maximizes the covert rate of the system under the covertness constraint and a quality of service (QoS) constraint derived from a communication outage analysis. To jointly design the active and passive beamforming of the transmitter and the STAR-RIS, an iterative algorithm based on the semidefinite relaxation (SDR) method and Dinkelbach's algorithm is proposed to effectively solve the non-convex optimization problem. Simulation results show that the proposed STAR-RIS-assisted scheme significantly outperforms a conventional RIS, which validates the effectiveness of the proposed algorithm as well as the superiority of STAR-RIS in guaranteeing the covertness of wireless communications.
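The outer loop of such an algorithm can be pictured with a generic Dinkelbach iteration for a fractional objective $\max_x f(x)/g(x)$ with $g>0$; in the paper the inner parametric subproblem would be handled by SDR-based beamforming, but the Python sketch below (names ours) leaves it abstract.

```python
def dinkelbach(argmax_parametric, f, g, x0, tol=1e-8, max_iter=100):
    """Generic Dinkelbach iteration for max_x f(x)/g(x) with g > 0.
    argmax_parametric(lmbda) must return an argmax of the parametric
    objective f(x) - lmbda * g(x); at the optimal ratio lmbda*, the
    parametric optimum F(lmbda*) equals 0."""
    x = x0
    for _ in range(max_iter):
        lmbda = f(x) / g(x)              # current ratio
        x = argmax_parametric(lmbda)     # inner subproblem (e.g., SDR)
        if f(x) - lmbda * g(x) < tol:    # F(lmbda) ~ 0 => lmbda is optimal
            break
    return x, f(x) / g(x)
```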
Motivated by an application from geodesy, we introduce a novel clustering problem which is a $k$-center (or $k$-diameter) problem with a side constraint. For the side constraint, we are given an undirected connectivity graph $G$ on the input points, and a clustering is now only feasible if every cluster induces a connected subgraph in $G$. We call the resulting problems the connected $k$-center problem and the connected $k$-diameter problem. We prove several results on the complexity and approximability of these problems. Our main result is an $O(\log^2{k})$-approximation algorithm for the connected $k$-center and the connected $k$-diameter problem. For Euclidean metrics and metrics with constant doubling dimension, the approximation factor of this algorithm improves to $O(1)$. We also consider the special cases in which the connectivity graph is a line or a tree. For the line we give optimal polynomial-time algorithms, and for the tree we give either an optimal polynomial-time algorithm or a $2$-approximation algorithm for all variants of our model. We complement our upper bounds with several lower bounds.
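A small Python sketch (names ours) of the side constraint: a clustering is feasible for the connected $k$-center or connected $k$-diameter problem exactly when every cluster induces a connected subgraph of the connectivity graph $G$, which the check below verifies by a BFS inside each cluster.

```python
from collections import defaultdict, deque

def clustering_is_feasible(labels, graph_edges):
    """Checks the connectivity side constraint: labels[v] is the cluster
    of point v; graph_edges are the edges of the connectivity graph G.
    Feasible iff each cluster induces a connected subgraph of G."""
    clusters = defaultdict(set)
    for v, c in enumerate(labels):
        clusters[c].add(v)
    adj = defaultdict(list)
    for u, v in graph_edges:
        if labels[u] == labels[v]:       # only intra-cluster edges matter
            adj[u].append(v)
            adj[v].append(u)
    for members in clusters.values():
        start = next(iter(members))
        seen, queue = {start}, deque([start])
        while queue:                      # BFS within the induced subgraph
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        if seen != members:
            return False
    return True
```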