A long-standing open question in the algorithms and complexity literature is whether there exist sorting circuits of size $o(n \log n)$. A recent work by Asharov, Lin, and Shi (SODA'21) showed that if the elements to be sorted have short keys whose length $k = o(\log n)$, then one can indeed overcome the $n\log n$ barrier for sorting circuits, by leveraging non-comparison-based techniques. More specifically, Asharov et al.~showed that there exist $O(n) \cdot \min(k, \log n)$-sized sorting circuits for $k$-bit keys, ignoring $\mathrm{poly}\log^*$ factors. Interestingly, the recent works by Farhadi et al. (STOC'19) and Asharov et al. (SODA'21) also showed that the above result is essentially optimal for every key length $k$, assuming that the famous Li-Li network coding conjecture holds. Note also that proving any {\it unconditional} super-linear circuit lower bound for a wide class of problems is beyond the reach of current techniques. Unfortunately, the approach taken by Asharov et al.~to achieve optimality in size somewhat crucially relies on sacrificing the depth: specifically, their circuit is super-{\it poly}logarithmic in depth even for 1-bit keys. Asharov et al.~phrase it as an open question how to achieve optimality both in size and depth. In this paper, we close this important gap in our understanding. We construct a sorting circuit of size $O(n) \cdot \min(k, \log n)$ (ignoring $\mathrm{poly}\log^*$ terms) and depth $O(\log n)$. To achieve this, our approach departs significantly from the prior works. Our result can be viewed as a generalization of the landmark result by Ajtai, Koml\'os, and Szemer\'edi (STOC'83), simultaneously in terms of size and depth. Specifically, for $k = o(\log n)$, we achieve asymptotic improvements in size over the AKS sorting circuit, while preserving optimality in depth.
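As a software analogy only, and not the circuit construction described above, the idea of sorting short keys without comparisons can be sketched with a stable bucket (counting) sort in the RAM model, which does $O(n + 2^k)$ work for $k$-bit keys. The function name `counting_sort_by_key` and the `(key, payload)` representation are assumptions made for this illustration.

```python
def counting_sort_by_key(items, k):
    """Stably sort (key, payload) pairs whose keys are k-bit integers.

    Illustrative RAM-model sketch: no key is ever compared with another key;
    each element is routed to a bucket indexed directly by its key.
    """
    num_buckets = 1 << k  # 2^k possible key values
    buckets = [[] for _ in range(num_buckets)]
    for key, payload in items:
        if not 0 <= key < num_buckets:
            raise ValueError(f"key {key} does not fit in {k} bits")
        buckets[key].append((key, payload))
    # Concatenating the buckets in key order yields a stable sorted sequence.
    return [pair for bucket in buckets for pair in bucket]


if __name__ == "__main__":
    data = [(1, "a"), (0, "b"), (1, "c"), (0, "d")]
    print(counting_sort_by_key(data, k=1))
```

For 1-bit keys this reduces to moving all 0-keyed elements in front of all 1-keyed elements while preserving their relative order.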
In the $(1+\varepsilon,r)$-approximate near-neighbor problem for curves (ANNC) under some distance measure $\delta$, the goal is to construct a data structure for a given set $\mathcal{C}$ of curves that supports approximate near-neighbor queries: Given a query curve $Q$, if there exists a curve $C\in\mathcal{C}$ such that $\delta(Q,C)\le r$, then return a curve $C'\in\mathcal{C}$ with $\delta(Q,C')\le(1+\varepsilon)r$. There exists an efficient reduction from the $(1+\varepsilon)$-approximate nearest-neighbor problem to ANNC, where in the former problem the answer to a query is a curve $C\in\mathcal{C}$ with $\delta(Q,C)\le(1+\varepsilon)\cdot\delta(Q,C^*)$, where $C^*$ is the curve of $\mathcal{C}$ closest to $Q$. Given a set $\mathcal{C}$ of $n$ curves, each consisting of $m$ points in $d$ dimensions, we construct a data structure for ANNC that uses $n\cdot O(\frac{1}{\varepsilon})^{md}$ storage space and has $O(md)$ query time (for a query curve of length $m$), where the similarity between two curves is their discrete Fr\'echet or dynamic time warping distance. Our method is simple to implement, deterministic, and results in an exponential improvement in both query time and storage space compared to all previous bounds. Further, we also consider the asymmetric version of ANNC, where the length of the query curves is $k \ll m$, and obtain essentially the same storage and query bounds as above, except that $m$ is replaced by $k$. Finally, we apply our method to a version of approximate range counting for curves and achieve similar bounds.
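To make the query semantics concrete, the following minimal sketch computes the discrete Fr\'echet distance with the standard dynamic program and answers a near-neighbor query by a linear scan. It is a brute-force baseline for illustration, not the data structure of this work; the names `discrete_frechet` and `near_neighbor_scan` are assumptions, and curves are taken to be non-empty lists of points given as coordinate tuples.

```python
import math


def discrete_frechet(P, Q):
    """Discrete Frechet distance between two non-empty point sequences."""
    n, m = len(P), len(Q)
    if n == 0 or m == 0:
        raise ValueError("curves must be non-empty")
    # dp[i][j] = discrete Frechet distance between P[:i+1] and Q[:j+1]
    dp = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(P[i], Q[j])
            if i == 0 and j == 0:
                dp[i][j] = d
            elif i == 0:
                dp[i][j] = max(dp[0][j - 1], d)
            elif j == 0:
                dp[i][j] = max(dp[i - 1][0], d)
            else:
                best_prev = min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1])
                dp[i][j] = max(best_prev, d)
    return dp[n - 1][m - 1]


def near_neighbor_scan(curves, Q, r):
    """Return some curve within distance r of Q, or None (exact linear scan)."""
    for C in curves:
        if discrete_frechet(Q, C) <= r:
            return C
    return None


if __name__ == "__main__":
    P = [(0, 0), (1, 0), (2, 0)]
    Q = [(0, 1), (1, 1), (2, 1)]
    print(discrete_frechet(P, Q))
    print(near_neighbor_scan([P], Q, r=1.0))
```

The scan costs $O(nm^2)$ distance evaluations per query for $n$ curves of length $m$, which is the kind of cost a dedicated data structure aims to avoid.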
We study the parameterized complexity of various classic vertex-deletion problems such as Odd cycle transversal, Vertex planarization, and Chordal vertex deletion under hybrid parameterizations. Existing FPT algorithms for these problems either focus on the parameterization by solution size, detecting solutions of size $k$ in time $f(k) \cdot n^{O(1)}$, or width parameterizations, finding arbitrarily large optimal solutions in time $f(w) \cdot n^{O(1)}$ for some width measure $w$ like treewidth. We unify these lines of research by presenting FPT algorithms for parameterizations that can simultaneously be arbitrarily much smaller than the solution size and the treewidth. We consider two classes of parameterizations which are relaxations of either treedepth or treewidth. They are related to graph decompositions in which subgraphs that belong to a target class H (e.g., bipartite or planar) are considered simple. First, we present a framework for computing approximately optimal decompositions for miscellaneous classes H. Namely, if the cost of an optimal decomposition is $k$, we show how to find a decomposition of cost $k^{O(1)}$ in time $f(k) \cdot n^{O(1)}$. This is applicable to any graph class H for which the corresponding vertex-deletion problem admits a constant-factor approximation algorithm or an FPT algorithm parameterized by the solution size. Secondly, we exploit the constructed decompositions for solving vertex-deletion problems by extending ideas from algorithms using iterative compression and the finite state property. For the three mentioned vertex-deletion problems, and all problems which can be formulated as hitting a finite set of connected forbidden (a) minors or (b) (induced) subgraphs, we obtain FPT algorithms with respect to both studied parameterizations.
In this paper we apply methods originating in complexity theory to some problems of approximation. We notice that the construction of Alman and Williams that disproves the rigidity of Walsh-Hadamard matrices provides good $\ell_p$-approximation for $p<2$. It follows that the first $n$ functions of the Walsh system can be approximated with an error $n^{-\delta}$ by a linear space of dimension $n^{1-\delta}$: $$ d_{n^{1-\delta}}(\{w_1,\ldots,w_n\}, L_p[0,1]) \le n^{-\delta},\quad p\in[1,2),\;\delta=\delta(p)>0. $$ We do not know if this is possible for the trigonometric system. We show that the algebraic method of Alon--Frankl--R\"odl for bounding the number of low-signum-rank matrices works for tensors: almost all signum-tensors have large signum-rank and can't be $\ell_1$-approximated by low-rank tensors. This implies lower bounds for $\Theta_m$, the error of $m$-term approximation of multivariate functions by sums of tensor products $u^1(x_1)\cdots u^d(x_d)$. In particular, for the set of trigonometric polynomials with spectrum in $\prod_{j=1}^d[-n_j,n_j]$ and of norm $\|t\|_\infty\le 1$ we have $$ \Theta_m(\mathcal T(n_1,\ldots,n_d)_\infty,L_1[-\pi,\pi]^d) \ge c_1(d)>0,\quad m\le c_2(d)\frac{\prod n_j}{\max\{n_j\}}. $$ Sharp bounds follow for classes of dominated mixed smoothness: $$ \Theta_m(W^{(r,r,\ldots,r)}_p,L_q[0,1]^d)\asymp m^{-\frac{rd}{d-1}},\quad 2\le p\le\infty,\; 1\le q\le 2. $$
The possibilities offered by quantum computing have drawn attention in the distributed computing community recently, with several breakthrough results showing quantum distributed algorithms that run faster than the fastest known classical counterparts, and even separations between the two models. A prime example is the result by Izumi, Le Gall, and Magniez [STACS 2020], who showed that triangle detection by quantum distributed algorithms is easier than triangle listing, while an analogous result is not known in the classical case. In this paper we present a framework for fast quantum distributed clique detection. This improves upon the state-of-the-art for the triangle case, and is also more general, applying to larger clique sizes. Our main technical contribution is a new approach for detecting cliques by encapsulating this as a search task for nodes that can be added to smaller cliques. To extract the best complexities out of our approach, we develop a framework for nested distributed quantum searches, which employ checking procedures that are quantum themselves. Moreover, we show a circuit-complexity barrier on proving a lower bound of the form $\Omega(n^{3/5+\epsilon})$ for $K_p$-detection for any $p \geq 4$, even in the classical (non-quantum) distributed CONGEST setting.
We propose protocols for obliviously evaluating finite-state machines, i.e., the evaluation is shared between the provider of the finite-state machine and the provider of the input string in such a manner that neither party learns the other's input, and the states being visited are hidden from both. For alphabet size $|\Sigma|$, number of states $|Q|$, and input length $n$, previous solutions have either required a number of rounds linear in $n$ or communication $\Omega(n|\Sigma||Q|\log|Q|)$. Our solutions require 2 rounds with communication $O(n(|\Sigma|+|Q|\log|Q|))$. We present two different solutions to this problem: a two-party one, and one in a setting with an untrusted but non-colluding helper.
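For orientation, the functionality that such a protocol computes is ordinary finite-state-machine evaluation. The sketch below shows only this plain, non-private evaluation; it does not implement any oblivious protocol, and the name `evaluate_fsm` and the nested-dictionary transition table are assumptions chosen for illustration.

```python
def evaluate_fsm(delta, start, accepting, word):
    """Run a deterministic finite-state machine on an input string.

    delta[state][symbol] gives the next state. In the oblivious setting,
    delta would be one party's private input and word the other's; here
    everything is in the clear.
    """
    state = start
    for symbol in word:
        state = delta[state][symbol]  # one table lookup per input symbol
    return state in accepting


if __name__ == "__main__":
    # Two-state machine accepting strings with an even number of '1' symbols.
    parity = {0: {"0": 0, "1": 1}, 1: {"0": 1, "1": 0}}
    print(evaluate_fsm(parity, start=0, accepting={0}, word="1100"))
    print(evaluate_fsm(parity, start=0, accepting={0}, word="1101"))
```

The transition table has $|Q|\cdot|\Sigma|$ entries, each naming one of $|Q|$ states, so writing it down takes about $|\Sigma||Q|\log|Q|$ bits.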
We present new scalar and matrix Chernoff-style concentration bounds for a broad class of probability distributions over the binary hypercube $\{0,1\}^n$. Motivated by recent tools developed for the study of mixing times of Markov chains on discrete distributions, we say that a distribution is $\ell_\infty$-independent when the infinity norm of its influence matrix $\mathcal{I}$ is bounded by a constant. We show that any distribution which is $\ell_\infty$-independent satisfies a matrix Chernoff bound that matches the matrix Chernoff bound for independent random variables due to Tropp. Our matrix Chernoff bound is a broad generalization and strengthening of the matrix Chernoff bound of Kyng and Song (FOCS'18). Using our bound, we can conclude as a corollary that a union of $O(\log|V|)$ random spanning trees gives a spectral graph sparsifier of a graph with $|V|$ vertices with high probability, matching results for independent edge sampling, and matching lower bounds from Kyng and Song.
The aim of this thesis is to develop a theoretical framework to study parameter estimation of quantum channels. We study the task of estimating unknown parameters encoded in a channel in the sequential setting. A sequential strategy is the most general way to use a channel multiple times. Our goal is to establish lower bounds (called Cram\'er-Rao bounds) on the estimation error. The bounds we develop are universally applicable; i.e., they apply to all permissible quantum dynamics. We consider the use of catalysts to enhance the power of a channel estimation strategy. This is termed amortization. The power of a channel for parameter estimation is determined by its Fisher information. Thus, we study how much a catalyst quantum state can enhance the Fisher information of a channel by defining the amortized Fisher information. We establish our bounds by proving that for certain Fisher information quantities, catalyst states do not improve the performance of a sequential estimation protocol compared to a parallel one. The technical term for this is an amortization collapse. We use this to establish bounds when estimating one parameter, or multiple parameters simultaneously. Our bounds apply universally and we also cast them as optimization problems. For the single parameter case, we establish bounds for general quantum channels using both the symmetric logarithmic derivative (SLD) Fisher information and the right logarithmic derivative (RLD) Fisher information. The task of estimating multiple parameters simultaneously is more involved than the single parameter case, because the Cram\'er-Rao bounds take the form of matrix inequalities. We establish a scalar Cram\'er-Rao bound for multiparameter channel estimation using the RLD Fisher information. For both single and multiparameter estimation, we provide a no-go condition for the so-called Heisenberg scaling using our RLD-based bound.
In this paper, we propose a class of accelerated zeroth-order and first-order momentum methods for both nonconvex mini-optimization and minimax-optimization. Specifically, we propose a new accelerated zeroth-order momentum (Acc-ZOM) method for black-box mini-optimization. Moreover, we prove that our Acc-ZOM method achieves a lower query complexity of $\tilde{O}(d^{3/4}\epsilon^{-3})$ for finding an $\epsilon$-stationary point, which improves the best known result by a factor of $O(d^{1/4})$, where $d$ denotes the variable dimension. In particular, Acc-ZOM does not require the large batches required by existing zeroth-order stochastic algorithms. Meanwhile, we propose an accelerated \textbf{zeroth-order} momentum descent ascent (Acc-ZOMDA) method for \textbf{black-box} minimax-optimization, which obtains a query complexity of $\tilde{O}((d_1+d_2)^{3/4}\kappa_y^{4.5}\epsilon^{-3})$ without large batches for finding an $\epsilon$-stationary point, where $d_1$ and $d_2$ denote variable dimensions and $\kappa_y$ is the condition number. Moreover, we propose an accelerated \textbf{first-order} momentum descent ascent (Acc-MDA) method for \textbf{white-box} minimax optimization, which has a gradient complexity of $\tilde{O}(\kappa_y^{4.5}\epsilon^{-3})$ without large batches for finding an $\epsilon$-stationary point. In particular, our Acc-MDA can obtain a lower gradient complexity of $\tilde{O}(\kappa_y^{2.5}\epsilon^{-3})$ with a batch size $O(\kappa_y^4)$. Extensive experimental results on black-box adversarial attacks on deep neural networks (DNNs) and on poisoning attacks demonstrate the efficiency of our algorithms.
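Zeroth-order (black-box) methods only have access to function values, so gradients must be estimated from queries. The sketch below shows a generic coordinate-wise central-difference estimator that uses $2d$ function queries per gradient; it is a standard building block shown for illustration and is not the Acc-ZOM or Acc-ZOMDA method. The name `zo_gradient` and the smoothing parameter `mu` are assumptions of this example.

```python
def zo_gradient(f, x, mu=1e-5):
    """Estimate the gradient of f at x using only function evaluations.

    Uses central differences along each coordinate: 2 * len(x) queries.
    mu is the finite-difference step (an illustrative default).
    """
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += mu
        x_minus[i] -= mu
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * mu))
    return grad


if __name__ == "__main__":
    # Black-box objective; its true gradient at (1, 2) is (2, 3).
    def objective(v):
        return v[0] ** 2 + 3.0 * v[1]

    print(zo_gradient(objective, [1.0, 2.0]))
```

The query cost grows linearly with the dimension $d$, which is why the dependence on $d$ appears in zeroth-order query complexities.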
The gradient noise of Stochastic Gradient Descent (SGD) is considered to play a key role in its properties (e.g. escaping low potential points and regularization). Past research has indicated that the covariance of the SGD error done via minibatching plays a critical role in determining its regularization and escape from low potential points. However, how much the distribution of the error influences the behavior of the algorithm remains largely unexplored. Motivated by some new research in this area, we prove universality results by showing that noise classes that have the same mean and covariance structure as SGD via minibatching have similar properties. We mainly consider the Multiplicative Stochastic Gradient Descent (M-SGD) algorithm as introduced by Wu et al., which has a much more general noise class than the SGD algorithm done via minibatching. We establish nonasymptotic bounds for the M-SGD algorithm mainly with respect to the Stochastic Differential Equation corresponding to SGD via minibatching. We also show that the M-SGD error is approximately a scaled Gaussian distribution with mean $0$ at any fixed point of the M-SGD algorithm. We also establish bounds for the convergence of the M-SGD algorithm in the strongly convex regime.
We show that for the problem of testing if a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model for which query access comes in the form of $\langle X_i, A\rangle:=tr(X_i^\top A)$; perhaps surprisingly these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.
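The sensing-model query $\langle X_i, A\rangle = tr(X_i^\top A)$ equals the entrywise sum $\sum_{j,k} (X_i)_{jk} A_{jk}$, so reading a single entry of $A$ is the special case in which $X_i$ has one nonzero entry. A minimal sketch, assuming matrices are given as equally sized lists of rows; the name `trace_inner_product` is hypothetical:

```python
def trace_inner_product(X, A):
    """Compute <X, A> = tr(X^T A) for two matrices of identical shape.

    The j-th diagonal entry of X^T A is sum_i X[i][j] * A[i][j], so the
    trace is the sum of all entrywise products.
    """
    if len(X) != len(A) or any(len(rx) != len(ra) for rx, ra in zip(X, A)):
        raise ValueError("X and A must have the same shape")
    return sum(x * a for rx, ra in zip(X, A) for x, a in zip(rx, ra))


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    # A sensing matrix with a single 1 at position (0, 1) reads entry A[0][1].
    E01 = [[0, 1], [0, 0]]
    print(trace_inner_product(E01, A))
```

General sensing matrices can combine many entries in one query, so this access model is at least as powerful as entry queries.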