We present a new quantum algorithm for estimating the mean of a real-valued random variable obtained as the output of a quantum computation. Our estimator achieves a nearly-optimal quadratic speedup over the number of classical i.i.d. samples needed to estimate the mean of a heavy-tailed distribution with a sub-Gaussian error rate. This result subsumes (up to logarithmic factors) earlier works on the mean estimation problem that were not optimal for heavy-tailed distributions [BHMT02,BDGT11], or that require prior information on the variance [Hein02,Mon15,HM19]. As an application, we obtain new quantum algorithms for the $(\epsilon,\delta)$-approximation problem with an optimal dependence on the coefficient of variation of the input random variable.
We study quantum algorithms for several fundamental string problems, including Longest Common Substring, Lexicographically Minimal String Rotation, and Longest Square Substring. These problems have been widely studied in the stringology literature since the 1970s, and are known to be solvable by near-linear time classical algorithms. In this work, we give quantum algorithms for these problems with near-optimal query complexities and time complexities. Specifically, we show that: - Longest Common Substring can be solved by a quantum algorithm in $\tilde O(n^{2/3})$ time, improving upon the recent $\tilde O(n^{5/6})$-time algorithm by Le Gall and Seddighin (2020). Our algorithm uses the MNRS quantum walk framework, together with a careful combination of string synchronizing sets (Kempa and Kociumaka, 2019) and generalized difference covers. - Lexicographically Minimal String Rotation can be solved by a quantum algorithm in $n^{1/2 + o(1)}$ time, improving upon the recent $\tilde O(n^{3/4})$-time algorithm by Wang and Ying (2020). We design our algorithm by first giving a new classical divide-and-conquer algorithm in near-linear time based on exclusion rules, and then speeding it up quadratically using nested Grover search and quantum minimum finding. - Longest Square Substring can be solved by a quantum algorithm in $\tilde O(\sqrt{n})$ time. Our algorithm is an adaptation of the algorithm by Le Gall and Seddighin (2020) for the Longest Palindromic Substring problem, but uses additional techniques to overcome the difficulty that binary search no longer applies. Our techniques naturally extend to other related string problems, such as Longest Repeated Substring, Longest Lyndon Substring, and Minimal Suffix.
We study Tikhonov regularization for possibly nonlinear inverse problems with weighted $\ell^1$-penalization. The forward operator, mapping from a sequence space to an arbitrary Banach space, typically an $L^2$-space, is assumed to satisfy a two-sided Lipschitz condition with respect to a weighted $\ell^2$-norm and the norm of the image space. We show that in this setting approximation rates of arbitrarily high H\"older-type order in the regularization parameter can be achieved, and we characterize maximal subspaces of sequences on which these rates are attained. On these subspaces the method also converges with optimal rates in terms of the noise level with the discrepancy principle as parameter choice rule. Our analysis includes the case that the penalty term is not finite at the exact solution ('oversmoothing'). As a standard example we discuss wavelet regularization in Besov spaces $B^r_{1,1}$. In this setting we demonstrate in numerical simulations for a parameter identification problem in a differential equation that our theoretical results correctly predict improved rates of convergence for piecewise smooth unknown coefficients.
Bootstrap is a principled and powerful frequentist statistical tool for uncertainty quantification. Unfortunately, standard bootstrap methods are computationally intensive due to the need of drawing a large i.i.d. bootstrap sample to approximate the ideal bootstrap distribution; this largely hinders their application in large-scale machine learning, especially deep learning problems. In this work, we propose an efficient method to explicitly \emph{optimize} a small set of high quality "centroid" points to better approximate the ideal bootstrap distribution. We achieve this by minimizing a simple objective function that is asymptotically equivalent to the Wasserstein distance to the ideal bootstrap distribution. This allows us to provide an accurate estimation of uncertainty with a small number of bootstrap centroids, outperforming the naive i.i.d. sampling approach. Empirically, we show that our method can boost the performance of bootstrap in a variety of applications.
Given independent identically-distributed samples from a one-dimensional distribution, IAs are formed by partitioning samples into pairs, triplets, or nth-order groupings and retaining the median of those groupings that are approximately equal. A new statistical method, Independent Approximates (IAs), is defined and proven to enable closed-form estimation of the parameters of heavy-tailed distributions. The pdf of the IAs is proven to be the normalized nth-power of the original density. From this property, heavy-tailed distributions are proven to have well-defined means for their IA pairs, finite second moments for their IA triplets, and a finite, well-defined (n-1)th-moment for the nth-grouping. Estimation of the location, scale, and shape (inverse of degree of freedom) of the generalized Pareto and Student's t distributions are possible via a system of three equations. Performance analysis of the IA estimation methodology is conducted for the Student's t distribution using between 1000 to 100,000 samples. Closed-form estimates of the location and scale are determined from the mean of the IA pairs and the variance of the IA triplets, respectively. For the Student's t distribution, the geometric mean of the original samples provides a third equation to determine the shape, though its nonlinear solution requires an iterative solver. With 10,000 samples the relative bias of the parameter estimates is less than 0.01 and the relative precision is less than +/-0.1. The theoretical precision is finite for a limited range of the shape but can be extended by using higher-order groupings for a given moment.
Nonlinear differential equations model diverse phenomena but are notoriously difficult to solve. While there has been extensive previous work on efficient quantum algorithms for linear differential equations, the linearity of quantum mechanics has limited analogous progress for the nonlinear case. Despite this obstacle, we develop a quantum algorithm for dissipative quadratic $n$-dimensional ordinary differential equations. Assuming $R < 1$, where $R$ is a parameter characterizing the ratio of the nonlinearity and forcing to the linear dissipation, this algorithm has complexity $T^2 q~\mathrm{poly}(\log T, \log n, \log 1/\epsilon)/\epsilon$, where $T$ is the evolution time, $\epsilon$ is the allowed error, and $q$ measures decay of the solution. This is an exponential improvement over the best previous quantum algorithms, whose complexity is exponential in $T$. While exponential decay precludes efficiency, driven equations can avoid this issue despite the presence of dissipation. Our algorithm uses the method of Carleman linearization, for which we give a novel convergence theorem. This method maps a system of nonlinear differential equations to an infinite-dimensional system of linear differential equations, which we discretize, truncate, and solve using the forward Euler method and the quantum linear system algorithm. We also provide a lower bound on the worst-case complexity of quantum algorithms for general quadratic differential equations, showing that the problem is intractable for $R \ge \sqrt{2}$. Finally, we discuss potential applications, showing that the $R < 1$ condition can be satisfied in realistic epidemiological models and giving numerical evidence that the method may describe a model of fluid dynamics even for larger values of $R$.
Thanks to their ability to capture complex dependence structures, copulas are frequently used to glue random variables into a joint model with arbitrary marginal distributions. More recently, they have been applied to solve statistical learning problems such as regression or classification. Framing such approaches as solutions of estimating equations, we generalize them in a unified framework. We can then obtain simultaneous, coherent inferences across multiple regression-like problems. We derive consistency, asymptotic normality, and validity of the bootstrap for corresponding estimators. The conditions allow for both continuous and discrete data as well as parametric, nonparametric, and semiparametric estimators of the copula and marginal distributions. The versatility of this methodology is illustrated by several theoretical examples, a simulation study, and an application to financial portfolio allocation.
We address the problem of causal effect estimation in the presence of unobserved confounding, but where proxies for the latent confounder(s) are observed. We propose two kernel-based methods for nonlinear causal effect estimation in this setting: (a) a two-stage regression approach, and (b) a maximum moment restriction approach. We focus on the proximal causal learning setting, but our methods can be used to solve a wider class of inverse problems characterised by a Fredholm integral equation. In particular, we provide a unifying view of two-stage and moment restriction approaches for solving this problem in a nonlinear setting. We provide consistency guarantees for each algorithm, and we demonstrate these approaches achieve competitive results on synthetic data and data simulating a real-world task. In particular, our approach outperforms earlier methods that are not suited to leveraging proxy variables.
We study the relationship between the Quantum Approximate Optimization Algorithm (QAOA) and the underlying symmetries of the objective function to be optimized. Our approach formalizes the connection between quantum symmetry properties of the QAOA dynamics and the group of classical symmetries of the objective function. The connection is general and includes but is not limited to problems defined on graphs. We show a series of results exploring the connection and highlight examples of hard problem classes where a nontrivial symmetry subgroup can be obtained efficiently. In particular we show how classical objective function symmetries lead to invariant measurement outcome probabilities across states connected by such symmetries, independent of the choice of algorithm parameters or number of layers. To illustrate the power of the developed connection, we apply machine learning techniques towards predicting QAOA performance based on symmetry considerations. We provide numerical evidence that a small set of graph symmetry properties suffices to predict the minimum QAOA depth required to achieve a target approximation ratio on the MaxCut problem, in a practically important setting where QAOA parameter schedules are constrained to be linear and hence easier to optimize.
Conditional density estimation (CDE) models can be useful for many statistical applications, especially because the full conditional density is estimated instead of traditional regression point estimates, revealing more information about the uncertainty of the random variable of interest. In this paper, we propose a new methodology called Odds Conditional Density Estimator (OCDE) to estimate conditional densities in a supervised learning scheme. The main idea is that it is very difficult to estimate $p_{x,y}$ and $p_{x}$ in order to estimate the conditional density $p_{y|x}$, but by introducing an instrumental distribution, we transform the CDE problem into a problem of odds estimation, or similarly, training a binary probabilistic classifier. We demonstrate how OCDE works using simulated data and then test its performance against other known state-of-the-art CDE methods in real data. Overall, OCDE is competitive compared with these methods in real datasets.
From only positive (P) and unlabeled (U) data, a binary classifier could be trained with PU learning, in which the state of the art is unbiased PU learning. However, if its model is very flexible, empirical risks on training data will go negative, and we will suffer from serious overfitting. In this paper, we propose a non-negative risk estimator for PU learning: when getting minimized, it is more robust against overfitting, and thus we are able to use very flexible models (such as deep neural networks) given limited P data. Moreover, we analyze the bias, consistency, and mean-squared-error reduction of the proposed risk estimator, and bound the estimation error of the resulting empirical risk minimizer. Experiments demonstrate that our risk estimator fixes the overfitting problem of its unbiased counterparts.