The supremum of the standardized empirical process is a promising statistic for testing whether the distribution function $F$ of i.i.d. real random variables is equal to a given distribution function $F_0$ (hypothesis) or satisfies $F \ge F_0$ (one-sided alternative). Since \cite{r5} it has been well known that an affine-linear transformation of the suprema converges in distribution to the Gumbel law as the sample size tends to infinity. This enables the construction of an asymptotic level-$\alpha$ test. However, the rate of convergence is extremely slow, and as a consequence the probability of a type I error far exceeds $\alpha$ even for sample sizes beyond $10{,}000$. The standardization involves the weight function $1/\sqrt{F_0(x)(1-F_0(x))}$. Replacing this weight function by a suitable random constant leads to a new test statistic, for which we can derive the exact distribution (and the limit distribution) under the hypothesis. A comparison via Monte Carlo simulation shows that the new test is uniformly better than the Smirnov test and an appropriately modified test due to \cite{r20}. Our methodology also works for the two-sided alternative $F \neq F_0$.
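As a minimal illustration (not the authors' exact procedure, whose modification replaces the pointwise weight by a random constant), the following sketch computes the supremum of the classical standardized one-sided empirical process against a simple null $F_0$:

```python
import numpy as np
from scipy.stats import norm

def standardized_sup(x, F0=norm.cdf):
    """Sup of sqrt(n) * (F_n(x) - F0(x)) / sqrt(F0(x)(1 - F0(x))),
    evaluated at the sample points (where the one-sided sup is attained)."""
    x = np.sort(x)
    n = len(x)
    u = F0(x)                      # uniforms under the hypothesis
    Fn = np.arange(1, n + 1) / n   # empirical CDF at the order statistics
    w = np.sqrt(u * (1 - u))       # the standardizing weight function
    return np.max(np.sqrt(n) * (Fn - u) / w)

rng = np.random.default_rng(0)
print(standardized_sup(rng.normal(size=1000)))
```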
Privacy-protecting data analysis investigates statistical methods under privacy constraints. This is a rising challenge in modern statistics, as achieving confidentiality guarantees, which typically occurs through suitable perturbations of the data, may entail a loss in the statistical utility of the data. In this paper, we consider privacy-protecting tests for goodness-of-fit in frequency tables, arguably the most common form in which data are released, and present a rigorous analysis of the large-sample behaviour of a private likelihood-ratio (LR) test. Under the framework of $(\varepsilon,\delta)$-differential privacy for perturbed data, our main contribution is a power analysis of the private LR test, which characterizes the trade-off between confidentiality, measured via the differential privacy parameters $(\varepsilon,\delta)$, and statistical utility, measured via the power of the test. This is obtained through a Bahadur-Rao large deviation expansion for the power of the private LR test, which brings out a critical quantity, depending on the sample size, the dimension of the table and $(\varepsilon,\delta)$, that determines the loss in the power of the test. We then apply this result to characterize the impact of the sample size and the dimension of the table, in connection with the parameters $(\varepsilon,\delta)$, on the loss of power of the private LR test. In particular, we determine the (sample) cost of $(\varepsilon,\delta)$-differential privacy in the private LR test, namely the additional sample size required to recover the power of the multinomial LR test in the absence of perturbation. Our power analysis relies on a non-standard large deviation analysis for the LR, as well as the development of a novel (sharp) large deviation principle for sums of i.i.d. random vectors, which is of independent interest.
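To make the setup concrete, here is a minimal sketch of a private LR goodness-of-fit test under an assumed Gaussian perturbation of the counts (a standard mechanism achieving $(\varepsilon,\delta)$-differential privacy; the paper's exact mechanism, sensitivity convention and calibration may differ):

```python
import numpy as np
from scipy.stats import chi2

def private_lr_test(counts, p0, eps, delta, alpha=0.05, rng=None):
    """Illustrative (eps, delta)-DP goodness-of-fit LR test: perturb the
    frequency table with Gaussian noise, then apply the usual chi-squared
    calibration (which the paper's power analysis refines)."""
    rng = rng or np.random.default_rng()
    # Gaussian mechanism; L2 sensitivity sqrt(2) under record substitution.
    sigma = np.sqrt(2 * np.log(1.25 / delta)) * np.sqrt(2) / eps
    noisy = counts + rng.normal(scale=sigma, size=len(counts))
    noisy = np.clip(noisy, 1e-8, None)          # keep the logs well defined
    n = noisy.sum()
    lr = 2 * np.sum(noisy * np.log(noisy / (n * p0)))
    return lr, lr > chi2.ppf(1 - alpha, df=len(counts) - 1)

counts = np.array([120.0, 95.0, 110.0, 75.0])
p0 = np.ones(4) / 4
print(private_lr_test(counts, p0, eps=1.0, delta=1e-5))
```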
In this paper we study the properties of the Lasso estimator of the drift component in the diffusion setting. More specifically, we consider a multivariate parametric diffusion model $X$ observed continuously over the interval $[0,T]$ and investigate drift estimation under sparsity constraints. We allow the dimensions of the model and the parameter space to be large. We obtain an oracle inequality for the Lasso estimator and derive an error bound for the $L^2$-distance using concentration inequalities for linear functionals of diffusion processes. The probabilistic part is based upon elements of empirical process theory and, in particular, on the chaining method.
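A toy version of the setting (not the paper's continuous-observation estimator): a sparse linear drift $dX_t = AX_t\,dt + dW_t$ observed on a fine grid, with the Lasso applied to the Euler-discretized regression of increments on the state; the tuning constants below are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
d, dt, n = 10, 0.01, 50_000
A = -0.5 * np.eye(d)                     # mean reversion keeps X stable
A[0, 0], A[1, 3] = -1.5, 0.5             # a few extra nonzero drift entries
X = np.zeros((n + 1, d))
for i in range(n):
    X[i + 1] = X[i] + A @ X[i] * dt + np.sqrt(dt) * rng.normal(size=d)

Y = np.diff(X, axis=0) / dt              # discretized "dX_t / dt"
A_hat = np.vstack([
    Lasso(alpha=0.1, fit_intercept=False).fit(X[:-1], Y[:, k]).coef_
    for k in range(d)
])
print(np.round(A_hat[:2], 2))            # rows 0 and 1: sparse support recovered
```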
We study the rank of sub-matrices arising out of kernel functions $F(\pmb{x},\pmb{y}): \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$, where $\pmb{x},\pmb{y} \in \mathbb{R}^d$, that have a singularity along the diagonal $\pmb{x}=\pmb{y}$. Such kernel functions are frequently encountered in a wide range of applications such as $N$-body problems, Green's functions, integral equations, geostatistics, kriging, and Gaussian processes. One of the challenges in dealing with these kernel functions is that the associated matrices are large and dense, so the computational cost of matrix operations is high. In this article, we prove new theorems bounding the numerical rank of sub-matrices arising out of these kernel functions. Under reasonably mild assumptions, we prove that certain sub-matrices are rank-deficient in finite precision. This rank depends on the dimension of the ambient space and on the type of interaction between the hyper-cubes containing the corresponding sets of particles. This rank structure can be leveraged to reduce the computational cost of matrix operations such as matrix-vector products and solving linear systems. We also present numerical results on the growth of rank of certain sub-matrices in $1$D, $2$D, $3$D and $4$D, which, not surprisingly, agree with the theoretical results.
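A simple $1$D illustration of the phenomenon the theorems quantify: the numerical rank (singular values above a relative tolerance) of the sub-matrix of the kernel $1/|x-y|$ between two intervals is small when the intervals are well separated and larger when they touch. The kernel and tolerance are chosen for demonstration.

```python
import numpy as np

def numerical_rank(M, eps=1e-12):
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s > eps * s[0]))

n = 500
x = np.linspace(0.0, 1.0, n, endpoint=False)        # source interval [0, 1)
for a, b, label in [(1.0, 2.0, "touching"), (3.0, 4.0, "well-separated")]:
    y = np.linspace(a, b, n)
    K = 1.0 / np.abs(x[:, None] - y[None, :])        # kernel sub-matrix
    print(label, numerical_rank(K))
```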
Solving Partially Observable Markov Decision Processes (POMDPs) with continuous actions is challenging, particularly for high-dimensional action spaces. To alleviate this difficulty, we propose a new sampling-based online POMDP solver, called Adaptive Discretization using Voronoi Trees (ADVT). It uses Monte Carlo Tree Search in combination with an adaptive discretization of the action space as well as optimistic optimization to efficiently sample high-dimensional continuous action spaces and compute the best action to perform. Specifically, we adaptively discretize the action space for each sampled belief using a hierarchical partition which we call a Voronoi tree. A Voronoi tree is a Binary Space Partitioning (BSP) tree that implicitly maintains the partition of a cell as the Voronoi diagram of two points sampled from the cell. This partitioning strategy keeps the cost of partitioning and of estimating the size of each cell low, even in high-dimensional spaces where many sampled points are required to cover the space well. ADVT uses the estimated sizes of the cells to form an upper-confidence bound on the action values of each cell, and in turn uses this upper-confidence bound to guide the Monte Carlo Tree Search expansion and further discretization of the action space. This strategy enables ADVT to better exploit local information in the action space, leading to an action space discretization that is more adaptive, and hence more efficient in computing good POMDP solutions, than those of existing solvers. Experiments on simulations of four types of benchmark problems indicate that ADVT outperforms state-of-the-art continuous-action POMDP solvers and scales substantially better to high-dimensional continuous action spaces.
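The following toy sketch shows the core Voronoi-tree mechanics described above: implicit cells, nearest-representative descent, and splitting a leaf by the Voronoi diagram of two freshly sampled actions. A box-shaped action space and uniform rejection sampling are simplifying assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

class Node:
    def __init__(self, rep):
        self.rep = rep                    # representative action of the cell
        self.children = None              # (left, right) once split

def leaf_for(root, a):
    """Descend by nearest representative; cells are never stored explicitly."""
    node = root
    while node.children:
        node = min(node.children, key=lambda c: np.linalg.norm(a - c.rep))
    return node

def sample_in(root, leaf, lo, hi):
    """Rejection-sample an action from an implicit leaf cell."""
    while True:
        a = rng.uniform(lo, hi)
        if leaf_for(root, a) is leaf:
            return a

def split(root, leaf, lo, hi):
    """Split a leaf by the Voronoi diagram of two points sampled from it."""
    leaf.children = tuple(Node(sample_in(root, leaf, lo, hi)) for _ in range(2))

def leaves(node):
    if node.children is None:
        return [node]
    return [l for c in node.children for l in leaves(c)]

lo, hi = np.zeros(2), np.ones(2)          # 2-D action box [0, 1]^2
root = Node(rng.uniform(lo, hi))
split(root, root, lo, hi)
split(root, leaf_for(root, rng.uniform(lo, hi)), lo, hi)
print(len(leaves(root)), "leaf cells")    # 3 after two splits
```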
In the quantum computation verification problem, a quantum server wants to convince a client that the output of evaluating a quantum circuit $C$ is some result that it claims. This problem is considered very important both theoretically and practically in quantum computation [arXiv:1709.06984], [arXiv:1704.04487], [arXiv:1209.0449]. The client is considered to be limited in computational power, and one desirable property is that the client can be completely classical, which leads to the classical verification of quantum computation (CVQC) problem. In terms of total time complexity, the fastest single-server CVQC protocol so far has complexity $O(\mathrm{poly}(\kappa)|C|^3)$, where $|C|$ is the size of the circuit to be verified and $\kappa$ is the security parameter, given by Mahadev [arXiv:1804.01082]. In this work, by developing new techniques, we give a new CVQC protocol with complexity $O(\mathrm{poly}(\kappa)|C|)$, which is significantly faster than existing protocols. Our protocol is secure in the quantum random oracle model [arXiv:1008.0931] assuming the existence of noisy trapdoor claw-free functions [arXiv:1804.00640], both of which are widely used assumptions in quantum cryptography. Along the way, we also give a new classical channel remote state preparation protocol for states in $\{|+_\theta\rangle=\frac{1}{\sqrt{2}}(|0\rangle+e^{i\theta\pi/4}|1\rangle):\theta\in \{0,1,\dots,7\}\}$, another basic primitive in quantum cryptography. Our protocol allows for parallel verifiable preparation of $L$ independently random states in this form (up to a constant overall error and a possibly unbounded server-side simulator), and runs in only $O(\mathrm{poly}(\kappa)L)$ time and constant rounds; for comparison, existing works (even for possibly simpler state families) all require very large or unestimated time and round complexities [arXiv:1904.06320][arXiv:1904.06303][arXiv:2201.13445][arXiv:2201.13430].
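For concreteness, the target state family of the remote state preparation protocol can be written down directly; the sketch below (illustration only) constructs the eight $|+_\theta\rangle$ states and checks one of their overlaps.

```python
import numpy as np

def plus_theta(theta):
    """|+_theta> = (|0> + e^{i*theta*pi/4} |1>) / sqrt(2), theta in {0,...,7}."""
    return np.array([1.0, np.exp(1j * theta * np.pi / 4)]) / np.sqrt(2)

states = [plus_theta(t) for t in range(8)]
# Overlaps |<+_s|+_t>| = |cos((t - s) * pi / 8)| depend only on t - s mod 8.
print(round(abs(np.vdot(states[0], states[2])), 3))   # cos(pi/4) ~ 0.707
```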
The comparison of local characteristics of two random processes can shed light on periods of time or space at which the processes differ the most. This paper proposes a method that identifies regions of a given volume where the marginal attributes of two processes are least similar. The proposed method is devised in full generality for the setting where the data of interest are themselves stochastic processes, and it can thus be used to point out regions of maximum dissimilarity of a given volume in the contexts of functional data, time series, and point processes. The parameter functions underlying both stochastic processes of interest are modeled via a basis representation, and Bayesian inference is conducted via an integrated nested Laplace approximation. Numerical studies validate the proposed method, and we showcase its application with case studies in criminology, finance, and medicine.
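As a simplified stand-in for the Bayesian machinery (which uses basis representations and INLA), the sketch below scans for the fixed-volume window maximizing the integrated absolute difference between two estimated parameter functions on a grid; the functions and the bump location are made up for illustration.

```python
import numpy as np

def max_dissimilarity_region(f1, f2, grid, volume):
    """Window of the given length where |f1 - f2| has largest integral."""
    d = np.abs(f1 - f2)
    dt = grid[1] - grid[0]
    w = int(round(volume / dt))              # window size in grid points
    scores = np.convolve(d, np.ones(w), mode="valid") * dt
    i = int(np.argmax(scores))
    return grid[i], grid[i + w - 1], scores[i]

grid = np.linspace(0, 10, 1001)
f1 = np.sin(grid)
f2 = np.sin(grid) + np.exp(-((grid - 7) ** 2))    # dissimilarity near t = 7
print(max_dissimilarity_region(f1, f2, grid, volume=1.0))
```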
We present efficient computational solutions to the problems of checking equality, performing multiplication, and computing minimal representatives of elements of free bands. A band is any semigroup satisfying the identity $x ^ 2 \approx x$, and the free band $\operatorname{FB}(k)$ is the free object in the variety of $k$-generated bands. Radoszewski and Rytter developed a linear-time algorithm for checking whether two words represent the same element of a free band. In this paper we describe an alternative linear-time algorithm for the same problem. The algorithm we present utilises a representation of words as synchronous deterministic transducers that lend themselves to efficient multiplication in the free band (quadratic in the size of the alphabet). This representation also provides a means of finding the short-lex least word representing a given free band element with quadratic complexity.
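Not the linear-time transducer algorithm of the paper, but the classical recursive equality check based on the Green-Rees characterization can be sketched directly: two words are equal in the free band iff their contents agree and their maximal prefixes/suffixes missing one letter, together with the adjacent letters, agree recursively.

```python
def pref(w):
    """Longest prefix of w whose content misses exactly one letter of w,
    together with the letter whose first occurrence completes the content."""
    seen, k = set(), len(set(w))
    for i, ch in enumerate(w):
        if ch not in seen:
            seen.add(ch)
            if len(seen) == k:
                return w[:i], ch

def suff(w):
    p, ch = pref(w[::-1])
    return p[::-1], ch

def equal_fb(u, v):
    """Green-Rees: do u and v represent the same element of the free band?"""
    if set(u) != set(v):
        return False
    if len(set(u)) <= 1:
        return True                    # both empty or powers of one letter
    (pu, au), (pv, av) = pref(u), pref(v)
    (su, bu), (sv, bv) = suff(u), suff(v)
    return au == av and bu == bv and equal_fb(pu, pv) and equal_fb(su, sv)

print(equal_fb("abab", "ab"), equal_fb("aabba", "aba"), equal_fb("ab", "ba"))
```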
In this work, we study the convergence and performance of nonlinear solvers for the Bidomain equations after decoupling the ordinary and partial differential equations of the cardiac system. We first rigorously prove that Quasi-Newton methods such as BFGS and nonlinear conjugate gradient methods such as Fletcher-Reeves are globally convergent, by studying an auxiliary variational problem under physically reasonable hypotheses. Then, we compare several nonlinear solvers in terms of execution time, robustness with respect to the data, and parallel scalability. Our results suggest that Quasi-Newton methods are the best choice for this type of problem, being faster than standard Newton-Krylov methods without hindering their robustness or scalability. In addition, first-order methods are also competitive, and represent a better alternative for matrix-free implementations, which are suitable for GPU computing.
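As a minimal illustration of one of the compared methods, here is a bare-bones Fletcher-Reeves nonlinear CG with Armijo backtracking, run on a small SPD model problem standing in for the decoupled Bidomain variational problem (production solvers operate matrix-free on the PDE residuals).

```python
import numpy as np

def fletcher_reeves(f, grad, x, iters=200, tol=1e-8):
    g = grad(x)
    d = -g
    for _ in range(iters):
        if g @ d >= 0:
            d = -g                        # restart on non-descent direction
        t, fx, slope = 1.0, f(x), g @ d
        while f(x + t * d) > fx + 1e-4 * t * slope:
            t *= 0.5                      # Armijo backtracking line search
        x = x + t * d
        g_new = grad(x)
        if np.linalg.norm(g_new) < tol:
            break
        beta = (g_new @ g_new) / (g @ g)  # Fletcher-Reeves coefficient
        d = -g_new + beta * d
        g = g_new
    return x

A = np.diag(np.arange(1.0, 11.0))         # SPD model problem
b = np.ones(10)
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b
print(np.round(fletcher_reeves(f, grad, np.zeros(10)), 3))
```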
Confidence intervals based on the central limit theorem (CLT) are a cornerstone of classical statistics. Despite being only asymptotically valid, they are ubiquitous because they permit statistical inference under very weak assumptions, and can often be applied to problems even when nonasymptotic inference is impossible. This paper introduces time-uniform analogues of such asymptotic confidence intervals. To elaborate, our methods take the form of confidence sequences (CSs) -- sequences of confidence intervals that are uniformly valid over time. CSs provide valid inference at arbitrary stopping times, incurring no penalties for "peeking" at the data, unlike classical confidence intervals, which require the sample size to be fixed in advance. Existing CSs in the literature are nonasymptotic, and hence do not enjoy the aforementioned broad applicability of asymptotic confidence intervals. Our work bridges the gap by giving a definition of "asymptotic CSs" and deriving a universal asymptotic CS that requires only weak CLT-like assumptions. While the CLT approximates the distribution of a sample average by that of a Gaussian at a fixed sample size, we use strong invariance principles (stemming from the seminal 1970s work of Komlós, Major, and Tusnády) to uniformly approximate the entire sample average process by an implicit Gaussian process. We demonstrate their utility by deriving nonparametric asymptotic CSs for the average treatment effect based on doubly robust estimators in observational studies, for which no nonasymptotic methods can exist even in the fixed-time regime (due to confounding bias). These enable doubly robust causal inference that can be continuously monitored and adaptively stopped.
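A sketch of what such an asymptotic CS looks like in practice, using one published form of the Gaussian-mixture boundary applied to the studentized running mean (the tuning parameter rho and the plug-in variance below are illustrative choices):

```python
import numpy as np

def asymp_cs(x, alpha=0.05, rho=1.0):
    """Time-uniform (asymptotic) confidence sequence for the running mean."""
    t = np.arange(1, len(x) + 1)
    mean = np.cumsum(x) / t
    var = np.cumsum(x ** 2) / t - mean ** 2     # plug-in variance estimate
    sd = np.sqrt(np.maximum(var, 1e-12))
    radius = sd * np.sqrt(
        2 * (t * rho**2 + 1) / (t**2 * rho**2)
        * np.log(np.sqrt(t * rho**2 + 1) / alpha)
    )
    return mean - radius, mean + radius

rng = np.random.default_rng(0)
lo, hi = asymp_cs(rng.exponential(size=10_000))   # true mean = 1
print(lo[-1], hi[-1])         # the band is valid at any stopping time
```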
We consider high-dimensional measurement errors with high-frequency data. Our objective is to optimally recover the high-dimensional cross-sectional covariance matrix of the random errors. In this problem, not all components of the random vector are observed at the same time and the measurement errors are latent variables, leading to major challenges besides the high data dimensionality. We propose a new covariance matrix estimator in this context with appropriate localization and thresholding, and then conduct a series of comprehensive theoretical investigations of the proposed estimator. By developing a new technical device integrating the high-frequency data feature with the conventional notion of $\alpha$-mixing, our analysis successfully accommodates the challenging serial dependence in the measurement errors. Our theoretical analysis establishes the minimax optimal convergence rates associated with two commonly used loss functions, and we demonstrate concrete cases in which the proposed localized estimator with thresholding achieves these minimax optimal rates. Considering that the variances and covariances can be small in reality, we conduct a second-order theoretical analysis that further disentangles the dominating bias in the estimator. A bias-corrected estimator is then proposed to ensure its practical finite sample performance. We also extensively analyze our estimator in the setting with jumps, and show that its performance is reasonably robust. We illustrate the promising empirical performance of the proposed estimator with extensive simulation studies and a real data analysis.
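The thresholding ingredient can be illustrated in isolation (the paper's estimator additionally localizes across asynchronous high-frequency observations; the constant c below is a hypothetical tuning choice):

```python
import numpy as np

def threshold_cov(S, n, c=2.0):
    """Hard-threshold the off-diagonal entries of a sample covariance at the
    usual sparse-covariance rate sqrt(log p / n)."""
    tau = c * np.sqrt(np.log(S.shape[0]) / n)
    T = np.where(np.abs(S) >= tau, S, 0.0)
    np.fill_diagonal(T, np.diag(S))             # never threshold variances
    return T

rng = np.random.default_rng(0)
p, n = 50, 200
Sigma = np.eye(p); Sigma[0, 1] = Sigma[1, 0] = 0.4   # sparse true covariance
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
S = np.cov(X, rowvar=False)
print(np.count_nonzero(threshold_cov(S, n)))   # ~ diagonal plus the true pair
```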