Necessary and sufficient conditions of uniform consistency are explored. Nonparametric sets of alternatives are bounded convex sets in $\mathbb{L}_p$ with "small" balls deleted. The "small" balls have the center at the point of hypothesis and radii of balls tend to zero as sample size increases. For problem of hypothesis testing on a density, we show that, for the sets of alternatives, there are uniformly consistent tests for some sequence of radii of the balls, if and only if, convex set is compact. The results are established for problem of hypothesis testing on a density, for signal detection in Gaussian white noise, for linear ill-posed problems with random Gaussian noise and so on.
This paper presents a robust version of the stratified sampling method when multiple uncertain input models are considered for stochastic simulation. Various variance reduction techniques have demonstrated their superior performance in accelerating simulation processes. Nevertheless, they often use a single input model and further assume that the input model is exactly known and fixed. We consider more general cases in which it is necessary to assess a simulation's response to a variety of input models, such as when evaluating the reliability of wind turbines under nonstationary wind conditions or the operation of a service system when the distribution of customer inter-arrival time is heterogeneous at different times. Moreover, the estimation variance may be considerably impacted by uncertainty in input models. To address such nonstationary and uncertain input models, we offer a distributionally robust (DR) stratified sampling approach with the goal of minimizing the maximum of worst-case estimator variances among plausible but uncertain input models. Specifically, we devise a bi-level optimization framework for formulating DR stochastic problems with different ambiguity set designs, based on the $L_2$-norm, 1-Wasserstein distance, parametric family of distributions, and distribution moments. In order to cope with the non-convexity of objective function, we present a solution approach that uses Bayesian optimization. Numerical experiments and the wind turbine case study demonstrate the robustness of the proposed approach.
I propose Ziv-Zakai-type lower bounds on the Bayesian error for estimating a parameter $\beta:\Theta \to \mathbb R$ when the parameter space $\Theta$ is general and $\beta(\theta)$ need not be a linear function of $\theta$.
In the rank-constrained optimization problem (RCOP), it minimizes a linear objective function over a prespecified closed rank-constrained domain set and $m$ generic two-sided linear matrix inequalities. Motivated by the Dantzig-Wolfe (DW) decomposition, a popular approach of solving many nonconvex optimization problems, we investigate the strength of DW relaxation (DWR) of the RCOP, which admits the same formulation as RCOP except replacing the domain set by its closed convex hull. Notably, our goal is to characterize conditions under which the DWR matches RCOP for any m two-sided linear matrix inequalities. From the primal perspective, we develop the first-known simultaneously necessary and sufficient conditions that achieve: (i) extreme point exactness -- all the extreme points of the DWR feasible set belong to that of the RCOP; (ii) convex hull exactness -- the DWR feasible set is identical to the closed convex hull of RCOP feasible set; and (iii) objective exactness -- the optimal values of the DWR and RCOP coincide. The proposed conditions unify, refine, and extend the existing exactness results in the quadratically constrained quadratic program (QCQP) and fair unsupervised learning. These conditions can be very useful to identify new results, including the extreme point exactness for a QCQP problem that admits an inhomogeneous objective function with two homogeneous two-sided quadratic constraints and the convex hull exactness for fair SVD.
There is an emerging need for comparable data for multi-microphone processing, particularly in acoustic sensor networks. However, commonly available databases are often limited in the spatial diversity of the microphones or only allow for particular signal processing tasks. In this paper, we present a database of acoustic impulse responses and recordings for a binaural hearing aid setup, 36 spatially distributed microphones spanning a uniform grid of (5x5) m^2 and 12 source positions. This database can be used for a variety of signal processing tasks, such as (multi-microphone) noise reduction, source localization, and dereverberation, as the measurements were performed using the same setup for three different reverberation conditions (T_60\approx{310, 510, 1300} ms). The usability of the database is demonstrated for a noise reduction task using a minimum variance distortionless response beamformer based on relative transfer functions, exploiting the availability of spatially distributed microphones.
In this work we connect two notions: That of the nonparametric mode of a probability measure, defined by asymptotic small ball probabilities, and that of the Onsager-Machlup functional, a generalized density also defined via asymptotic small ball probabilities. We show that in a separable Hilbert space setting and under mild conditions on the likelihood, modes of a Bayesian posterior distribution based upon a Gaussian prior exist and agree with the minimizers of its Onsager-Machlup functional and thus also with weak posterior modes. We apply this result to inverse problems and derive conditions on the forward mapping under which this variational characterization of posterior modes holds. Our results show rigorously that in the limit case of infinite-dimensional data corrupted by additive Gaussian or Laplacian noise, nonparametric maximum a posteriori estimation is equivalent to Tikhonov-Phillips regularization. In comparison with the work of Dashti, Law, Stuart, and Voss (2013), the assumptions on the likelihood are relaxed so that they cover in particular the important case of white Gaussian process noise. We illustrate our results by applying them to a severely ill-posed linear problem with Laplacian noise, where we express the maximum a posteriori estimator analytically and study its rate of convergence in the small noise limit.
We provide new false discovery proportion (FDP) confidence envelopes in several multiple testing settings relevant to modern high dimensional-data methods. We revisit the scenarios considered in the recent work of \cite{katsevich2020simultaneous}(top-$k$, preordered -- including knockoffs -- , online) with a particular emphasis on obtaining FDP bounds that have both non-asymptotical coverage and asymptotical consistency, i.e. converge below the desired level $\alpha$ when applied to a classical $\alpha$-level false discovery rate (FDR) controlling procedure. This way, we derive new bounds that provide improvements over existing ones, both theoretically and practically, and are suitable for situations where at least a moderate number of rejections is expected. These improvements are illustrated with numerical experiments and real data examples. In particular, the improvement is significant in the knockoff setting, which shows the impact of the method for practical use. As side results, we introduce a new confidence envelope for the empirical cumulative distribution function of i.i.d. uniform variables and we provide new power results in sparse cases, both being of independent interest.
Pini and Vantini (2017) introduced the interval-wise testing procedure which performs local inference for functional data defined on an interval domain, where the output is an adjusted p-value function that controls for type I errors. We extend this idea to a general setting where domain is a Riemannian manifolds. This requires new methodology such as how to define adjustment sets on product manifolds and how to approximate the test statistic when the domain has non-zero curvature. We propose to use permutation tests for inference and apply the procedure in three settings: a simulation on a "chameleon-shaped" manifold and two applications related to climate change where the manifolds are a complex subset of $S^2$ and $S^2 \times S^1$, respectively. We note the tradeoff between type I and type II errors: increasing the adjustment set reduces the type I error but also results in smaller areas of significance. However, some areas still remain significant even at maximal adjustment.
The causal dose response curve is commonly selected as the statistical parameter of interest in studies where the goal is to understand the effect of a continuous exposure on an outcome.Most of the available methodology for statistical inference on the dose-response function in the continuous exposure setting requires strong parametric assumptions on the probability distribution. Such parametric assumptions are typically untenable in practice and lead to invalid inference. It is often preferable to instead use nonparametric methods for inference, which only make mild assumptions about the data-generating mechanism. We propose a nonparametric test of the null hypothesis that the dose-response function is equal to a constant function. We argue that when the null hypothesis holds, the dose-response function has zero variance. Thus, one can test the null hypothesis by assessing whether there is sufficient evidence to claim that the variance is positive. We construct a novel estimator for the variance of the dose-response function, for which we can fully characterize the null limiting distribution and thus perform well-calibrated tests of the null hypothesis. We also present an approach for constructing simultaneous confidence bands for the dose-response function by inverting our proposed hypothesis test. We assess the validity of our proposal in a simulation study. In a data example, we study, in a population of patients who have initiated treatment for HIV, how the distance required to travel to an HIV clinic affects retention in care.
Asmussen and Lehtomaa [Distinguishing log-concavity from heavy tails. Risks 5(10), 2017] introduced an interesting function $g$ which is able to distinguish between log-convex and log-concave tail behaviour of distributions, and proposed a randomized estimator for $g$. In this paper, we show that $g$ can also be seen as a tool to detect gamma distributions or distributions with gamma tail. We construct a more efficient estimator $\hat{g}_n$ based on $U$-statistics, propose several estimators of the (asymptotic) variance of $\hat{g}_n$, and study their performance by simulations. Finally, the methods are applied to several real data sets.
Modern reinforcement learning systems produce many high-quality policies throughout the learning process. However, to choose which policy to actually deploy in the real world, they must be tested under an intractable number of environmental conditions. We introduce RPOSST, an algorithm to select a small set of test cases from a larger pool based on a relatively small number of sample evaluations. RPOSST treats the test case selection problem as a two-player game and optimizes a solution with provable $k$-of-$N$ robustness, bounding the error relative to a test that used all the test cases in the pool. Empirical results demonstrate that RPOSST finds a small set of test cases that identify high quality policies in a toy one-shot game, poker datasets, and a high-fidelity racing simulator.