Estimating a covariance matrix is central to high-dimensional data analysis. The proposed method is motivated by the dependence pattern analyses of multiple types of high-dimensional biomedical data including but not limited to genomics, proteomics, microbiome, and neuroimaging data. The correlation matrices of these biomedical data all demonstrate a well-organized block pattern. In this pattern, the positive and negative pair-wise correlations with large absolute values, are mainly concentrated within diagonal and off-diagonal blocks. We develop a covariance- and precision-matrix estimation framework to fully leverage the organized block pattern. We propose new best unbiased covariance- and precision-matrix estimators in closed forms, and develop theories for the asymptotic proprieties of estimators in both scenarios where the number of blocks is less or greater than the sample size. The simulation and data example analyses show that our method is robust and improves the accuracy of covariance- and precision-matrix estimation.
We prove new lower bounds for statistical estimation tasks under the constraint of $(\varepsilon, \delta)$-differential privacy. First, we provide tight lower bounds for private covariance estimation of Gaussian distributions. We show that estimating the covariance matrix in Frobenius norm requires $\Omega(d^2)$ samples, and in spectral norm requires $\Omega(d^{3/2})$ samples, both matching upper bounds up to logarithmic factors. The latter bound verifies the existence of a conjectured statistical gap between the private and the non-private sample complexities for spectral estimation of Gaussian covariances. We prove these bounds via our main technical contribution, a broad generalization of the fingerprinting method to exponential families. Additionally, using the private Assouad method of Acharya, Sun, and Zhang, we show a tight $\Omega(d/(\alpha^2 \varepsilon))$ lower bound for estimating the mean of a distribution with bounded covariance to $\alpha$-error in $\ell_2$-distance. Prior known lower bounds for all these problems were either polynomially weaker or held under the stricter condition of $(\varepsilon, 0)$-differential privacy.
Gaussian processes are widely employed as versatile modelling and predictive tools in spatial statistics, functional data analysis, computer modelling and diverse applications of machine learning. They have been widely studied over Euclidean spaces, where they are specified using covariance functions or covariograms for modelling complex dependencies. There is a growing literature on Gaussian processes over Riemannian manifolds in order to develop richer and more flexible inferential frameworks for non-Euclidean data. While numerical approximations through graph representations have been well studied for the Mat\'ern covariogram and heat kernel, the behaviour of asymptotic inference on the parameters of the covariogram has received relatively scant attention. We focus on the asymptotic inference for Gaussian processes constructed over compact Riemannian manifolds. Building upon the recently introduced Mat\'ern covariogram on a compact Riemannian manifold, we employ formal notions and conditions for the equivalence of two Mat\'ern Gaussian random measures on compact manifolds to derive the parameter that is identifiable, also known as the microergodic parameter, and formally establish the consistency of the maximum likelihood estimate and the asymptotic optimality of the best linear unbiased predictor. The circle is studied as a specific example of compact Riemannian manifolds with numerical experiments to illustrate and corroborate the theory.
Dynamic subspace estimation, or subspace tracking, is a fundamental problem in statistical signal processing and machine learning. This paper considers a geodesic model for time-varying subspaces. The natural objective function for this model is non-convex. We propose a novel algorithm for minimizing this objective and estimating the parameters of the model from data with Grassmannian-constrained optimization. We show that with this algorithm, the objective is monotonically non-increasing. We demonstrate the performance of this model and our algorithm on synthetic data, video data, and dynamic fMRI data.
We derive a formula for optimal hard thresholding of the singular value decomposition in the presence of correlated additive noise; although it nominally involves unobservables, we show how to apply it even where the noise covariance structure is not a-priori known or is not independently estimable. The proposed method, which we call ScreeNOT, is a mathematically solid alternative to Cattell's ever-popular but vague Scree Plot heuristic from 1966. ScreeNOT has a surprising oracle property: it typically achieves exactly, in large finite samples, the lowest possible MSE for matrix recovery, on each given problem instance - i.e. the specific threshold it selects gives exactly the smallest achievable MSE loss among all possible threshold choices for that noisy dataset and that unknown underlying true low rank model. The method is computationally efficient and robust against perturbations of the underlying covariance structure. Our results depend on the assumption that the singular values of the noise have a limiting empirical distribution of compact support; this model, which is standard in random matrix theory, is satisfied by many models exhibiting either cross-row correlation structure or cross-column correlation structure, and also by many situations where there is inter-element correlation structure. Simulations demonstrate the effectiveness of the method even at moderate matrix sizes. The paper is supplemented by ready-to-use software packages implementing the proposed algorithm: package ScreeNOT in Python (via PyPI) and R (via CRAN).
Modelling the extremal dependence of bivariate variables is important in a wide variety of practical applications, including environmental planning, catastrophe modelling and hydrology. The majority of these approaches are based on the framework of bivariate regular variation, and a wide range of literature is available for estimating the dependence structure in this setting. However, this framework is only applicable to variables exhibiting asymptotic dependence, even though asymptotic independence is often observed in practice. In this paper, we consider the so-called `angular dependence function'; this quantity summarises the extremal dependence structure for asymptotically independent variables. Until recently, only pointwise estimators of the angular dependence function have been available. We introduce a range of global estimators and compare them to another recently introduced technique for global estimation through a systematic simulation study, and a case study on river flow data from the north of England, UK.
Motivated by crowd-sourcing applications, we consider a model where we have partial observations from a bivariate isotonic n x d matrix with an unknown permutation $\pi$ * acting on its rows. Focusing on the twin problems of recovering the permutation $\pi$ * and estimating the unknown matrix, we introduce a polynomial-time procedure achieving the minimax risk for these two problems, this for all possible values of n, d, and all possible sampling efforts. Along the way, we establish that, in some regimes, recovering the unknown permutation $\pi$ * is considerably simpler than estimating the matrix.
Sparse regression codes (SPARC) connect the sparse signal recovery framework of compressive sensing with error control coding techniques. SPARC encoding produces codewords which are \emph{sparse} linear combinations of columns of a dictionary matrix. SPARC decoding is accomplished using sparse signal recovery algorithms. We construct dictionary matrices using Gold codes and mutually unbiased bases and develop suitable generalizations of SPARC (GSPARC). We develop a greedy decoder, referred as match and decode (MAD) algorithm and provide its analytical noiseless recovery guarantees. We propose a parallel greedy search technique, referred as parallel MAD (PMAD), to improve the performance. We describe the applicability of GSPARC with PMAD decoder for multi-user channels, providing a non-orthogonal multiple access scheme. We present numerical results comparing the block error rate (BLER) performance of the proposed algorithms for GSPARC in AWGN channels, in the short block length regime. The PMAD decoder gives better BLER than the approximate message passing decoder for SPARC. GSPARC with PMAD gives comparable and competitive BLER performance, when compared to other existing codes. In multi-user channels, GSPARC with PMAD decoder outperforms the sphere packing lower bounds of an orthogonal multiple access scheme, which has the same spectral efficiency.
During an infectious disease outbreak, public health decision-makers require real-time monitoring of disease transmission to respond quickly and intelligently. In these settings, a key measure of transmission is the instantaneous time-varying reproduction number, $R_t$. Estimation of this number using a Time-Since-Infection model relies on case-notification data and the distribution of the serial interval on the target population. However, in practice, case-notification data may contain measurement error due to variation in case reporting while available serial interval estimates may come from studies on non-representative populations. We propose a new data-driven method that accounts for particular forms of case-reporting measurement error and can incorporate multiple partially representative serial interval estimates into the transmission estimation process. In addition, we provide practical tools for automatically identifying measurement error patterns and determining when measurement error may not be adequately accounted for. We illustrate the potential bias undertaken by methods that ignore these practical concerns through a variety of simulated outbreaks. We then demonstrate the use of our method on data from the COVID-19 pandemic to estimate transmission and explore the relationships between social distancing, temperature, and transmission.
We propose a novel method for joint estimation of shape and pose of rigid objects from their sequentially observed RGB-D images. In sharp contrast to past approaches that rely on complex non-linear optimization, we propose to formulate it as a neural optimization that learns to efficiently estimate the shape and pose. We introduce Deep Directional Distance Function (DeepDDF), a neural network that directly outputs the depth image of an object given the camera viewpoint and viewing direction, for efficient error computation in 2D image space. We formulate the joint estimation itself as a Transformer which we refer to as TransPoser. We fully leverage the tokenization and multi-head attention to sequentially process the growing set of observations and to efficiently update the shape and pose with a learned momentum, respectively. Experimental results on synthetic and real data show that DeepDDF achieves high accuracy as a category-level object shape representation and TransPoser achieves state-of-the-art accuracy efficiently for joint shape and pose estimation.
Existing machine learning methods for causal inference usually estimate quantities expressed via the mean of potential outcomes (e.g., average treatment effect). However, such quantities do not capture the full information about the distribution of potential outcomes. In this work, we estimate the density of potential outcomes after interventions from observational data. For this, we propose a novel, fully-parametric deep learning method called Interventional Normalizing Flows. Specifically, we combine two normalizing flows, namely (i) a nuisance flow for estimating nuisance parameters and (ii) a target flow for a parametric estimation of the density of potential outcomes. We further develop a tractable optimization objective based on a one-step bias correction for an efficient and doubly robust estimation of the target flow parameters. As a result our Interventional Normalizing Flows offer a properly normalized density estimator. Across various experiments, we demonstrate that our Interventional Normalizing Flows are expressive and highly effective, and scale well with both sample size and high-dimensional confounding. To the best of our knowledge, our Interventional Normalizing Flows are the first proper fully-parametric, deep learning method for density estimation of potential outcomes.