Fr\'echet global regression is extended to the context of bivariate curve stochastic processes with values in a Riemannian manifold. The proposed regression predictor arises as a reformulation of the standard least-squares parametric linear predictor in terms of a weighted Fr\'echet functional mean. Specifically, in this reformulation the Euclidean distance is replaced by the integrated quadratic geodesic distance. The regression predictor is then obtained from the weighted Fr\'echet curve mean lying in the time-varying geodesic submanifold generated by the regressor process components involved in the time-correlation range. The regularized Fr\'echet weights are computed in the time-varying tangent spaces. The uniform weak consistency of the regression predictor is proved. Model selection is also addressed. A simulation study is undertaken to illustrate the performance of the proposed spherical-curve variable selection algorithm in a multivariate framework.
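As a concrete illustration of the weighted Fr\'echet mean that underlies the predictor, the following sketch computes a weighted Fr\'echet mean of points on the unit sphere $S^2$ by gradient descent on the squared geodesic distance; the weights `w`, the function names, and the step-size choice are illustrative stand-ins for the regularized Fr\'echet weights and are not taken from the paper.

```python
# A minimal sketch, assuming points on S^2 and generic nonnegative weights.
import numpy as np

def log_map(p, q):
    """Riemannian log map on S^2: tangent vector at p pointing toward q."""
    cos_t = np.clip(np.dot(p, q), -1.0, 1.0)
    theta = np.arccos(cos_t)
    if theta < 1e-12:
        return np.zeros_like(p)
    v = q - cos_t * p
    return theta * v / np.linalg.norm(v)

def exp_map(p, v):
    """Riemannian exponential map on S^2."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return p
    return np.cos(nv) * p + np.sin(nv) * v / nv

def weighted_frechet_mean(points, w, n_iter=100, step=1.0):
    """Gradient descent for argmin_m sum_i w_i * d_geo(m, x_i)^2."""
    m = points[0] / np.linalg.norm(points[0])
    for _ in range(n_iter):
        grad = sum(wi * log_map(m, x) for wi, x in zip(w, points))
        m = exp_map(m, step * grad / np.sum(np.abs(w)))
    return m

rng = np.random.default_rng(0)
pts = rng.normal(size=(5, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
w = np.array([0.5, 0.2, 0.1, 0.1, 0.1])
print(weighted_frechet_mean(pts, w))
```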
Functional quadratic regression models postulate a polynomial rather than a linear relationship between a scalar response and a functional covariate. As in functional linear regression, vertical and especially high-leverage outliers may affect the classical estimators. For that reason, robust procedures providing reliable estimators in such situations are an important need. Taking into account that the functional polynomial model is equivalent to a regression model that is a polynomial of the same order in the functional principal component scores of the predictor processes, our proposal combines robust estimators of the principal directions with robust regression estimators based on a bounded loss function and a preliminary residual scale estimator. Fisher consistency of the proposed method is derived under mild assumptions. The results of a numerical study show, for finite samples, the benefits of the robust proposal over the one based on sample principal directions and least squares. The usefulness of the proposed approach is also illustrated through the analysis of a real data set, which reveals that when the potential outliers are removed the classical and robust methods behave very similarly.
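To make the two-step construction concrete, the following sketch reduces toy functional predictors to principal component scores and fits a quadratic regression in those scores with a bounded-loss M-estimator; a plain SVD stands in for the robust principal directions and statsmodels' Tukey biweight norm stands in for the MM-type estimator with a preliminary residual scale, so this is an illustration of the structure rather than the authors' procedure.

```python
# A minimal sketch: scores from (non-robust) SVD, quadratic design, bounded-loss fit.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n, p, k = 200, 50, 3                          # curves on a grid of p points, k components
X = rng.normal(size=(n, p)).cumsum(axis=1)    # toy functional predictors
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:k].T                        # principal component scores

# quadratic design: linear scores plus all pairwise products of scores
quad = np.column_stack([scores[:, i] * scores[:, j]
                        for i in range(k) for j in range(i, k)])
design = sm.add_constant(np.column_stack([scores, quad]))
y = 1.0 + scores @ np.array([0.5, -0.3, 0.2]) + 0.1 * scores[:, 0]**2 \
    + rng.standard_t(df=3, size=n)            # heavy-tailed errors

# Tukey biweight norm = bounded loss; stands in for the MM-type estimator
fit = sm.RLM(y, design, M=sm.robust.norms.TukeyBiweight()).fit()
print(fit.params)
```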
Using a perturbation approach, we derive a new approximate filtering and smoothing methodology for a general class of state-space models, including univariate and multivariate location, scale, and count data models. The main properties of the methodology can be summarized as follows: (i) it generalizes several existing approaches to robust filtering based on the score and the Hessian matrix of the observation density by relaxing the critical assumption of a Gaussian prior density underlying this class of methods; (ii) it has a very simple structure based on forward-backward recursions similar to the Kalman filter and smoother; (iii) it allows a straightforward computation of confidence bands around the state estimates reflecting the combination of parameter and filtering uncertainty. We show through an extensive Monte Carlo study that the mean square loss with respect to exact simulation-based methods is small in a wide range of scenarios. We finally illustrate empirically the application of the methodology to the estimation of stochastic volatility and correlations in financial time series.
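To make the forward recursion concrete, the following sketch applies a score/Hessian-based update for a Poisson count model with a random-walk log-intensity; this is the standard Gaussian-prior approximation that the methodology generalizes, shown only as a baseline and not the perturbation-based filter itself.

```python
# A minimal sketch of a score/Hessian-based forward recursion (Gaussian-prior baseline).
import numpy as np

def approx_filter_poisson(y, q=0.05, a0=0.0, p0=10.0):
    a, P = a0, p0
    means, variances = [], []
    for yt in y:
        a_pred, P_pred = a, P + q                 # prediction step (random-walk state)
        lam = np.exp(a_pred)
        score = yt - lam                          # d/da log p(y|a)
        info = lam                                # -d^2/da^2 log p(y|a)
        P = 1.0 / (1.0 / P_pred + info)           # one Newton/Laplace-type update
        a = a_pred + P * score
        means.append(a)
        variances.append(P)
    return np.array(means), np.array(variances)

rng = np.random.default_rng(2)
x = np.cumsum(rng.normal(scale=0.2, size=200))    # latent log-intensity
y = rng.poisson(np.exp(x))
m, v = approx_filter_poisson(y)
print(m[-5:])
```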
The simultaneous estimation of multiple unknown parameters lies at the heart of a broad class of important problems across science and technology. Currently, state-of-the-art performance in such problems is achieved by nonparametric empirical Bayes methods. However, these approaches still suffer from two major issues. First, they solve a frequentist problem but do so by following Bayesian reasoning, posing a philosophical dilemma that has contributed to somewhat uneasy attitudes toward empirical Bayes methodology. Second, their computation relies on certain density estimates that become extremely unreliable in some complex simultaneous estimation problems. In this paper, we study these issues in the context of the canonical Gaussian sequence problem. We propose an entirely frequentist alternative to nonparametric empirical Bayes methods by establishing a connection between simultaneous estimation and penalized nonparametric regression. We use flexible regularization strategies, such as shape constraints, to derive accurate estimators without appealing to Bayesian arguments. We prove that our estimators achieve asymptotically optimal regret and show that they are competitive with or can outperform nonparametric empirical Bayes methods in simulations and an analysis of spatially resolved gene expression data.
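For context, the following sketch sets up the canonical Gaussian sequence problem and the density-based Tweedie-formula empirical Bayes rule discussed above; the paper's penalized-regression, shape-constrained alternative is not reproduced here, and the mixture of means and the kernel density estimate are illustrative choices.

```python
# A minimal sketch of the Gaussian sequence problem and the density-based EB baseline.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)
n = 2000
theta = rng.choice([0.0, 3.0], size=n, p=[0.8, 0.2])   # unknown means
z = theta + rng.normal(size=n)                          # z_i ~ N(theta_i, 1)

# Tweedie's formula: E[theta_i | z_i] = z_i + d/dz log m(z_i), with m the marginal density
kde = gaussian_kde(z)
eps = 1e-3
log_m = lambda t: np.log(kde(t))
theta_hat = z + (log_m(z + eps) - log_m(z - eps)) / (2 * eps)

print("MLE risk     :", np.mean((z - theta) ** 2))
print("Tweedie risk :", np.mean((theta_hat - theta) ** 2))
```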
This paper investigates the asymptotic distribution of the maximum-likelihood estimate (MLE) in multinomial logistic models in the high-dimensional regime where dimension and sample size are of the same order. While classical large-sample theory provides asymptotic normality of the MLE under certain conditions, such classical results are expected to fail in high dimensions, as documented for the binary logistic case in the seminal work of Sur and Cand\`es [2019]. We address this issue in classification problems with three or more classes by developing asymptotic normality and asymptotic chi-square results for the multinomial logistic MLE (also known as the cross-entropy minimizer) on null covariates. Our theory leads to a new methodology for testing the significance of a given feature. Extensive simulation studies on synthetic data corroborate these asymptotic results and confirm the validity of the proposed p-values for testing the significance of a given feature.
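The following sketch shows the classical likelihood-ratio chi-square test for a single feature in a multinomial logistic model, i.e., the calibration that the paper shows to fail in the proportional high-dimensional regime; the corrected asymptotics are not implemented, and the simulated design is purely illustrative.

```python
# A minimal sketch of the classical chi-square test for one feature (K classes).
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

rng = np.random.default_rng(4)
n, p, K = 500, 10, 3
X = rng.normal(size=(n, p))
B = rng.normal(size=(p, K)) * 0.5
B[0, :] = 0.0                                    # feature 0 is a null covariate
logits = X @ B
prob = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
y = np.array([rng.choice(K, p=pr) for pr in prob])

full = sm.MNLogit(y, sm.add_constant(X)).fit(disp=0)
reduced = sm.MNLogit(y, sm.add_constant(X[:, 1:])).fit(disp=0)
lr = 2 * (full.llf - reduced.llf)                # likelihood-ratio statistic
pval = chi2.sf(lr, df=K - 1)                     # K-1 restricted coefficients
print(f"LR statistic = {lr:.2f}, p-value = {pval:.3f}")
```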
Studying the generalization ability of linear models on real data is a central question in statistical learning. While a limited number of important prior works (Loureiro et al. (2021A, 2021B), Wei et al. 2022) do validate theoretical results with real data, they rely on technical assumptions, such as a well-conditioned covariance matrix and independent and identically distributed data, that do not necessarily hold for real data. Additionally, prior works that do address distributional shifts usually make technical assumptions on the joint distribution of the train and test data (Tripuraneni et al. 2021, Wu and Xu 2020), and do not test on real data. In an attempt to address these issues and better model real data, we consider data that are not i.i.d. but have a low-rank structure. Further, we address distributional shift by decoupling the assumptions on the training and test distributions. We provide asymptotically exact analytical formulas for the generalization error of the denoising problem, which we then use to derive theoretical results for linear regression, data augmentation, principal component regression, and transfer learning. We validate all of our theoretical results on real data and obtain a low relative mean squared error, of around 1%, between the empirical risk and our estimated risk.
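The following sketch illustrates the empirical side of such a comparison: low-rank (non-i.i.d.) data, a linear ridge denoiser, and the empirical test risk under a scale shift between training and test distributions; the asymptotically exact theoretical risk formula is not reproduced, and all dimensions and noise levels are illustrative.

```python
# A minimal sketch of the denoising setup with low-rank data and a distribution shift.
import numpy as np

rng = np.random.default_rng(5)
n_train, n_test, d, r = 2000, 1000, 100, 5
V = rng.normal(size=(r, d)) / np.sqrt(r)           # shared low-rank directions

def low_rank_data(n, scale):
    return (scale * rng.normal(size=(n, r))) @ V    # non-i.i.d. low-rank rows

X_tr, X_te = low_rank_data(n_train, 1.0), low_rank_data(n_test, 1.5)  # shifted test scale
noise = 0.5
Y_tr = X_tr + noise * rng.normal(size=X_tr.shape)   # noisy observations
Y_te = X_te + noise * rng.normal(size=X_te.shape)

lam = 1.0
W = np.linalg.solve(Y_tr.T @ Y_tr + lam * np.eye(d), Y_tr.T @ X_tr)   # ridge denoiser
print("empirical test risk:", np.mean((Y_te @ W - X_te) ** 2))
```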
Minimum variance controllers have been employed in a wide range of industrial applications. A key challenge experienced by many adaptive controllers is their poor empirical performance in the initial stages of learning. In this paper, we address the problem of initializing them so that they provide acceptable transients, and we also provide an accompanying finite-time regret analysis, for adaptive minimum variance control of an auto-regressive system with exogenous inputs (ARX). Following [3], we consider a modified version of the Certainty Equivalence (CE) adaptive controller, which we call PIECE, that utilizes probing inputs for exploration. We show that it has a $C \log T$ bound on the regret after $T$ time steps for bounded noise, and a $C\log^2 T$ bound in the case of sub-Gaussian noise. The simulation results demonstrate the advantage of PIECE over the algorithm proposed in [3] as well as over the standard Certainty Equivalence controller, especially in the initial learning phase. To the best of our knowledge, this is the first work that provides finite-time regret bounds for an adaptive minimum variance controller.
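The following sketch illustrates certainty-equivalence adaptive minimum variance control with probing inputs on a first-order ARX system $y_{t+1} = a y_t + b u_t + w_{t+1}$; the decaying probing schedule and the least-squares update are illustrative simplifications in the spirit of the description above, not the exact PIECE algorithm.

```python
# A minimal sketch of a CE adaptive minimum-variance controller with probing inputs.
import numpy as np

rng = np.random.default_rng(6)
a_true, b_true, T = 0.8, 1.0, 2000
a_hat, b_hat = 0.0, 1.0
y, regret = 0.0, 0.0
Phi, Y = [], []                                     # regressors (y_t, u_t) and targets y_{t+1}

for t in range(T):
    probe = rng.normal(scale=1.0 / np.sqrt(t + 1))  # decaying probing input for exploration
    u = -(a_hat / b_hat) * y + probe                # certainty-equivalence MV control
    w = rng.normal()
    y_next = a_true * y + b_true * u + w
    Phi.append([y, u]); Y.append(y_next)
    if t >= 5:                                      # least-squares parameter update
        theta = np.linalg.lstsq(np.array(Phi), np.array(Y), rcond=None)[0]
        a_hat = theta[0]
        b_hat = theta[1] if abs(theta[1]) > 0.1 else 0.1   # keep b_hat away from zero
    regret += y_next**2 - w**2                      # excess cost vs. the optimal controller
    y = y_next

print("cumulative regret:", regret)
```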
Sparse linear regression is a central problem in high-dimensional statistics. We study the correlated random design setting, where the covariates are drawn from a multivariate Gaussian $N(0,\Sigma)$, and we seek an estimator with small excess risk. If the true signal is $t$-sparse, information-theoretically it is possible to achieve strong recovery guarantees with only $O(t\log n)$ samples. However, computationally efficient algorithms have sample complexity linear in (some variant of) the condition number of $\Sigma$. Classical algorithms such as the Lasso can require significantly more samples than necessary even if there is only a single sparse approximate dependency among the covariates. We provide a polynomial-time algorithm that, given $\Sigma$, automatically adapts the Lasso to tolerate a small number of approximate dependencies. In particular, we achieve near-optimal sample complexity for constant sparsity when $\Sigma$ has few ``outlier'' eigenvalues. Our algorithm fits into a broader framework of feature adaptation for sparse linear regression with ill-conditioned covariates. Within this framework, we additionally provide the first polynomial-factor improvement over brute-force search for constant sparsity $t$ and arbitrary covariance $\Sigma$.
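The following sketch reproduces the setting only: covariates drawn from $N(0,\Sigma)$ with a single near-dependency (one tiny eigenvalue of $\Sigma$), a $t$-sparse signal, and a vanilla Lasso fit whose excess risk can then be examined; the feature-adaptation algorithm that repairs the Lasso in this regime is not implemented here.

```python
# A minimal sketch of the correlated design setting and a vanilla Lasso baseline.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(7)
n, d, t = 300, 50, 3

# build Sigma with one approximate linear dependency among the covariates
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
eigvals = np.ones(d); eigvals[0] = 1e-3            # one "outlier" eigenvalue
Sigma = Q @ np.diag(eigvals) @ Q.T

X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
beta = np.zeros(d); beta[:t] = 1.0                 # t-sparse signal
y = X @ beta + 0.1 * rng.normal(size=n)

lasso = Lasso(alpha=0.01).fit(X, y)
excess_risk = (lasso.coef_ - beta) @ Sigma @ (lasso.coef_ - beta)
print("excess risk of vanilla Lasso:", excess_risk)
```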
The angular measure on the unit sphere characterizes the first-order dependence structure of the components of a random vector in extreme regions and is defined in terms of standardized margins. Its statistical recovery is an important step in learning problems involving observations far away from the center. In the common situation that the components of the vector have different distributions, the rank transformation offers a convenient and robust way of standardizing the data in order to build an empirical version of the angular measure based on the most extreme observations. We provide a functional asymptotic expansion for the empirical angular measure in the bivariate case, based on the theory of weak convergence in the space of bounded functions. From the expansion, not only can the known asymptotic distribution of the empirical angular measure be recovered, but expansions and weak limits can also be derived for other statistics based on the associated empirical process or its quantile version.
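The following sketch shows the rank-based construction in the bivariate case: margins are standardized to the unit-Pareto scale via ranks, the $k$ observations with the largest radius are retained, and the empirical angular measure is summarized by the distribution of their angles; the toy data-generating mechanism and the choice of $k$ are illustrative.

```python
# A minimal sketch of the rank-based empirical angular measure (bivariate case).
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(8)
n, k = 5000, 200                               # sample size and number of extremes

# toy data with extremal dependence: common heavy-tailed factor plus multiplicative noise
Z = 1.0 / rng.uniform(size=n)
X = np.column_stack([Z * rng.uniform(0.5, 1.0, n), Z * rng.uniform(0.5, 1.0, n)])

# rank transform to (approximately) unit-Pareto margins
U = np.column_stack([n / (n + 1 - rankdata(X[:, j])) for j in range(2)])

R = U.sum(axis=1)                              # L1 radius
W = U[:, 0] / R                                # angle in [0, 1]
idx = np.argsort(R)[-k:]                       # the k most extreme observations

# empirical angular measure: distribution of the angles among the extremes
hist, edges = np.histogram(W[idx], bins=10, range=(0, 1), density=True)
print(np.round(hist, 2))
```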
We propose the first theoretical and methodological framework for Gaussian process regression subject to privacy constraints. The proposed method can be used when a data owner is unwilling to share a high-fidelity supervised learning model built from their data with the public due to privacy concerns. The key idea of the proposed method is to add synthetic noise to the data until the predictive variance of the Gaussian process model reaches a prespecified privacy level. The optimal covariance matrix of the synthetic noise is formulated as the solution of a semidefinite program. We also introduce a formulation of privacy-aware solutions under continuous privacy constraints using kernel-based approaches, and study their theoretical properties. The proposed method is illustrated by considering a model that tracks the trajectories of satellites.
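The following sketch conveys the core idea with scikit-learn: the observation noise added to the data is inflated until the Gaussian process predictive standard deviation exceeds a prescribed privacy level on a grid; a single homoscedastic noise level is increased here in place of the optimal noise covariance obtained from the semidefinite program.

```python
# A minimal sketch: inflate synthetic noise until the GP predictive std hits a floor.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(9)
X = rng.uniform(0, 10, size=(40, 1))
y = np.sin(X).ravel() + 0.1 * rng.normal(size=40)
grid = np.linspace(0, 10, 200).reshape(-1, 1)
privacy_level = 0.3                       # required predictive std everywhere on the grid

noise_var = 0.01
while True:
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0),
                                  alpha=noise_var, optimizer=None)
    gp.fit(X, y + np.sqrt(noise_var) * rng.normal(size=40))   # noisy release of the data
    _, std = gp.predict(grid, return_std=True)
    if std.min() >= privacy_level:
        break
    noise_var *= 1.5                      # crude homoscedastic search, not the SDP solution

print(f"noise variance needed: {noise_var:.3f}, min predictive std: {std.min():.3f}")
```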
There is increasing interest in modeling high-dimensional longitudinal outcomes in applications such as developmental neuroimaging research. The growth curve model offers a useful tool to capture both the mean growth pattern across individuals and the dynamic changes of outcomes over time within each individual. However, when the number of outcomes is large, it becomes challenging and often infeasible to tackle the large covariance matrix of the random effects involved in the model. In this article, we propose a high-dimensional response growth curve model with three novel components: a low-rank factor model structure that substantially reduces the number of parameters in the large covariance matrix, a re-parameterization formulation coupled with a sparsity penalty that selects important fixed and random effect terms, and a computational trick that turns the inversion of a large matrix into the inversion of a stack of small matrices and thus considerably speeds up the computation. We develop an efficient expectation-maximization type estimation algorithm and demonstrate the competitive performance of the proposed method through both simulations and a longitudinal study of brain structural connectivity in association with human immunodeficiency virus.
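The following sketch illustrates the kind of inversion trick referred to above in generic terms: with a low-rank factor structure, an $n \times n$ covariance of the form $\sigma^2 I + A A^\top$ with $A$ of rank $r$ can be inverted through a small $r \times r$ matrix via the Woodbury identity; the dimensions and the plain Woodbury formulation are illustrative rather than the authors' exact algorithm.

```python
# A minimal sketch: Woodbury identity turns a large inversion into a small one.
import numpy as np

rng = np.random.default_rng(10)
n, q, r, sigma2 = 2000, 50, 3, 0.5         # observations, random-effect dims, factor rank
Z = rng.normal(size=(n, q))                 # random-effect design
L = rng.normal(size=(q, r)) * 0.3           # low-rank factor loadings
A = Z @ L                                   # n x r

# Woodbury: (sigma2*I + A A^T)^{-1} = (I - A (sigma2*I_r + A^T A)^{-1} A^T) / sigma2
small = np.linalg.inv(sigma2 * np.eye(r) + A.T @ A)   # only an r x r inversion

def apply_inverse(v):
    return (v - A @ (small @ (A.T @ v))) / sigma2

v = rng.normal(size=n)
direct = np.linalg.solve(sigma2 * np.eye(n) + A @ A.T, v)   # large n x n solve
print(np.allclose(apply_inverse(v), direct))                # same result, far cheaper
```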