
This paper studies the impact of the bootstrap procedure on the eigenvalue distributions of the sample covariance matrix under a high-dimensional factor structure. We provide asymptotic distributions for the top eigenvalues of the bootstrapped sample covariance matrix under mild conditions. After bootstrapping, the spiked eigenvalues, which are driven by common factors, converge weakly to Gaussian limits after proper scaling and centering. However, the largest non-spiked eigenvalue is mainly determined by the order statistics of the bootstrap resampling weights and follows an extreme value distribution. Based on the disparate behavior of the spiked and non-spiked eigenvalues, we propose innovative methods to test the number of common factors. According to the simulations and a real data example, the proposed methods are the only ones performing reliably and convincingly in the presence of both weak factors and cross-sectionally correlated errors. Our technical details contribute to random matrix theory on the spiked covariance model with convexly decaying density and unbounded support, or with general elliptical distributions.
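
As a rough illustration of the setting described in this abstract (not the paper's testing procedure), the sketch below simulates a one-factor model, resamples it with multinomial bootstrap weights, and records the top eigenvalues of the weighted sample covariance matrix. The function name `bootstrap_top_eigenvalues`, the choice of multinomial weights, the simplified centering, and all dimensions are assumptions made for illustration.

```python
import numpy as np

def bootstrap_top_eigenvalues(X, n_boot=100, k=3, seed=0):
    """Return an (n_boot, k) array of the k largest eigenvalues of the
    bootstrapped sample covariance matrix of X (rows are observations)."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    # Simplification (assumption): center once with the original sample mean.
    Xc = X - X.mean(axis=0)
    out = np.empty((n_boot, k))
    for b in range(n_boot):
        # Multinomial counts reproduce resampling n rows with replacement.
        w = rng.multinomial(n, np.full(n, 1.0 / n))
        S = (Xc * w[:, None]).T @ Xc / n
        # eigvalsh returns eigenvalues in ascending order; keep the top k.
        out[b] = np.linalg.eigvalsh(S)[::-1][:k]
    return out

# Hypothetical one-factor model: X = 3 * f * loadings + noise.
rng = np.random.default_rng(1)
n, p = 200, 50
factor = rng.standard_normal((n, 1))
loadings = rng.standard_normal((1, p))
X = 3.0 * factor @ loadings + rng.standard_normal((n, p))

eigs = bootstrap_top_eigenvalues(X, n_boot=100, k=3)
print(eigs.mean(axis=0))  # first entry is the spiked eigenvalue
```

In this toy example the first column (the spiked eigenvalue) is far larger than the second and third, which come from the noise part and fluctuate with the largest resampling weights.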

Related content

The method of instrumental variables provides a fundamental and practical tool for causal inference in many empirical studies where unmeasured confounding between the treatments and the outcome is present. Modern data such as the genetical genomics data from these studies are often high-dimensional. High-dimensional linear instrumental-variables regression has been considered in the literature owing to its simplicity, although the true relationship may be nonlinear. We propose a more data-driven approach by considering nonparametric additive models between the instruments and the treatments while keeping a linear model between the treatments and the outcome, so that the coefficients therein can directly bear causal interpretation. We provide a two-stage framework for estimation and inference under this more general setup. The group lasso regularization is first employed to select optimal instruments from the high-dimensional additive models, and the outcome variable is then regressed on the fitted values from the additive models to identify and estimate important treatment effects. We provide a non-asymptotic analysis of the estimation error of the proposed estimator. A debiasing procedure is further employed to yield valid inference. Extensive numerical experiments show that our method can rival or outperform existing approaches in the literature. We finally analyze the mouse obesity data and discuss new findings from our method.

Bayesian nonparametric mixture models are common for modeling complex data. While these models are well-suited for density estimation, their application for clustering has some limitations. Miller and Harrison (2014) proved posterior inconsistency in the number of clusters when the true number of clusters is finite for Dirichlet process and Pitman--Yor process mixture models. In this work, we extend this result to additional Bayesian nonparametric priors such as Gibbs-type processes and finite-dimensional representations of them. The latter include the Dirichlet multinomial process and the recently proposed Pitman--Yor and normalized generalized gamma multinomial processes. We show that mixture models based on these processes are also inconsistent in the number of clusters and discuss possible solutions. Notably, we show that a post-processing algorithm introduced by Guha et al. (2021) for the Dirichlet process extends to more general models and provides a consistent method to estimate the number of components.

We propose a covariance stationarity test for an otherwise dependent and possibly globally non-stationary time series. We work in the new setting of Jin, Wang and Wang (2015), who exploit Walsh (1923) functions (global square waves) in order to compare sub-sample covariances with the full sample counterpart. They impose strict stationarity under the null, only consider linear processes under either hypothesis, and exploit linearity in order to achieve a parametric estimator for an inverted high dimensional asymptotic covariance matrix. Conversely, we allow for linear or nonlinear processes with possibly non-iid innovations. This is important in macroeconomics and finance where nonlinear feedback and random volatility occur in many settings. We completely sidestep asymptotic covariance matrix estimation and inversion by bootstrapping a max-correlation difference statistic, where the maximum is taken over the correlation lag h and Walsh function generated sub-sample counter k (the number of systematic samples). We achieve a higher feasible rate of increase for the maximum lag and counter H and K, and in the supplemental material we present a data driven method for selecting H and K. Of particular note, our test is capable of detecting breaks in variance, and distant, or very mild, deviations from stationarity.

We study the problem of covering and learning sums $X = X_1 + \cdots + X_n$ of independent integer-valued random variables $X_i$ (SIIRVs) with unbounded, or even infinite, support. De et al. at FOCS 2018, showed that the maximum value of the collective support of $X_i$'s necessarily appears in the sample complexity of learning $X$. In this work, we address two questions: (i) Are there general families of SIIRVs with unbounded support that can be learned with sample complexity independent of both $n$ and the maximal element of the support? (ii) Are there general families of SIIRVs with unbounded support that admit proper sparse covers in total variation distance? As for question (i), we provide a set of simple conditions that allow the unbounded SIIRV to be learned with complexity $\text{poly}(1/\epsilon)$ bypassing the aforementioned lower bound. We further address question (ii) in the general setting where each variable $X_i$ has unimodal probability mass function and is a different member of some, possibly multi-parameter, exponential family $\mathcal{E}$ that satisfies some structural properties. These properties allow $\mathcal{E}$ to contain heavy tailed and non log-concave distributions. Moreover, we show that for every $\epsilon > 0$, and every $k$-parameter family $\mathcal{E}$ that satisfies some structural assumptions, there exists an algorithm with $\tilde{O}(k) \cdot \text{poly}(1/\epsilon)$ samples that learns a sum of $n$ arbitrary members of $\mathcal{E}$ within $\epsilon$ in TV distance. The output of the learning algorithm is also a sum of random variables whose distribution lies in the family $\mathcal{E}$. En route, we prove that any discrete unimodal exponential family with bounded constant-degree central moments can be approximated by the family corresponding to a bounded subset of the initial (unbounded) parameter space.

The discrete distribution of the length of longest increasing subsequences in random permutations of order $n$ is deeply related to random matrix theory. In a seminal work, Baik, Deift and Johansson provided an asymptotic description in terms of the distribution of the largest level of the large matrix limit of GUE. As a numerical approximation, however, this asymptotic result is inaccurate for small lengths and has a slow convergence rate, conjectured to be just of order $n^{-1/3}$. Here, we suggest a different type of approximation, based on Hayman's generalization of Stirling's formula. Such a formula already gives a couple of correct digits of the length distribution for $n$ as small as $20$ and also allows numerical evaluations, with a uniform error of apparent order $n^{-3/4}$, for $n$ as large as $10^{12}$, thus closing the gap between a table of exact values (that has recently been compiled for up to $n=1000$) and the random matrix limit. Being much more efficient and accurate than Monte-Carlo simulations for larger $n$, the Stirling-type formula allows for a precise numerical understanding of the first few finite size correction terms to the random matrix limit, a study that has recently been initiated by Forrester and Mays, who visualized the form of the first such term. We also display the second one, of order $n^{-2/3}$, and derive (heuristically) expansions of the expected value and variance of the length, exhibiting several more terms than previously known.
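
For orientation, the quantity studied here can be estimated with the Monte-Carlo baseline that the abstract compares against. The sketch below is not the Stirling-type formula of the paper; it computes the length of a longest strictly increasing subsequence by patience sorting and tabulates its empirical distribution over random permutations. The function names `lis_length` and `simulate_lis` are hypothetical.

```python
import bisect
import random

def lis_length(seq):
    """Length of a longest strictly increasing subsequence (patience sorting)."""
    tails = []  # tails[i] = smallest possible tail of an increasing run of length i + 1
    for x in seq:
        i = bisect.bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)

def simulate_lis(n, trials=1000, seed=0):
    """Empirical distribution of the LIS length over random permutations of order n."""
    rng = random.Random(seed)
    perm = list(range(n))
    counts = {}
    for _ in range(trials):
        rng.shuffle(perm)
        length = lis_length(perm)
        counts[length] = counts.get(length, 0) + 1
    return {length: c / trials for length, c in sorted(counts.items())}

dist = simulate_lis(20, trials=200)
print(dist)
```

Each evaluation of `lis_length` costs $O(n \log n)$, so simulation becomes expensive for large $n$, which is the regime where the abstract reports its formula to be preferable.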

In high-dimensional classification problems, a commonly used approach is to first project the high-dimensional features into a lower dimensional space, and base the classification on the resulting lower dimensional projections. In this paper, we formulate a latent-variable model with a hidden low-dimensional structure to justify this two-step procedure and to guide which projection to choose. We propose a computationally efficient classifier that takes certain principal components (PCs) of the observed features as projections, with the number of retained PCs selected in a data-driven way. A general theory is established for analyzing such two-step classifiers based on any projections. We derive explicit rates of convergence of the excess risk of the proposed PC-based classifier. The obtained rates are further shown to be optimal up to logarithmic factors in the minimax sense. Our theory allows the lower-dimension to grow with the sample size and is also valid even when the feature dimension (greatly) exceeds the sample size. Extensive simulations corroborate our theoretical findings. The proposed method also performs favorably relative to other existing discriminant methods on three real data examples.

In randomized experiments and observational studies, weighting methods are often used to generalize and transport treatment effect estimates to a target population. Traditional methods construct the weights by separately modeling the treatment assignment and study selection probabilities and then multiplying functions (e.g., inverses) of their estimates. However, these estimated multiplicative weights may not produce adequate covariate balance and can be highly variable, resulting in biased and unstable estimators, especially when there is limited covariate overlap across populations or treatment groups. To address these limitations, we propose a general weighting approach that weights each treatment group towards the target population in a single step. We present a framework and provide a justification for this one-step approach in terms of generic probability distributions. We show a formal connection between our method and inverse probability and inverse odds weighting. By construction, the proposed approach balances covariates and produces stable estimators. We show that our estimator for the target average treatment effect is consistent, asymptotically Normal, multiply robust, and semiparametrically efficient. We demonstrate the performance of this approach using a simulation study and a randomized case study on the effects of physician racial diversity on preventive healthcare utilization among Black men in California.

We study the deviation inequality for a sum of high-dimensional random matrices and operators with dependence and arbitrary heavy tails. Estimating high-dimensional matrices is increasingly important, and dependence and heavy tails in data are among the most critical current topics. In this paper, we derive a dimension-free upper bound on the deviation, that is, the bound does not depend explicitly on the dimension of the matrices, but depends on their effective rank. Our result is a generalization of several existing studies on the deviation of the sum of matrices. Our proof is based on two techniques: (i) a variational approximation of the dual of moment generating functions, and (ii) robustification through truncation of eigenvalues of matrices. We show that our results are applicable to several problems such as covariance matrix estimation, hidden Markov models, and overparameterized linear regression models.

We prove a new generalization bound that shows for any class of linear predictors in Gaussian space, the Rademacher complexity of the class and the training error under any continuous loss $\ell$ can control the test error under all Moreau envelopes of the loss $\ell$. We use our finite-sample bound to directly recover the "optimistic rate" of Zhou et al. (2021) for linear regression with the square loss, which is known to be tight for minimal $\ell_2$-norm interpolation, but we also handle more general settings where the label is generated by a potentially misspecified multi-index model. The same argument can analyze noisy interpolation of max-margin classifiers through the squared hinge loss, and establishes consistency results in spiked-covariance settings. More generally, when the loss is only assumed to be Lipschitz, our bound effectively improves Talagrand's well-known contraction lemma by a factor of two, and we prove uniform convergence of interpolators (Koehler et al. 2021) for all smooth, non-negative losses. Finally, we show that application of our generalization bound using localized Gaussian width will generally be sharp for empirical risk minimizers, establishing a non-asymptotic Moreau envelope theory for generalization that applies outside of proportional scaling regimes, handles model misspecification, and complements existing asymptotic Moreau envelope theories for M-estimation.
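
For readers unfamiliar with the term, a standard convex-analysis definition of the Moreau envelope is recalled below. The parameterization by $\lambda > 0$ is an assumption; the paper's own normalization may differ.

$$
\ell_\lambda(x) = \inf_{u \in \mathbb{R}} \left\{ \ell(u) + \frac{1}{2\lambda}(u - x)^2 \right\}.
$$

For example, with $\ell(u) = |u|$ the envelope is the Huber loss:

$$
\ell_\lambda(x) =
\begin{cases}
\dfrac{x^2}{2\lambda}, & |x| \le \lambda, \\
|x| - \dfrac{\lambda}{2}, & |x| > \lambda,
\end{cases}
$$

so the envelope is a smoothed lower bound on the original loss that approaches $\ell$ as $\lambda \to 0$.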

One of the most important features of financial time series data is volatility. There are often structural changes in volatility over time, and an accurate estimation of the volatility of financial time series requires careful identification of change-points. A common approach to modeling the volatility of time series data is the well-known GARCH model. Although the problem of change-point estimation of volatility dynamics derived from the GARCH model has been considered in the literature, these approaches rely on parametric assumptions of the conditional error distribution, which are often violated in financial time series. This may lead to inaccuracies in change-point detection resulting in unreliable GARCH volatility estimates. This paper introduces a novel change-point detection algorithm based on a semiparametric GARCH model. The proposed method retains the structural advantages of the GARCH process while incorporating the flexibility of nonparametric conditional error distribution. The approach utilizes a penalized likelihood derived from a semiparametric GARCH model and an efficient binary segmentation algorithm. The results show that in terms of change-point estimation and detection accuracy, the semiparametric method outperforms the commonly used Quasi-MLE (QMLE) and other variations of GARCH models in wide-ranging scenarios.
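
The binary segmentation step mentioned in this abstract can be sketched on a much simpler model. The code below is not the semiparametric GARCH method of the paper: it swaps in a zero-mean Gaussian likelihood with piecewise-constant variance, and the function names, the penalty value, and the minimum segment length are all assumptions made for illustration.

```python
import numpy as np

def best_variance_split(x, min_seg=20):
    """Best single variance change-point under a zero-mean Gaussian model.
    Returns (split index, likelihood-ratio gain), or (None, 0.0) if x is too short."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2 * min_seg:
        return None, 0.0
    csum = np.cumsum(x ** 2)
    total = csum[-1]
    k = np.arange(min_seg, n - min_seg + 1)   # candidate split positions
    left = csum[k - 1] / k                    # variance estimate of x[:k]
    right = (total - csum[k - 1]) / (n - k)   # variance estimate of x[k:]
    gain = n * np.log(total / n) - k * np.log(left) - (n - k) * np.log(right)
    j = int(np.argmax(gain))
    return int(k[j]), float(gain[j])

def binary_segmentation(x, penalty, min_seg=20, offset=0):
    """Recursively split while the gain exceeds the penalty; returns sorted change-points."""
    x = np.asarray(x, dtype=float)
    k, gain = best_variance_split(x, min_seg)
    if k is None or gain <= penalty:
        return []
    return (binary_segmentation(x[:k], penalty, min_seg, offset)
            + [offset + k]
            + binary_segmentation(x[k:], penalty, min_seg, offset + k))

# Synthetic series whose standard deviation jumps from 1 to 3 at index 300.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 300), rng.normal(0, 3, 300)])
change_points = binary_segmentation(x, penalty=3 * np.log(x.size))
print(change_points)
```

A larger penalty yields fewer detected change-points. In the paper's setting the Gaussian gain would be replaced by a penalized likelihood derived from the semiparametric GARCH model.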
