
This paper considers the estimation and inference of the low-rank components in high-dimensional matrix-variate factor models, where each dimension of the matrix-variates ($p \times q$) is comparable to or greater than the number of observations ($T$). We propose an estimation method called $\alpha$-PCA that preserves the matrix structure and aggregates the mean and contemporary covariance through a hyper-parameter $\alpha$. We develop an inferential theory, establishing consistency, rates of convergence, and limiting distributions under general conditions that allow for correlations across time, rows, or columns of the noise. We provide both theoretical and empirical guidance for choosing the best $\alpha$, depending on the use-case criteria. Simulation results demonstrate the adequacy of the asymptotic results in approximating finite-sample properties. The $\alpha$-PCA compares favorably with existing methods. Finally, we illustrate its applications with a real numerical data set and two real image data sets. In all applications, the proposed estimation procedure outperforms previous methods in terms of variance explained under out-of-sample 10-fold cross-validation.
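
A minimal numpy sketch of the aggregation idea is given below. It follows one plausible reading of the $\alpha$-PCA construction, combining the sample-mean outer product, weighted by $(1+\alpha)$, with the average contemporary covariance along each mode; the exact weighting and scaling conventions here are assumptions for illustration, not the paper's definitive estimator.

```python
import numpy as np

def alpha_pca(X, k_row, k_col, alpha=0.0):
    """Estimate row/column loading spaces of a matrix factor model.

    X     : array of shape (T, p, q), the observed matrix time series
    alpha : hyper-parameter weighting the sample mean against the
            contemporary covariance (alpha = -1 drops the mean term)
    """
    T, p, q = X.shape
    Xbar = X.mean(axis=0)                      # p x q sample mean
    Xc = X - Xbar                              # centered observations

    # Row-mode aggregation: (1 + alpha) * mean outer product plus the
    # average contemporary covariance, scaled by 1/(pq).
    M_row = ((1 + alpha) * Xbar @ Xbar.T
             + np.einsum('tij,tkj->ik', Xc, Xc) / T) / (p * q)
    # Column-mode analogue, using transposed slices.
    M_col = ((1 + alpha) * Xbar.T @ Xbar
             + np.einsum('tji,tjk->ik', Xc, Xc) / T) / (p * q)

    # Loadings: scaled leading eigenvectors of each aggregated matrix.
    w_r, V_r = np.linalg.eigh(M_row)
    w_c, V_c = np.linalg.eigh(M_col)
    R = np.sqrt(p) * V_r[:, np.argsort(w_r)[::-1][:k_row]]
    C = np.sqrt(q) * V_c[:, np.argsort(w_c)[::-1][:k_col]]

    # Estimated factor matrices follow by projection.
    F = np.einsum('ip,tpq,qj->tij', R.T, X, C) / (p * q)
    return R, C, F
```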

Related content

We propose the tensorizing flow method for estimating high-dimensional probability density functions from observed data. The method is based on the tensor-train format and flow-based generative modeling. It first efficiently constructs an approximate density in tensor-train form by solving for the tensor cores from a linear system built on kernel density estimates of low-dimensional marginals. We then train a continuous-time flow model from this tensor-train density to the observed empirical distribution by maximum likelihood estimation. The proposed method combines the optimization-free nature of the tensor-train format with the flexibility of flow-based generative models. Numerical results demonstrate the performance of the proposed method.
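
A small sketch of just the tensor-train evaluation step is shown below, for hypothetical random cores on a discretized grid; the core-solving from marginal kernel density estimates and the subsequent flow training are not reproduced here.

```python
import numpy as np

def tt_eval(cores, idx):
    """Evaluate a tensor-train representation at one grid point.

    cores : list of arrays, cores[k] has shape (r_{k-1}, n_k, r_k)
            with r_0 = r_d = 1
    idx   : tuple of grid indices (i_1, ..., i_d)
    """
    v = np.ones((1, 1))
    for G, i in zip(cores, idx):
        v = v @ G[:, i, :]          # chain of small matrix products
    return float(v[0, 0])

# Toy example: a rank-2 tensor train over a 3-dimensional 10-point grid.
rng = np.random.default_rng(0)
ranks = [1, 2, 2, 1]
cores = [rng.standard_normal((ranks[k], 10, ranks[k + 1])) for k in range(3)]
print(tt_eval(cores, (3, 7, 1)))
```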

A factor copula model is proposed in which factors are either simulable or estimable from exogenous information. Point estimation and inference are based on a simulated method of moments (SMM) approach with non-overlapping simulation draws. Consistency and limiting normality of the estimator are established, and the validity of bootstrap standard errors is shown. In doing so, previous results from the literature are verified under low-level conditions imposed on the individual components of the factor structure. Monte Carlo evidence confirms the accuracy of the asymptotic theory in finite samples, and an empirical application illustrates the usefulness of the model for explaining the cross-sectional dependence between stock returns.
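
As a toy illustration of the SMM idea, the sketch below matches the empirical average pairwise Spearman's rho of a one-factor Gaussian copula by simulation with common random numbers; the paper's model, moment set, and non-overlapping draw scheme are richer than this.

```python
import numpy as np
from scipy.stats import spearmanr
from scipy.optimize import minimize_scalar

def smm_estimate(U, n_sim=20000, seed=1):
    """Estimate the loading of a one-factor Gaussian copula
    Z_i = lam * F + sqrt(1 - lam^2) * eps_i by matching the average
    pairwise Spearman's rho.  Assumes d >= 3 columns in U."""
    d = U.shape[1]
    iu = np.triu_indices(d, 1)
    target = spearmanr(U)[0][iu].mean()        # empirical moment

    rng = np.random.default_rng(seed)
    F = rng.standard_normal((n_sim, 1))        # common random numbers,
    E = rng.standard_normal((n_sim, d))        # reused for every lam

    def sim_rho(lam):
        Z = lam * F + np.sqrt(1.0 - lam**2) * E
        return spearmanr(Z)[0][iu].mean()      # simulated moment

    obj = lambda lam: (sim_rho(lam) - target) ** 2
    return minimize_scalar(obj, bounds=(0.01, 0.99), method='bounded').x
```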

Methods for identifying cause-effect relationships currently mostly assume that the variables are scalar random variables. In many fields, however, the objects of interest are vectors or groups of scalar variables. We present a new constraint-based, non-parametric approach for inferring the causal relationship between two vector-valued random variables from observational data. Our method employs sparsity estimates of directed and undirected graphs and is based on two new principles for groupwise causal reasoning that we justify theoretically in Pearl's graphical-model-based causality framework. These theoretical considerations are complemented by two new causal discovery algorithms for interactions between two random vectors, which reliably find the correct causal direction in simulations even when the interactions are nonlinear. We evaluate our methods empirically and compare them to other state-of-the-art techniques.
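
The sketch below is not the paper's algorithm but a toy heuristic in a similar spirit: it compares the sparsity of penalized cross-regressions in both directions and prefers the sparser one. The helper names and the lasso choice are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def cross_graph_size(A, B):
    """Number of nonzero lasso coefficients when regressing each
    coordinate of B on the full vector A."""
    nonzero = 0
    for j in range(B.shape[1]):
        fit = LassoCV(cv=5).fit(A, B[:, j])
        nonzero += int(np.sum(fit.coef_ != 0))
    return nonzero

def infer_direction(X, Y):
    """Toy heuristic (NOT the paper's algorithm): prefer the direction
    whose cross-regression graph is sparser."""
    sx, sy = cross_graph_size(X, Y), cross_graph_size(Y, X)
    if sx < sy:
        return 'X -> Y'
    if sy < sx:
        return 'Y -> X'
    return 'undecided'
```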

This paper studies the high-dimensional quantile regression problem under the transfer learning framework, where possibly related source datasets are available to improve estimation or prediction relative to using the target data alone. In the oracle case with known transferable sources, a smoothed two-step transfer learning algorithm based on convolution smoothing is proposed, and $\ell_1$/$\ell_2$ estimation error bounds for the corresponding estimator are established. To avoid including non-informative sources, we propose a clustering-based algorithm that selects the transferable sources adaptively and establish its selection consistency under regularity conditions; we also provide an alternative model averaging procedure, for which optimality of the excess risk is proved. Monte Carlo simulations as well as an empirical analysis of gene expression data demonstrate the effectiveness of the proposed procedure.
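
A minimal sketch of the oracle two-step idea appears below, with scikit-learn's QuantileRegressor (pinball loss plus an $\ell_1$ penalty) standing in for the paper's convolution-smoothed solver; the smoothing and source-selection steps are not reproduced.

```python
import numpy as np
from sklearn.linear_model import QuantileRegressor

def two_step_transfer_qr(X_src, y_src, X_tgt, y_tgt, tau=0.5,
                         lam_w=1e-3, lam_d=1e-2):
    """Oracle two-step transfer estimator for quantile regression.

    Step 1: a pooled fit on sources + target gives a pilot w_hat.
    Step 2: a sparse contrast delta is fit on the target alone,
            so beta_hat = w_hat + delta_hat.
    """
    X_pool = np.vstack([X_src, X_tgt])
    y_pool = np.concatenate([y_src, y_tgt])
    step1 = QuantileRegressor(quantile=tau, alpha=lam_w,
                              fit_intercept=False).fit(X_pool, y_pool)
    w_hat = step1.coef_
    # Debias on the target: minimize the pinball loss of
    # y - X (w_hat + delta) in delta, with an L1 penalty.
    resid = y_tgt - X_tgt @ w_hat
    step2 = QuantileRegressor(quantile=tau, alpha=lam_d,
                              fit_intercept=False).fit(X_tgt, resid)
    return w_hat + step2.coef_
```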

The design of experiments involves an inescapable compromise between covariate balance and robustness. This paper formalizes this trade-off and introduces an experimental design that allows experimenters to navigate it. The design is specified by a robustness parameter that bounds the worst-case mean squared error of an estimator of the average treatment effect. Subject to the experimenter's desired level of robustness, the design aims to simultaneously balance all linear functions of potentially many covariates. The achieved level of balance is better than was previously known to be possible, and considerably better than what a fully random assignment would produce. We show that the mean squared error of the estimator is bounded by the minimum of the loss function of an implicit ridge regression of the potential outcomes on the covariates. The estimator does not itself conduct covariate adjustment, so one can interpret the approach as regression adjustment by design. Finally, we provide both a central limit theorem and non-asymptotic tail bounds for the estimator, which facilitate the construction of confidence intervals.
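
As a simple illustration of the balance-robustness knob, the sketch below uses classical rerandomization (a different, well-known design, not the one proposed in the paper): tightening the acceptance threshold buys better covariate balance at the cost of less randomness in the assignment.

```python
import numpy as np

def rerandomized_assignment(X, threshold, rng):
    """Classical rerandomization: redraw a fully random +/- 1 assignment
    until the worst linear covariate imbalance falls below the threshold.
    threshold = inf recovers a fully random assignment."""
    n = X.shape[0]
    while True:
        z = rng.choice([-1.0, 1.0], size=n)
        imbalance = np.abs(X.T @ z).max() / n
        if imbalance <= threshold:
            return z

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
for thr in [np.inf, 0.10, 0.05]:
    z = rerandomized_assignment(X, thr, rng)
    print(thr, np.abs(X.T @ z).max() / 200)
```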

In various applications, we deal with high-dimensional positive-valued data that often exhibit sparsity. This paper develops a new class of continuous global-local shrinkage priors tailored to analyzing gamma-distributed observations in which most of the underlying means are concentrated around a certain value. Unlike existing shrinkage priors, the new prior is a shape-scale mixture of inverse-gamma distributions, which admits a desirable interpretation of the posterior mean and allows flexible shrinkage. We show that the proposed prior has two desirable theoretical properties: Kullback-Leibler super-efficiency under sparsity and robust shrinkage rules for large observations. We propose an efficient sampling algorithm for posterior inference. The performance of the proposed method is illustrated through simulation and two real data examples: the average length of hospital stay for COVID-19 in South Korea and adaptive variance estimation of gene expression data.
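
The conjugate building block behind such priors can be sketched as follows: with a known-shape gamma likelihood and an inverse-gamma prior on the scale, the posterior mean takes an interpretable shrinkage form. The full shape-scale mixture places hyperpriors on the inverse-gamma parameters, which this toy example does not reproduce.

```python
import numpy as np
from scipy.stats import gamma

def posterior_mean_scale(y, m, a, b):
    """Conjugate update: y ~ Gamma(shape=m, scale=s) with s ~ InvGamma(a, b)
    gives s | y ~ InvGamma(a + m, b + y), whose mean is a shrinkage rule
    pulling the crude estimate y/m toward the prior center b/(a-1)."""
    return (b + y) / (a + m - 1.0)

# Sparse setting: most true scales near 0.1, a few signals at 5.0.
rng = np.random.default_rng(0)
s_true = np.where(rng.random(1000) < 0.95, 0.1, 5.0)
m = 2.0
y = gamma.rvs(a=m, scale=s_true, random_state=rng)

a, b = 3.0, 0.2          # prior concentrated near b/(a-1) = 0.1
s_hat = posterior_mean_scale(y, m, a, b)
print(np.mean((s_hat - s_true)**2), np.mean((y / m - s_true)**2))
```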

In this paper, we investigate a random subsampling method for the tensor least squares problem with respect to the popular t-product. From the optimization perspective, we present probabilistic error bounds for the residual and for the solution obtained by the proposed method. From the statistical perspective, we derive expressions for the conditional and unconditional expectations and variances of the solution, where the unconditional ones account for the model noise. Moreover, based on the unconditional variance, an optimal subsampling probability distribution is derived. Finally, the feasibility and effectiveness of the proposed method, and the correctness of the theoretical results, are verified by numerical experiments.
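
A generic sketch of the idea is given below: the t-product least squares problem is solved slice-wise in the Fourier domain, and rows (horizontal slices) are subsampled with norm-proportional probabilities. The probabilities here are a standard placeholder, not the paper's optimal distribution.

```python
import numpy as np

def t_lstsq(A, B):
    """Solve min_X ||A * X - B||_F under the t-product: FFT along the
    third mode, solve an ordinary least squares per frontal slice,
    then invert the FFT."""
    Ah, Bh = np.fft.fft(A, axis=2), np.fft.fft(B, axis=2)
    Xh = np.stack([np.linalg.lstsq(Ah[:, :, k], Bh[:, :, k], rcond=None)[0]
                   for k in range(A.shape[2])], axis=2)
    return np.real(np.fft.ifft(Xh, axis=2))

def subsampled_t_lstsq(A, B, m, rng):
    """Subsample m horizontal slices with norm-proportional probabilities
    and solve the reweighted reduced problem."""
    probs = np.sum(np.abs(A)**2, axis=(1, 2))
    probs /= probs.sum()
    idx = rng.choice(A.shape[0], size=m, replace=True, p=probs)
    w = 1.0 / np.sqrt(m * probs[idx])          # inverse-probability weights
    return t_lstsq(A[idx] * w[:, None, None], B[idx] * w[:, None, None])
```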

To date, the comparison of Statistical Shape Models (SSMs) is often solely performance-based and carried out by means of simplistic metrics such as compactness, generalization, or specificity. Any similarities or differences between the actual shape spaces can neither be visualized nor quantified. In this paper, we present a first method to compare two SSMs in dense correspondence by computing approximate intersection spaces and set-theoretic differences between the affine vector spaces spanned by the models. To this end, we approximate the distribution of shapes lying in the intersection space using Markov Chain Monte Carlo, and then apply Principal Component Analysis (PCA) to its samples. By representing the resulting spaces again as an SSM, our method enables an easy and intuitive analysis of similarities between two models' shape spaces. We estimate differences between SSMs in a similar manner; here, however, the resulting shape spaces are no longer linear vector spaces, so we do not apply PCA but instead use the posterior samples for visualization. We showcase the proposed algorithm qualitatively by computing and analyzing intersection spaces and differences between publicly available face models, focusing on gender-specific (male and female) models as well as identity and expression models. Our quantitative evaluation based on SSMs built from synthetic and real-world data sets provides detailed evidence that the introduced method is able to recover ground-truth intersection spaces and differences. Finally, we demonstrate that the proposed algorithm can be easily adapted to also compute intersections and differences between color spaces.
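
For the linear-subspace part of this pipeline, a small sketch is shown below: directions shared by two shape spaces can be approximated from the singular value decomposition of the product of their orthonormal bases, keeping directions whose principal angles are near zero. The affine offsets and the MCMC sampling step are not covered here.

```python
import numpy as np

def approx_intersection(U, V, tol=0.99):
    """Approximate the intersection of two linear shape spaces spanned
    by orthonormal bases U (d x k1) and V (d x k2): keep directions
    whose principal-angle cosine exceeds tol."""
    W, s, _ = np.linalg.svd(U.T @ V, full_matrices=False)
    keep = s > tol                 # cosines of the principal angles
    return U @ W[:, keep]          # basis of the (near-)shared span
```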

We consider the problem of testing whether a single coefficient is equal to zero in high-dimensional fixed-design linear models. In the high-dimensional setting where the dimension of covariates $p$ is allowed to be of the same order of magnitude as the sample size $n$, to achieve finite-population validity, existing methods usually require strong distributional assumptions on the noise vector (such as Gaussianity or rotational invariance), which limits their applicability in practice. In this paper, we propose a new method, called the \emph{residual permutation test} (RPT), which is constructed by projecting the regression residuals onto the space orthogonal to the union of the column spaces of the original and permuted design matrices. RPT provably achieves finite-population size validity under fixed design with only exchangeable noise, whenever $p < n / 2$. Moreover, RPT is shown to be asymptotically powerful for heavy-tailed noise with bounded $(1+t)$-th moment when the true coefficient is at least of order $n^{-t/(1+t)}$ for $t \in [0,1]$. We further prove that this signal-size requirement is essentially optimal in the minimax sense. Numerical studies confirm that RPT performs well in a wide range of simulation settings with normal and heavy-tailed noise distributions.
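
The sketch below implements the classical Freedman-Lane residual permutation scheme, a related low-dimensional method rather than the paper's RPT construction, to illustrate how permuting residuals yields a permutation p-value for a single coefficient.

```python
import numpy as np

def freedman_lane_test(X, y, j, n_perm=999, rng=None):
    """Freedman-Lane residual permutation test of H0: beta_j = 0.
    Residuals from the null model (without column j) are permuted and
    the full-model coefficient for column j is recomputed each time."""
    rng = rng or np.random.default_rng()
    Z = np.delete(X, j, axis=1)                      # null design
    g = np.linalg.lstsq(Z, y, rcond=None)[0]
    fitted, resid = Z @ g, y - Z @ g

    def coef_j(yy):
        return np.linalg.lstsq(X, yy, rcond=None)[0][j]

    t_obs = abs(coef_j(y))
    t_perm = np.array([abs(coef_j(fitted + rng.permutation(resid)))
                       for _ in range(n_perm)])
    return (1 + np.sum(t_perm >= t_obs)) / (n_perm + 1)
```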

Understanding the oscillating behaviors that govern organisms' internal biological processes requires interdisciplinary efforts combining biological and computer experiments, as the latter can complement the former by simulating perturbed conditions at higher resolution. Harmonizing the two types of experiments, however, poses significant statistical challenges due to identifiability issues, numerical instability, and poor behavior in high dimensions. This article devises a new Bayesian calibration framework for oscillating biochemical models. The proposed Bayesian model is estimated with an advanced Markov chain Monte Carlo (MCMC) technique that can efficiently infer the parameter values matching the simulated and observed oscillatory processes. Also proposed is an approach to sensitivity analysis based on the intervention posterior, which measures the influence of individual parameters on the target process by using the obtained MCMC samples as a computational tool. The proposed framework is illustrated with circadian oscillations observed in the filamentous fungus Neurospora crassa.
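
A minimal stand-in for such a calibration loop is sketched below: a random-walk Metropolis sampler matches a toy sinusoidal simulator to noisy observations under a Gaussian likelihood. The simulator, the implicitly flat priors, and the step size are illustrative assumptions, far simpler than the paper's scheme.

```python
import numpy as np

def simulator(theta, t):
    """Toy oscillator: amplitude, angular frequency, phase."""
    amp, omega, phi = theta
    return amp * np.sin(omega * t + phi)

def mh_calibrate(t, y_obs, sigma=0.3, n_iter=20000, seed=0):
    """Random-walk Metropolis calibration of the oscillator parameters
    under a Gaussian observation model."""
    rng = np.random.default_rng(seed)
    theta = np.array([1.0, 1.0, 0.0])             # initial guess

    def loglik(th):
        return -0.5 * np.sum((y_obs - simulator(th, t))**2) / sigma**2

    ll, samples = loglik(theta), []
    for _ in range(n_iter):
        prop = theta + 0.05 * rng.standard_normal(3)
        ll_prop = loglik(prop)
        if np.log(rng.random()) < ll_prop - ll:   # accept/reject step
            theta, ll = prop, ll_prop
        samples.append(theta.copy())
    return np.array(samples[n_iter // 2:])        # drop burn-in

# Synthetic data with true parameters (2.0, 1.5, 0.4).
t = np.linspace(0, 10, 200)
rng = np.random.default_rng(1)
y = simulator((2.0, 1.5, 0.4), t) + 0.3 * rng.standard_normal(t.size)
print(mh_calibrate(t, y).mean(axis=0))
```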
