
In this paper we derive sharp lower and upper bounds for the covariance of two bounded random variables when knowledge about their expected values, variances, or both is available. When only the expected values are known, our result can be viewed as an extension of the Bhatia-Davis inequality for variances. We also provide a number of different ways to standardize covariance. For a pair of binary random variables, one of these standardized measures of covariation agrees with a frequently used measure of dependence between genetic variants.
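
As a quick numerical illustration of the classical inequality being extended, the following sketch (Python with NumPy assumed; all names are illustrative) checks the Bhatia-Davis bound on simulated bounded data and combines two such bounds via Cauchy-Schwarz into a crude covariance bound. The paper's bounds are sharper than this simple consequence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Bhatia-Davis: for X supported on [m, M] with mean mu,
# Var(X) <= (M - mu) * (mu - m).
m, M = 0.0, 1.0
x = rng.beta(2.0, 5.0, size=100_000)  # bounded on [0, 1]
mu, var = x.mean(), x.var()
bd_x = (M - mu) * (mu - m)
print(f"Var(X) = {var:.4f} <= {bd_x:.4f}")

# Cauchy-Schwarz turns two such bounds into a (non-sharp) covariance bound
# when only the means of X and Y are known:
y = np.clip(0.5 * x + 0.1 * rng.standard_normal(x.size), 0.0, 1.0)
mu_y = y.mean()
bd_y = (M - mu_y) * (mu_y - m)
cov = np.cov(x, y)[0, 1]
print(f"|Cov(X, Y)| = {abs(cov):.4f} <= {np.sqrt(bd_x * bd_y):.4f}")
```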

Related content

Many numerical methods for evaluating matrix functions can be naturally viewed as computational graphs. Rephrasing these methods as directed acyclic graphs (DAGs) is a particularly effective approach to study existing techniques, improve them, and eventually derive new ones. The accuracy of these matrix techniques can be characterized by the accuracy of their scalar counterparts; designing algorithms for matrix functions can therefore be regarded as a scalar-valued optimization problem. The derivatives needed during the optimization can be calculated automatically by exploiting the structure of the DAG, in a fashion analogous to backpropagation. This paper describes GraphMatFun.jl, a Julia package that offers the means to generate and manipulate computational graphs, optimize their coefficients, and generate Julia, MATLAB, and C code to evaluate them efficiently at a matrix argument. The software also provides tools to estimate the accuracy of a graph-based algorithm and thus obtain numerically reliable methods. For the exponential, for example, using a particular form (degree-optimal) of polynomials produces implementations that in many cases are cheaper, in terms of computational cost, than the Padé-based techniques typically used in mathematical software. The optimized graphs and the corresponding generated code are available online.
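
The following is a minimal sketch of the underlying idea, not the GraphMatFun.jl API: a scalar evaluation scheme (here a plain Taylor polynomial for exp, evaluated with Horner's rule) lifts directly to a matrix argument, and each step corresponds to a node of the evaluation DAG. Python with NumPy/SciPy is assumed; the package's degree-optimal polynomials would replace the Taylor coefficients below.

```python
import numpy as np
from math import factorial
from scipy.linalg import expm

def exp_taylor(A, degree=12):
    """Degree-`degree` Taylor polynomial of exp, evaluated at a matrix
    argument with Horner's scheme. Each loop iteration is one node of a
    (chain-shaped) evaluation DAG: a matrix product plus a scaled identity."""
    n = A.shape[0]
    P = np.eye(n) / factorial(degree)
    for k in range(degree - 1, -1, -1):
        P = np.eye(n) / factorial(k) + A @ P
    return P

# The scalar polynomial fixes the accuracy of the matrix algorithm:
rng = np.random.default_rng(0)
A = 0.5 * rng.standard_normal((5, 5))
print(np.linalg.norm(exp_taylor(A) - expm(A), 2))
```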

We consider the problem of estimating high-dimensional covariance matrices of $K$ populations or classes in the setting where the sample sizes are comparable to the data dimension. We propose estimating each class covariance matrix as a distinct linear combination of all class sample covariance matrices. This approach is shown to reduce the estimation error when the sample sizes are limited and the true class covariance matrices share a somewhat similar structure. We develop an effective method for estimating the coefficients in the linear combination that minimize the mean squared error under the general assumption that the samples are drawn from (unspecified) elliptically symmetric distributions possessing finite fourth-order moments. To this end, we utilize the spatial sign covariance matrix, which we show (under rather general conditions) to be an unbiased estimator of the normalized covariance matrix as the dimension grows to infinity. We also show how the proposed method can be used in choosing the regularization parameters for multiple target matrices in a single class covariance matrix estimation problem. We assess the proposed method via numerical simulation studies, including an application to global minimum variance portfolio optimization using real stock data.
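
A minimal sketch of the spatial sign covariance matrix referenced above (Python assumed; the centering by the coordinatewise median is an illustrative choice, not necessarily the paper's):

```python
import numpy as np

def spatial_sign_covariance(X, center=None):
    """Spatial sign covariance matrix (SSCM) of the rows of X (n x p):
    the average outer product of the unit-normalized, centered samples."""
    if center is None:
        center = np.median(X, axis=0)  # illustrative robust location choice
    Z = X - center
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    U = Z / np.where(norms > 0, norms, 1.0)  # spatial signs, unit vectors
    return (U.T @ U) / X.shape[0]

rng = np.random.default_rng(1)
p, n = 100, 400
eigs = np.linspace(2.0, 0.5, p)
X = rng.standard_normal((n, p)) * np.sqrt(eigs)  # rows ~ N(0, diag(eigs))
S = spatial_sign_covariance(X)
# Under the abstract's conditions, E[SSCM] approaches the normalized
# covariance Sigma / tr(Sigma) as the dimension grows:
print(np.abs(S - np.diag(eigs) / eigs.sum()).max())
```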

We provide an empirical process theory for locally stationary processes over nonsmooth function classes. An important novelty over other approaches is the use of the flexible functional dependence measure to quantify dependence. A functional central limit theorem and nonasymptotic maximal inequalities are provided. The theory is used to prove the functional convergence of the empirical distribution function (EDF) and to derive uniform convergence rates for kernel density estimators both for stationary and locally stationary processes. A comparison with earlier results based on other measures of dependence is carried out.
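
As a toy illustration of the estimators whose uniform convergence the theory covers, the sketch below (Python assumed) computes the EDF and a Gaussian kernel density estimator on a time-varying AR(1) series, a standard example of a locally stationary process:

```python
import numpy as np

def edf(sample, x):
    """Empirical distribution function of `sample`, evaluated at points x."""
    return (sample[None, :] <= x[:, None]).mean(axis=1)

def kde(sample, x, h):
    """Gaussian kernel density estimator with bandwidth h."""
    u = (x[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(sample) * h * np.sqrt(2 * np.pi))

# A toy locally stationary process: an AR(1) whose coefficient drifts
# with rescaled time t/n.
rng = np.random.default_rng(2)
n = 2000
phi = 0.3 + 0.4 * np.linspace(0, 1, n)  # time-varying AR coefficient
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi[t] * y[t - 1] + rng.standard_normal()

grid = np.linspace(-4, 4, 101)
print(edf(y, grid)[50], kde(y, grid, h=0.3)[50])
```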

The Wilcoxon rank-sum test is one of the most popular distribution-free procedures for testing the equality of two univariate probability distributions. One of the main reasons for its popularity can be attributed to the remarkable result of Hodges and Lehmann (1956), which shows that the asymptotic relative efficiency of Wilcoxon's test with respect to Student's $t$-test, under location alternatives, never falls below 0.864, despite the former being exactly distribution-free for all sample sizes. Even more striking is the result of Chernoff and Savage (1958), which shows that the efficiency of a Gaussian score transformed Wilcoxon's test, against the $t$-test, is lower bounded by 1. In this paper we study the two-sample problem in the multivariate setting and propose distribution-free analogues of the Hotelling $T^2$ test (the natural multidimensional counterpart of Student's $t$-test) based on optimal transport and obtain extensions of the above celebrated results over various natural families of multivariate distributions. Our proposed tests are consistent against a general class of alternatives and satisfy Hodges-Lehmann and Chernoff-Savage-type efficiency lower bounds, despite being entirely agnostic to the underlying data generating mechanism. In particular, a collection of our proposed tests suffer from no loss in asymptotic efficiency, when compared to Hotelling $T^2$. To the best of our knowledge, these are the first collection of multivariate, nonparametric, exactly distribution-free tests that provably achieve such attractive efficiency lower bounds. We also demonstrate the broader scope of our methods in optimal transport based nonparametric inference by constructing exactly distribution-free multivariate tests for mutual independence, which suffer from no loss in asymptotic efficiency against the classical Wilks' likelihood ratio test, under Konijn alternatives.
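
A minimal sketch of the optimal-transport rank construction that underlies such tests (Python with SciPy assumed; the set of reference points, the squared cost, and the mean-rank statistic are illustrative choices, not the paper's exact test):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def ot_ranks(Z, grid):
    """Multivariate ranks via optimal transport: match the pooled sample Z
    to a fixed set of 'rank' reference points by minimizing total squared
    cost; each sample's rank is the reference point assigned to it."""
    cost = ((Z[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
    rows, cols = linear_sum_assignment(cost)
    ranks = np.empty_like(grid)
    ranks[rows] = grid[cols]
    return ranks

rng = np.random.default_rng(3)
m = n = 50
X = rng.standard_normal((m, 2))
Y = rng.standard_normal((n, 2)) + 0.8       # location-shifted alternative
Z = np.vstack([X, Y])
grid = rng.uniform(size=(m + n, 2))         # illustrative uniform rank points
R = ot_ranks(Z, grid)
stat = np.linalg.norm(R[:m].mean(0) - R[m:].mean(0))
print(stat)  # in practice, calibrate against its permutation distribution
```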

We consider deviation inequalities for the sums of independent $d \times d$ random matrices, as well as rank-one random tensors. Our focus is on the non-isotropic case and on bounds that do not depend explicitly on the dimension $d$, but rather on the effective rank. In a rather elementary and unified way, we show the following results: 1) A deviation bound for the sums of independent positive semi-definite matrices of any rank. This result generalizes the dimension-free bound of Koltchinskii and Lounici [Bernoulli, 23(1): 110-133, 2017] on the sample covariance matrix in the sub-Gaussian case. 2) Dimension-free bounds for the operator norm of the sums of rank-one random tensors formed either by sub-Gaussian or log-concave random vectors. This extends the result of Guédon and Rudelson [Adv. in Math., 208: 798-823, 2007]. 3) A non-isotropic version of the result of Alesker [Geom. Asp. of Funct. Anal., 77: 1-4, 1995] on the concentration of the norm of sub-exponential random vectors. 4) A dimension-free lower tail bound for sums of positive semi-definite matrices with heavy-tailed entries, sharpening the bound of Oliveira [Prob. Th. and Rel. Fields, 166: 1175-1194, 2016]. Our approach is based on the duality formula between entropy and moment-generating functions. In contrast to the known proofs of dimension-free bounds, we avoid Talagrand's majorizing measure theorem, as well as generic chaining bounds for empirical processes. Some of our tools were pioneered by O. Catoni and co-authors in the context of robust statistical estimation.
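
As a quick illustration of the effective-rank phenomenon in item 1, the following Monte Carlo sketch (Python assumed) shows that the relative error of the sample covariance can be small even when $d \gg n$, provided the effective rank $\mathrm{r}(\Sigma) = \mathrm{tr}(\Sigma)/\|\Sigma\|$ is small:

```python
import numpy as np

def effective_rank(Sigma):
    """Effective rank r(Sigma) = trace(Sigma) / operator norm of Sigma."""
    return np.trace(Sigma) / np.linalg.norm(Sigma, 2)

rng = np.random.default_rng(4)
d, n = 500, 200
# Fast spectral decay => small effective rank despite large d.
eigs = 1.0 / (1 + np.arange(d)) ** 2
Sigma = np.diag(eigs)
X = rng.standard_normal((n, d)) * np.sqrt(eigs)  # rows ~ N(0, Sigma)
S = X.T @ X / n
err = np.linalg.norm(S - Sigma, 2) / np.linalg.norm(Sigma, 2)
print(f"d = {d}, effective rank {effective_rank(Sigma):.2f}, "
      f"relative error {err:.3f}")
```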

Time-to-event endpoints are increasingly popular in phase II cancer trials. The standard statistical tool for such endpoints in one-armed trials is the one-sample log-rank test. It is well known that the asymptotics underpinning the correctness of this test do not take full effect for small sample sizes. There have already been some attempts to solve this problem. While some do not allow easy power and sample size calculations, others lack a clear theoretical motivation and require further considerations. The problem itself can partly be attributed to the dependence between the compensated counting process and its variance estimator. We provide a framework in which the variance estimator can be flexibly adapted to the situation at hand while maintaining its asymptotic properties. As an example, we suggest a variance estimator that is uncorrelated with the compensated counting process. Furthermore, we provide sample size and power calculations for any approach fitting into our framework. Finally, we compare several methods via simulation studies and the hypothetical setup of a phase II trial based on real-world data.
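
For reference, a minimal sketch of the classical one-sample log-rank statistic with its standard variance estimate (Python assumed). The abstract's point is precisely that the observed-minus-expected process and this variance estimate are dependent in small samples, motivating alternative estimators within the framework.

```python
import numpy as np

def one_sample_logrank(times, events, cum_hazard_0):
    """One-sample log-rank statistic: compares the observed event count O
    with its compensator E = sum_i H0(follow-up time of i) under the null;
    the classical variance estimate is E itself."""
    O = events.sum()
    E = cum_hazard_0(times).sum()
    return (O - E) / np.sqrt(E)

# Toy example: null hazard rate 0.1 (so H0(t) = 0.1 * t), exponential data.
rng = np.random.default_rng(5)
n = 60
t_event = rng.exponential(1 / 0.1, size=n)  # data actually follow the null
t_cens = rng.uniform(0, 20, size=n)         # administrative censoring
times = np.minimum(t_event, t_cens)
events = (t_event <= t_cens).astype(float)
z = one_sample_logrank(times, events, lambda t: 0.1 * t)
print(f"Z = {z:.3f}")  # approximately N(0, 1) under the null for large n
```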

We develop a theory to measure the variance and covariance of probability distributions defined on the nodes of a graph, which takes into account the distance between nodes. Our approach generalizes the usual (co)variance to the setting of weighted graphs and retains many of its intuitive and desired properties. Interestingly, we find that a number of famous concepts in graph theory and network science can be reinterpreted in this setting as variances and covariances of particular distributions. As a particular application, we define the maximum variance problem on graphs with respect to the effective resistance distance, and characterize the solutions to this problem both numerically and theoretically. We show how the maximum variance distribution is concentrated on the boundary of the graph, and illustrate this in the case of random geometric graphs. Our theoretical results are supported by a number of experiments on a network of mathematical concepts, where we use the variance and covariance as analytical tools to study the (co-)occurrence of concepts in scientific papers with respect to the (network) relations between these concepts.
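
A minimal sketch of one natural such definition (Python assumed), following the Euclidean identity Var(X) = E[(X - X')^2]/2 with effective resistance playing the role of squared distance; the paper's exact normalization may differ:

```python
import numpy as np

def effective_resistance(L):
    """Pairwise effective resistances from a graph Laplacian L via its
    Moore-Penrose pseudoinverse Q: omega_ij = Q_ii + Q_jj - 2 Q_ij."""
    Q = np.linalg.pinv(L)
    q = np.diag(Q)
    return q[:, None] + q[None, :] - 2 * Q

def graph_variance(p, Omega):
    """Variance of a node distribution p, treating effective resistance
    as a squared distance: var(p) = 0.5 * sum_ij p_i p_j omega_ij."""
    return 0.5 * p @ Omega @ p

# Path graph on 5 nodes: variance is larger when mass sits on the boundary.
A = np.diag(np.ones(4), 1) + np.diag(np.ones(4), -1)
L = np.diag(A.sum(1)) - A
Omega = effective_resistance(L)
uniform = np.full(5, 0.2)
endpoints = np.array([0.5, 0.0, 0.0, 0.0, 0.5])
print(graph_variance(uniform, Omega), graph_variance(endpoints, Omega))
```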

We derive the variance-covariance matrix of the sample covariance matrix (SCM) as well as its theoretical mean squared error (MSE) when sampling from complex elliptical distributions with finite fourth-order moments. We also derive the form of the variance-covariance matrix for any affine equivariant matrix-valued statistics. Finally, illustrative examples of the formulas are presented.
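
As a sanity check of what such formulas quantify, the sketch below (Python assumed) estimates the MSE of the SCM by Monte Carlo in the simplest complex elliptical case, the circular complex Gaussian, for which a standard Isserlis-type computation gives MSE = tr(Sigma)^2 / n:

```python
import numpy as np

rng = np.random.default_rng(6)
p, n, reps = 4, 100, 2000
Sigma = np.diag([4.0, 2.0, 1.0, 0.5]).astype(complex)

# Monte Carlo MSE of the SCM under a circular complex Gaussian.
mse = 0.0
for _ in range(reps):
    Z = (rng.standard_normal((n, p)) + 1j * rng.standard_normal((n, p))) / np.sqrt(2)
    X = Z * np.sqrt(np.diag(Sigma).real)  # columns scaled to variance Sigma_kk
    S = X.conj().T @ X / n
    mse += np.linalg.norm(S - Sigma, "fro") ** 2

print(mse / reps, np.trace(Sigma).real ** 2 / n)  # the two should agree
```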

We consider the problem of predicting a response $Y$ from a set of covariates $X$ when test and training distributions differ. Since such differences may have causal explanations, we consider test distributions that emerge from interventions in a structural causal model, and focus on minimizing the worst-case risk. Causal regression models, which regress the response on its direct causes, remain unchanged under arbitrary interventions on the covariates, but they are not always optimal in the above sense. For example, for linear models and bounded interventions, alternative solutions have been shown to be minimax prediction optimal. We introduce the formal framework of distribution generalization that allows us to analyze the above problem in partially observed nonlinear models for both direct interventions on $X$ and interventions that occur indirectly via exogenous variables $A$. It takes into account that, in practice, minimax solutions need to be identified from data. Our framework allows us to characterize under which class of interventions the causal function is minimax optimal. We prove sufficient conditions for distribution generalization and present corresponding impossibility results. We propose a practical method, NILE, that achieves distribution generalization in a nonlinear IV setting with linear extrapolation. We prove consistency and present empirical results.
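
As a minimal linear special case of the identification problem (Python assumed; NILE itself targets nonlinear settings), the sketch below shows a hidden confounder biasing least squares, while the instrument-based two-stage estimator recovers the causal coefficient:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5000
A = rng.standard_normal(n)            # exogenous variable / instrument
H = rng.standard_normal(n)            # hidden confounder
X = A + H + 0.5 * rng.standard_normal(n)
Y = 2.0 * X + 2.0 * H + 0.5 * rng.standard_normal(n)  # causal effect is 2

# OLS is biased by the confounder H; two-stage least squares (the linear
# backbone that nonlinear IV methods build on) recovers the coefficient.
ols = (X @ Y) / (X @ X)
xhat = A * (A @ X) / (A @ A)          # stage 1: project X on the instrument
tsls = (xhat @ Y) / (xhat @ xhat)     # stage 2: regress Y on the projection
print(f"OLS {ols:.3f} (biased), 2SLS {tsls:.3f} (approx. 2)")
```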

We develop an approach to risk minimization and stochastic optimization that provides a convex surrogate for variance, allowing near-optimal and computationally efficient trading between approximation and estimation error. Our approach builds on techniques for distributionally robust optimization and Owen's empirical likelihood, and we provide a number of finite-sample and asymptotic results characterizing the theoretical performance of the estimator. In particular, we show that our procedure comes with certificates of optimality, achieving (in some scenarios) faster rates of convergence than empirical risk minimization by virtue of automatically balancing bias and variance. We give corroborating empirical evidence showing that in practice, the estimator indeed trades between variance and absolute performance on a training sample, improving out-of-sample (test) performance over standard empirical risk minimization for a number of classification problems.
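
A minimal sketch of the mean-plus-variance-penalty surrogate that such robust objectives expand to, up to constants (Python assumed; `rho` plays the role of the robustness radius, and the exact scaling is illustrative):

```python
import numpy as np

def variance_regularized_risk(losses, rho):
    """Empirical mean plus a variance penalty of order sqrt(rho * Var / n);
    penalizing loss variance trades absolute training performance for
    stability, in the spirit of variance-regularized ERM."""
    n = len(losses)
    return losses.mean() + np.sqrt(rho * losses.var(ddof=1) / n)

# Of two predictors with equal average loss, the one with smaller loss
# variance receives the smaller certificate.
rng = np.random.default_rng(8)
l1 = rng.normal(0.30, 0.05, size=1000)  # stable losses
l2 = rng.normal(0.30, 0.50, size=1000)  # volatile losses
print(variance_regularized_risk(l1, rho=5.0),
      variance_regularized_risk(l2, rho=5.0))
```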
