亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

For a multivariate normal distribution, the sparsity of the covariance and precision matrices encodes complete information about independence and conditional independence properties. For general distributions, the covariance and precision matrices reveal correlations and so-called partial correlations between variables, but these do not, in general, have any correspondence with respect to independence properties. In this paper, we prove that, for a certain class of non-Gaussian distributions, these correspondences still hold, exactly for the covariance and approximately for the precision. The distributions -- sometimes referred to as "nonparanormal" -- are given by diagonal transformations of multivariate normal random variables. We provide several analytic and numerical examples illustrating these results.

相關內容

We derive a formula for the adjoint $\overline{A}$ of a square-matrix operation of the form $C=f(A)$, where $f$ is holomorphic in the neighborhood of each eigenvalue. We then apply the formula to derive closed-form expressions in particular cases of interest such as the case when we have a spectral decomposition $A=UDU^{-1}$, the spectrum cut-off $C=A_+$ and the Nearest Correlation Matrix routine. Finally, we explain how to simplify the computation of adjoints for regularized linear regression coefficients.

Motivated by establishing theoretical foundations for various manifold learning algorithms, we study the problem of Mahalanobis distance (MD), and the associated precision matrix, estimation from high-dimensional noisy data. By relying on recent transformative results in covariance matrix estimation, we demonstrate the sensitivity of \MD~and the associated precision matrix to measurement noise, determining the exact asymptotic signal-to-noise ratio at which MD fails, and quantifying its performance otherwise. In addition, for an appropriate loss function, we propose an asymptotically optimal shrinker, which is shown to be beneficial over the classical implementation of the MD, both analytically and in simulations. The result is extended to the manifold setup, where the nonlinear interaction between curvature and high-dimensional noise is taken care of. The developed solution is applied to study a multiscale reduction problem in the dynamical system analysis.

This paper addresses the task of estimating a covariance matrix under a patternless sparsity assumption. In contrast to existing approaches based on thresholding or shrinkage penalties, we propose a likelihood-based method that regularizes the distance from the covariance estimate to a symmetric sparsity set. This formulation avoids unwanted shrinkage induced by more common norm penalties and enables optimization of the resulting non-convex objective by solving a sequence of smooth, unconstrained subproblems. These subproblems are generated and solved via the proximal distance version of the majorization-minimization principle. The resulting algorithm executes rapidly, gracefully handles settings where the number of parameters exceeds the number of cases, yields a positive definite solution, and enjoys desirable convergence properties. Empirically, we demonstrate that our approach outperforms competing methods by several metrics across a suite of simulated experiments. Its merits are illustrated on an international migration dataset and a classic case study on flow cytometry. Our findings suggest that the marginal and conditional dependency networks for the cell signalling data are more similar than previously concluded.

A wide range of machine learning applications such as privacy-preserving learning, algorithmic fairness, and domain adaptation/generalization among others, involve learning \emph{invariant representations} of the data that aim to achieve two competing goals: (a) maximize information or accuracy with respect to a target response, and (b) maximize invariance or independence with respect to a set of protected features (e.g.\ for fairness, privacy, etc). Despite their wide applicability, theoretical understanding of the optimal tradeoffs -- with respect to accuracy, and invariance -- achievable by invariant representations is still severely lacking. In this paper, we provide precisely such an information-theoretic analysis of such tradeoffs under both classification and regression settings. We provide a geometric characterization of the accuracy and invariance achievable by any representation of the data; we term this feasible region the information plane. We provide a lower bound for this feasible region for the classification case, and an exact characterization for the regression case, which allows us to either bound or exactly characterize the Pareto optimal frontier between accuracy and invariance. Although our contributions are mainly theoretical, a key practical application of our results is in certifying the potential sub-optimality of any given representation learning algorithm for either classification or regression tasks. Our results shed new light on the fundamental interplay between accuracy and invariance, and may be useful in guiding the design of future representation learning algorithms.

The knowledge of channel covariance matrices is of paramount importance to the estimation of instantaneous channels and the design of beamforming vectors in multi-antenna systems. In practice, an abrupt change in channel covariance matrices may occur due to the change in the environment and the user location. Although several works have proposed efficient algorithms to estimate the channel covariance matrices after any change occurs, how to detect such a change accurately and quickly is still an open problem in the literature. In this paper, we focus on channel covariance change detection between a multi-antenna base station (BS) and a single-antenna user equipment (UE). To provide theoretical performance limit, we first propose a genie-aided change detector based on the log-likelihood ratio (LLR) test assuming the channel covariance matrix after change is known, and characterize the corresponding missed detection and false alarm probabilities. Then, this paper considers the practical case where the channel covariance matrix after change is unknown. The maximum likelihood (ML) estimation technique is used to predict the covariance matrix based on the received pilot signals over a certain number of coherence blocks, building upon which the LLR-based change detector is employed. Numerical results show that our proposed scheme can detect the change with low error probability even when the number of channel samples is small such that the estimation of the covariance matrix is not that accurate. This result verifies the possibility to detect the channel covariance change both accurately and quickly in practice.

The tube method or the volume-of-tube method approximates the tail probability of the maximum of a smooth Gaussian random field with zero mean and unit variance. This method evaluates the volume of a spherical tube about the index set, and then transforms it to the tail probability. In this study, we generalize the tube method to a case in which the variance is not constant. We provide the volume formula for a spherical tube with a non-constant radius in terms of curvature tensors, and the tail probability formula of the maximum of a Gaussian random field with inhomogeneous variance, as well as its Laplace approximation. In particular, the critical radius of the tube is generalized for evaluation of the asymptotic approximation error. As an example, we discuss the approximation of the largest eigenvalue distribution of the Wishart matrix with a non-identity matrix parameter. The Bonferroni method is the tube method when the index set is a finite set. We provide the formula for the asymptotic approximation error for the Bonferroni method when the variance is not constant.

In many regression settings the unknown coefficients may have some known structure, for instance they may be ordered in space or correspond to a vectorized matrix or tensor. At the same time, the unknown coefficients may be sparse, with many nearly or exactly equal to zero. However, many commonly used priors and corresponding penalties for coefficients do not encourage simultaneously structured and sparse estimates. In this paper we develop structured shrinkage priors that generalize multivariate normal, Laplace, exponential power and normal-gamma priors. These priors allow the regression coefficients to be correlated a priori without sacrificing elementwise sparsity or shrinkage. The primary challenges in working with these structured shrinkage priors are computational, as the corresponding penalties are intractable integrals and the full conditional distributions that are needed to approximate the posterior mode or simulate from the posterior distribution may be non-standard. We overcome these issues using a flexible elliptical slice sampling procedure, and demonstrate that these priors can be used to introduce structure while preserving sparsity.

The problem of constructing a simultaneous confidence band for the mean function of a locally stationary functional time series $ \{ X_{i,n} (t) \}_{i = 1, \ldots, n}$ is challenging as these bands can not be built on classical limit theory. On the one hand, for a fixed argument $t$ of the functions $ X_{i,n}$, the maximum absolute deviation between an estimate and the time dependent regression function exhibits (after appropriate standardization) an extreme value behaviour with a Gumbel distribution in the limit. On the other hand, for stationary functional data, simultaneous confidence bands can be built on classical central theorems for Banach space valued random variables and the limit distribution of the maximum absolute deviation is given by the sup-norm of a Gaussian process. As both limit theorems have different rates of convergence, they are not compatible, and a weak convergence result, which could be used for the construction of a confidence surface in the locally stationary case, does not exist. In this paper we propose new bootstrap methodology to construct a simultaneous confidence band for the mean function of a locally stationary functional time series, which is motivated by a Gaussian approximation for the maximum absolute deviation. We prove the validity of our approach by asymptotic theory, demonstrate good finite sample properties by means of a simulation study and illustrate its applicability analyzing a data example.

Discrete random structures are important tools in Bayesian nonparametrics and the resulting models have proven effective in density estimation, clustering, topic modeling and prediction, among others. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and, then, normalising to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes, and develop an Markov Chain Monte Carlo sampler for Bayesian inferences. A test for distributional homogeneity across groups is obtained as a by product. The results and their inferential implications are showcased on synthetic and real data.

In this paper we introduce a covariance framework for the analysis of EEG and MEG data that takes into account observed temporal stationarity on small time scales and trial-to-trial variations. We formulate a model for the covariance matrix, which is a Kronecker product of three components that correspond to space, time and epochs/trials, and consider maximum likelihood estimation of the unknown parameter values. An iterative algorithm that finds approximations of the maximum likelihood estimates is proposed. We perform a simulation study to assess the performance of the estimator and investigate the influence of different assumptions about the covariance factors on the estimated covariance matrix and on its components. Apart from that, we illustrate our method on real EEG and MEG data sets. The proposed covariance model is applicable in a variety of cases where spontaneous EEG or MEG acts as source of noise and realistic noise covariance estimates are needed for accurate dipole localization, such as in evoked activity studies, or where the properties of spontaneous EEG or MEG are themselves the topic of interest, such as in combined EEG/fMRI experiments in which the correlation between EEG and fMRI signals is investigated.

北京阿比特科技有限公司