
Conventional information-theoretic quantities assume access to probability distributions, and estimating such distributions is not trivial. Here, we consider function-based formulations of cross-entropy that sidestep this a priori estimation requirement. We propose three measures of R\'enyi's $\alpha$-cross-entropies in the setting of reproducing-kernel Hilbert spaces. Each measure has its own appeal. We prove that these measures can be estimated in an unbiased, non-parametric, and minimax-optimal way via sample-constructed Gram matrices. This yields matrix-based estimators of R\'enyi's $\alpha$-cross-entropies. These estimators satisfy all of the axioms that R\'enyi established for divergences. Our cross-entropies can thus be used to assess distributional differences. They are also well suited to high-dimensional distributions, since the convergence rate of our estimators is independent of the sample dimensionality. Python code implementing these measures can be found at //github.com/isledge/MBRCE
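As a concrete illustration of the Gram-matrix route, the sketch below computes a matrix-based R\'enyi $\alpha$-entropy from a sample-constructed kernel Gram matrix, the kind of functional on which matrix-based cross-entropy estimators are built. The kernel choice, bandwidth, and function names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gram_rbf(X, sigma=1.0):
    """RBF (Gaussian) kernel Gram matrix of the n x d sample matrix X."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-d2 / (2.0 * sigma**2))

def matrix_renyi_entropy(K, alpha=2.0):
    """Matrix-based Renyi alpha-entropy of a PSD Gram matrix K:
    normalize K to unit trace and apply the alpha-entropy functional
    to its eigenvalues (which then sum to one)."""
    A = K / np.trace(K)
    lam = np.linalg.eigvalsh(A)
    lam = lam[lam > 1e-12]            # drop numerically zero eigenvalues
    return float(np.log2(np.sum(lam**alpha)) / (1.0 - alpha))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
H = matrix_renyi_entropy(gram_rbf(X), alpha=2.0)
```

The value lies between $0$ (all samples identical) and $\log_2 n$ (a perfectly diagonal normalized Gram matrix), mirroring the range of the classical R\'enyi entropy.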


Large observational data are increasingly available in disciplines such as health, economics, and the social sciences, where researchers are interested in causal questions rather than prediction. In this paper, we examine the problem of estimating heterogeneous treatment effects using non-parametric regression-based methods, starting from an empirical study aimed at investigating the effect of participation in school meal programs on health indicators. First, we introduce the setup and the issues related to conducting causal inference with observational or non-fully randomized data, and how these issues can be tackled with the help of statistical learning tools. Then, we review and develop a unifying taxonomy of the existing state-of-the-art frameworks that allow for individual treatment effect estimation via non-parametric regression models. After a brief overview of the problem of model selection, we illustrate the performance of some of the methods in three different simulation studies. We conclude by demonstrating the use of some of the methods in an empirical analysis of the school meal program data.
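One of the simplest frameworks in this family is the T-learner: fit separate outcome regressions on treated and control units and take the difference of their predictions as the estimated conditional average treatment effect (CATE). The sketch below uses linear outcome models on simulated data with a randomized treatment; the data-generating process and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4000
x = rng.uniform(0.0, 1.0, size=n)
t = rng.integers(0, 2, size=n)          # randomized binary treatment, for simplicity
y = 1.0 + 2.0 * x + t * (0.5 + x) + 0.1 * rng.normal(size=n)

def fit_linear(xs, ys):
    """Least-squares fit of ys ~ a + b * xs; returns (a, b)."""
    X = np.column_stack([np.ones_like(xs), xs])
    coef, *_ = np.linalg.lstsq(X, ys, rcond=None)
    return coef

# T-learner: separate outcome models for treated and control units,
# with the CATE estimated as the difference of their predictions.
a1, b1 = fit_linear(x[t == 1], y[t == 1])
a0, b0 = fit_linear(x[t == 0], y[t == 0])
cate = lambda xq: (a1 + b1 * xq) - (a0 + b0 * xq)   # true CATE here is 0.5 + x
```

Replacing `fit_linear` with any non-parametric regressor (trees, kernels, neural networks) gives the general non-parametric T-learner that the taxonomy covers.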

Existing results for low-rank matrix recovery largely focus on quadratic loss, which enjoys favorable properties such as restricted strong convexity/smoothness (RSC/RSM) and well conditioning over all low rank matrices. However, many interesting problems involve more general, non-quadratic losses, which do not satisfy such properties. For these problems, standard nonconvex approaches such as rank-constrained projected gradient descent (a.k.a. iterative hard thresholding) and Burer-Monteiro factorization could have poor empirical performance, and there is no satisfactory theory guaranteeing global and fast convergence for these algorithms. In this paper, we show that a critical component in provable low-rank recovery with non-quadratic loss is a regularity projection oracle. This oracle restricts iterates to low-rank matrices within an appropriate bounded set, over which the loss function is well behaved and satisfies a set of approximate RSC/RSM conditions. Accordingly, we analyze an (averaged) projected gradient method equipped with such an oracle, and prove that it converges globally and linearly. Our results apply to a wide range of non-quadratic low-rank estimation problems including one bit matrix sensing/completion, individualized rank aggregation, and more broadly generalized linear models with rank constraints.
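For orientation, the following sketch shows the basic rank-projection step in the quadratic-loss baseline (projected gradient descent / iterative hard thresholding for matrix completion). It is the well-conditioned case the paper contrasts against; the paper's contribution, a regularity projection oracle for non-quadratic losses, would add a further projection onto a bounded set. Problem sizes and step choices are illustrative.

```python
import numpy as np

def svd_project(M, r):
    """Project M onto the set of matrices of rank at most r (truncated SVD)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

rng = np.random.default_rng(0)
n, r = 20, 1
M_star = np.outer(rng.normal(size=n), rng.normal(size=n))   # ground-truth rank-1
mask = rng.random((n, n)) < 0.5                             # observed entries

M = np.zeros((n, n))
for _ in range(500):
    grad = mask * (M - M_star)     # gradient of 0.5 * ||P_Omega(M - M*)||_F^2
    M = svd_project(M - grad, r)   # gradient step followed by rank projection

loss0 = 0.5 * np.sum((mask * M_star) ** 2)           # loss at the zero initialization
loss = 0.5 * np.sum((mask * (M - M_star)) ** 2)      # loss after the iterations
```

With a non-quadratic loss, `grad` would be the corresponding gradient and the projection would additionally restrict iterates to the bounded set where the approximate RSC/RSM conditions hold.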

The asymptotic behaviour of Linear Spectral Statistics (LSS) of the smoothed periodogram estimator of the spectral coherency matrix of a complex Gaussian high-dimensional time series $(\mathbf{y}_n)_{n \in \mathbb{Z}}$ with independent components is studied under the asymptotic regime where the sample size $N$ converges towards $+\infty$ while the dimension $M$ of $\mathbf{y}$ and the smoothing span of the estimator grow to infinity at the same rate, in such a way that $\frac{M}{N} \rightarrow 0$. It is established that, at each frequency, the estimated spectral coherency matrix is close to the sample covariance matrix of an independent identically distributed $\mathcal{N}_{\mathbb{C}}(0,\mathbf{I}_M)$ sequence, and that its empirical eigenvalue distribution converges towards the Marchenko-Pastur distribution. This allows us to conclude that each LSS has a deterministic behaviour that can be evaluated explicitly. Using concentration inequalities, it is shown that the supremum over the frequencies of the deviation of each LSS from its deterministic approximation is of the order of $\frac{1}{M} + \frac{\sqrt{M}}{N}+ (\frac{M}{N})^{3}$. Numerical simulations support our results.
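A minimal sanity check of the limiting behaviour invoked above: for an i.i.d. standard complex Gaussian sequence (the benchmark to which the estimated coherency matrix is compared), the eigenvalues of the sample covariance matrix concentrate on the Marchenko-Pastur support $[(1-\sqrt{c})^2, (1+\sqrt{c})^2]$ with $c = M/N$. This is not the smoothed-periodogram estimator itself, only the reference model.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 100, 5000                      # dimension and sample size, with M/N small
# i.i.d. standard complex Gaussian samples, unit variance per entry
Z = (rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N))) / np.sqrt(2.0)
S = (Z @ Z.conj().T) / N              # sample covariance matrix
eig = np.linalg.eigvalsh(S)           # real eigenvalues of the Hermitian matrix S

c = M / N
edges = ((1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2)   # Marchenko-Pastur support
```

Up to edge fluctuations that vanish as $N \rightarrow \infty$, all eigenvalues fall inside `edges`.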

This paper presents some new results on maximum likelihood estimation for incomplete data. Finite-sample properties of conditional observed information matrices are established. In particular, they possess the same Loewner partial ordering properties as the expected information matrices. In its new form, the observed Fisher information (OFI) simplifies the conditional expectation of the outer product of the complete-data score function appearing in the general matrix formula of Louis (1982). It is positive definite and consistent for the expected Fisher information as the sample size increases. Furthermore, it reveals the information loss incurred by the incompleteness of the data. For this reason, the OFI may not be the right (consistent and efficient) estimator for deriving the standard error (SE) of maximum likelihood estimates (MLE) under incomplete data. A sandwich estimator of the covariance matrix is developed to provide consistent and efficient estimates of the SE. The proposed sandwich estimator coincides with the Huber sandwich estimator for model misspecification under complete data (Huber, 1967; Freedman, 2006; Little and Rubin, 2020). In contrast to the latter, however, the new estimator does not involve the OFI, which is an appealing feature in applications. Recursive algorithms for the MLE, the observed information, and the sandwich estimator are presented. Application to parameter estimation of a regime-switching conditional Markov jump process is considered to verify the results. The recursive equations for the inverse OFI generalize the algorithm of Hero and Fessler (1994). A simulation study confirms that the MLEs are accurate, consistent, and asymptotically normal. The sandwich estimator produces standard errors of the MLE close to their analytic values, whereas the OFI overestimates them.
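To make the sandwich form $A^{-1} B A^{-1}$ concrete, here is the classical scalar Huber construction for a Poisson MLE fitted to deliberately overdispersed data: the "bread" $A$ is the observed information and the "meat" $B$ is the sum of outer products of the per-observation scores. This illustrates the complete-data case the paper's estimator generalizes; the data-generating choice is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
x = 2 * rng.poisson(3.0, size=2000)   # overdispersed counts: var = 12, mean = 6

lam = x.mean()                        # Poisson MLE of the rate
score = x / lam - 1.0                 # per-observation score at the MLE
A = np.sum(x / lam**2)                # observed information ("bread")
B = np.sum(score**2)                  # sum of squared scores ("meat")

var_sandwich = B / A**2               # A^{-1} B A^{-1} in the scalar case
var_model = lam / len(x)              # naive model-based (OFI-type) variance
```

Here the sandwich variance reduces algebraically to the empirical variance of the data divided by $n$, so it remains valid under the misspecified (overdispersed) model, while the model-based variance understates the true sampling variability by a factor of about two.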

We give a new algorithm for estimating the cross-covariance matrix $\mathbb{E} XY'$ of two large-dimensional signals $X\in\mathbb{R}^n$, $Y\in \mathbb{R}^p$ in the context where the number $T$ of observations of the pair $(X,Y)$ is itself large, but with $T/n$ and $T/p$ not assumed to be small. In the asymptotic regime where $n,p,T$ are large, with high probability, this algorithm is optimal for the Frobenius norm among rotationally invariant estimators, i.e. estimators derived from the empirical estimator by cleaning the singular values while leaving the singular vectors unchanged.
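The structure of a rotationally invariant estimator can be sketched in a few lines: take the SVD of the empirical cross-covariance, replace the singular values by cleaned ones, and recombine with the original singular vectors. The cleaning rule below is simple hard thresholding, an illustrative stand-in, not the optimal formula the paper derives; the planted-signal setup is likewise an assumption for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, T = 50, 40, 200
# toy signals sharing one common factor g, so E[XY'] = u v^T is rank one
u, v = rng.normal(size=(n, 1)), rng.normal(size=(p, 1))
g = rng.normal(size=(1, T))
X = rng.normal(size=(n, T)) + u @ g
Y = rng.normal(size=(p, T)) + v @ g

E = (X @ Y.T) / T                      # empirical cross-covariance estimator
U, s, Vt = np.linalg.svd(E, full_matrices=False)

# Rotationally invariant estimator: keep the singular vectors of E and
# replace each singular value by a "cleaned" value.
threshold = np.median(s) * 2.0         # illustrative rule, not the paper's formula
s_clean = np.where(s > threshold, s, 0.0)
E_clean = (U * s_clean) @ Vt
```

The planted direction survives the cleaning while most of the noise bulk is suppressed; the paper's contribution is the provably optimal replacement for the thresholding rule.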

Many well-known matrices $Z$ are associated to fast transforms corresponding to factorizations of the form $Z = X^J \ldots X^1$, where each factor $X^\ell$ is sparse and possibly structured. This paper investigates essential uniqueness of such factorizations. Our first main contribution is to prove that any $N \times N$ matrix having the so-called butterfly structure admits a unique factorization into $J$ butterfly factors (where $N = 2^J$), and that the factors can be recovered by a hierarchical factorization method. This contrasts with existing approaches which fit the product of the butterfly factors to a given matrix via gradient descent. The proposed method can be applied in particular to retrieve the factorizations of the Hadamard or the Discrete Fourier Transform matrices of size $2^J$. Computing such factorizations costs $\mathcal{O}(N^2)$, which is of the order of dense matrix-vector multiplication, while the obtained factorizations enable fast $\mathcal{O}(N \log N)$ matrix-vector multiplications. This hierarchical identifiability property relies on a simple identifiability condition in the two-layer and fixed-support setting that was recently established. While the butterfly structure corresponds to a fixed prescribed support for each factor, our second contribution is to obtain identifiability results with more general families of allowed sparsity patterns, taking into account permutation ambiguities when they are unavoidable. Typically, we show through the hierarchical paradigm that the butterfly factorization of the Discrete Fourier Transform matrix of size $2^J$ admits a unique sparse factorization into $J$ factors, when enforcing only $2$-sparsity by column and a block-diagonal structure on each factor.
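The butterfly factorization of the DFT matrix mentioned above can be written down explicitly: $F_N = X_1 X_2 \cdots X_J P$, where $X_j = I_{2^{j-1}} \otimes A_{N/2^{j-1}}$ with radix-2 butterfly blocks $A_m = \begin{psmallmatrix} I & D \\ I & -D \end{psmallmatrix}$, and $P$ is the bit-reversal permutation. The sketch below builds these factors and checks the product against the DFT matrix numerically; it constructs a known factorization rather than running the paper's hierarchical recovery method.

```python
import numpy as np

def butterfly_factor(m):
    """A_m = [[I, D], [I, -D]] with D = diag(omega_m^k), the radix-2 butterfly."""
    h = m // 2
    D = np.diag(np.exp(-2j * np.pi * np.arange(h) / m))
    I = np.eye(h)
    return np.block([[I, D], [I, -D]])

def bit_reversal(N):
    """Permutation matrix sending index k to its bit-reversed counterpart."""
    J = int(np.log2(N))
    idx = [int(format(k, f'0{J}b')[::-1], 2) for k in range(N)]
    P = np.zeros((N, N))
    P[np.arange(N), idx] = 1.0
    return P

N = 16
# F_N = X_1 X_2 ... X_J P with X_j = I_{2^(j-1)} kron A_{N / 2^(j-1)}
factors = [np.kron(np.eye(2**j), butterfly_factor(N >> j))
           for j in range(int(np.log2(N)))]
product = np.linalg.multi_dot(factors + [bit_reversal(N)])
F = np.fft.fft(np.eye(N))              # the N x N DFT matrix
```

Each factor has two nonzeros per column, matching the $2$-sparsity and block-diagonal structure under which the paper proves essential uniqueness.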

The matrix normal model, the family of Gaussian matrix-variate distributions whose covariance matrix is the Kronecker product of two lower dimensional factors, is frequently used to model matrix-variate data. The tensor normal model generalizes this family to Kronecker products of three or more factors. We study the estimation of the Kronecker factors of the covariance matrix in the matrix and tensor models. We show nonasymptotic bounds for the error achieved by the maximum likelihood estimator (MLE) in several natural metrics. In contrast to existing bounds, our results do not rely on the factors being well-conditioned or sparse. For the matrix normal model, all our bounds are minimax optimal up to logarithmic factors, and for the tensor normal model our bounds for the largest factor and for the overall covariance matrix are minimax optimal up to constant factors provided there are enough samples for any estimator to obtain constant Frobenius error. In the same regimes as our sample complexity bounds, we show that an iterative procedure to compute the MLE known as the flip-flop algorithm converges linearly with high probability. Our main tool is geodesic strong convexity in the geometry on positive-definite matrices induced by the Fisher information metric. This strong convexity is determined by the expansion of certain random quantum channels. We also provide numerical evidence that combining the flip-flop algorithm with a simple shrinkage estimator can improve performance in the undersampled regime.
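The flip-flop iteration itself is short: holding one Kronecker factor fixed, the MLE of the other has a closed form, and the algorithm alternates the two updates. A minimal sketch for the matrix normal model, with illustrative ground-truth factors (the individual factors are identified only up to a scalar, so we compare the identifiable Kronecker product):

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, n = 2, 3, 2000
U_true = np.array([[2.0, 0.5], [0.5, 1.0]])        # row covariance factor
V_true = np.diag([1.0, 2.0, 0.5])                  # column covariance factor
Lu, Lv = np.linalg.cholesky(U_true), np.linalg.cholesky(V_true)
# samples X_i = Lu Z Lv^T with Z i.i.d. standard normal, so cov(vec X) = V kron U
X = np.array([Lu @ rng.normal(size=(p, q)) @ Lv.T for _ in range(n)])

# Flip-flop: alternately update each factor's closed-form MLE given the other.
U, V = np.eye(p), np.eye(q)
for _ in range(20):
    Vinv = np.linalg.inv(V)
    U = sum(Xi @ Vinv @ Xi.T for Xi in X) / (n * q)
    Uinv = np.linalg.inv(U)
    V = sum(Xi.T @ Uinv @ Xi for Xi in X) / (n * p)

Sigma_hat = np.kron(V, U)                 # identifiable overall covariance
Sigma_true = np.kron(V_true, U_true)
```

Each flip-flop update effectively pools $nq$ (respectively $np$) observations, which is why the factors can be estimated accurately even when $n$ alone would be small relative to $pq$.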

We show that for the problem of testing if a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model for which query access comes in the form of $\langle X_i, A\rangle := \mathrm{tr}(X_i^\top A)$; perhaps surprisingly, these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.
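Of the numerical properties mentioned, the stable rank is the simplest to state: $\|A\|_F^2 / \|A\|_2^2$, a smooth, perturbation-robust proxy for the rank that never exceeds it. A short reference computation (exact, not the testing algorithm):

```python
import numpy as np

def stable_rank(A):
    """Stable rank ||A||_F^2 / ||A||_2^2: at most rank(A), and robust
    to small perturbations, unlike the exact rank."""
    s = np.linalg.svd(A, compute_uv=False)   # singular values, descending
    return float(np.sum(s**2) / s[0]**2)

A = np.diag([4.0, 3.0])
# ||A||_F^2 = 25 and ||A||_2^2 = 16, so the stable rank is 25/16
```

Property testers in the bounded entry model aim to approximate this quantity from far fewer than $n^2$ entry reads.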

Implicit probabilistic models are defined naturally in terms of a sampling procedure and often induce a likelihood function that cannot be expressed explicitly. We develop a simple method for estimating parameters in implicit models that does not require knowledge of the form of the likelihood function or any derived quantities, yet can be shown to be equivalent to maximizing likelihood under some conditions. Our result holds in the non-asymptotic parametric setting, where both the capacity of the model and the number of data examples are finite. We also demonstrate encouraging experimental results.

This paper describes a suite of algorithms for constructing low-rank approximations of an input matrix from a random linear image of the matrix, called a sketch. These methods can preserve structural properties of the input matrix, such as positive-semidefiniteness, and they can produce approximations with a user-specified rank. The algorithms are simple, accurate, numerically stable, and provably correct. Moreover, each method is accompanied by an informative error bound that allows users to select parameters a priori to achieve a given approximation quality. These claims are supported by numerical experiments with real and synthetic data.
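The one-pass structure of such sketch-based methods can be sketched as follows: draw two random test matrices, record a range sketch $Y = A\Omega$ and a co-range sketch $W = \Psi A$ (the only accesses to $A$), then combine them as $Q\,(\Psi Q)^{+} W$ with $Q$ an orthonormal basis for the range of $Y$. This is a generic single-pass reconstruction in the spirit of the suite described; sketch sizes, distributions, and names are illustrative, and the paper's algorithms add refinements such as structure preservation and fixed-rank truncation.

```python
import numpy as np

def sketch_low_rank(A, k, s, rng):
    """One-pass low-rank approximation of A from two random sketches:
    a range sketch Y = A @ Omega and a co-range sketch W = Psi @ A,
    combined as Q @ pinv(Psi @ Q) @ W."""
    m, n = A.shape
    Omega = rng.normal(size=(n, k))       # range-sketch test matrix
    Psi = rng.normal(size=(s, m))         # co-range-sketch test matrix
    Y, W = A @ Omega, Psi @ A             # the only accesses to A
    Q, _ = np.linalg.qr(Y)                # orthonormal basis for range(Y)
    X, *_ = np.linalg.lstsq(Psi @ Q, W, rcond=None)
    return Q @ X                          # rank <= k approximation of A

rng = np.random.default_rng(0)
m, n, r = 60, 40, 5
A = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))   # exactly rank r
A_hat = sketch_low_rank(A, k=10, s=20, rng=rng)
```

When $A$ has exact rank $r \le k$, the reconstruction is exact up to floating-point error, which is the easy case behind the a priori error bounds for approximately low-rank inputs.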
