
We establish connections between invariant theory and maximum likelihood estimation for discrete statistical models. We show that norm minimization over a torus orbit is equivalent to maximum likelihood estimation in log-linear models. We use notions of stability under a torus action to characterize the existence of the maximum likelihood estimate, and discuss connections to scaling algorithms.
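As a minimal, self-contained sketch of maximum likelihood estimation in the simplest log-linear model (the two-way independence model; the setup is illustrative and not taken from the paper): the MLE of the cell probabilities is the outer product of the empirical marginals, which is also the fixed point that scaling algorithms converge to.

```python
import numpy as np

def independence_mle(counts):
    """MLE for the two-way independence model (a log-linear model):
    the fitted cell probabilities are the product of the empirical
    row and column marginals."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    row = counts.sum(axis=1) / n
    col = counts.sum(axis=0) / n
    return np.outer(row, col)

table = np.array([[10, 20], [30, 40]])
p_hat = independence_mle(table)
```

The fitted table matches the observed marginals exactly, the defining property of the MLE in a log-linear model.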

Related Content

In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of a probability distribution by maximizing a likelihood function, so that under the assumed statistical model the observed data is most probable. The point in the parameter space that maximizes the likelihood function is called the maximum likelihood estimate. The logic of maximum likelihood is both intuitive and flexible, and the method has therefore become a dominant means of statistical inference.
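The definition above can be made concrete with the textbook Bernoulli example (purely illustrative): maximize the log-likelihood of coin-flip data over a grid and compare with the closed-form MLE, which is the sample mean.

```python
import numpy as np

data = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])

def log_likelihood(p, x):
    """Bernoulli log-likelihood of the sample x at parameter p."""
    return np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

# Maximize over a grid; the maximizer should match the sample mean.
grid = np.linspace(0.01, 0.99, 99)
p_mle = grid[np.argmax([log_likelihood(p, data) for p in grid])]
```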

We consider the twin problems of estimating the effective rank and the Schatten norms $\|{\bf A}\|_{s}$ of a rectangular $p\times q$ matrix ${\bf A}$ from noisy observations. When $s$ is an even integer, we introduce a polynomial-time estimator of $\|{\bf A}\|_s$ that achieves the minimax rate $(pq)^{1/4}$. Interestingly, this optimal rate does not depend on the underlying rank of the matrix. When $s$ is not an even integer, the optimal rate is much slower. A simple thresholding estimator of the singular values achieves the rate $(q\wedge p)(pq)^{1/4}$, which turns out to be optimal up to a logarithmic multiplicative term. The tight minimax rate is achieved by a more involved polynomial approximation method. This allows us to build estimators for a class of effective rank indices. As a byproduct, we also characterize the minimax rate for estimating the sequence of singular values of a matrix.
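The special role of even $s$ can be illustrated directly (a sketch, not the paper's estimator): for even integers, $\|{\bf A}\|_s^s = \mathrm{tr}\big(({\bf A}^\top {\bf A})^{s/2}\big)$ is a polynomial in the entries of ${\bf A}$, computable without extracting singular values.

```python
import numpy as np

def schatten_even(A, s):
    """Schatten s-norm for even integer s via trace((A^T A)^{s/2}),
    avoiding an explicit singular value decomposition."""
    assert s % 2 == 0, "this identity holds only for even s"
    G = A.T @ A
    return np.trace(np.linalg.matrix_power(G, s // 2)) ** (1.0 / s)

A = np.array([[3.0, 0.0], [0.0, 4.0]])  # singular values 3 and 4
```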

Statistical models are central to machine learning with broad applicability across a range of downstream tasks. The models are typically controlled by free parameters that are estimated from data by maximum-likelihood estimation. However, when faced with real-world datasets many of the models run into a critical issue: they are formulated in terms of fully-observed data, whereas in practice the datasets are plagued with missing data. The theory of statistical model estimation from incomplete data is conceptually similar to the estimation of latent-variable models, where powerful tools such as variational inference (VI) exist. However, in contrast to standard latent-variable models, parameter estimation with incomplete data often requires estimating exponentially-many conditional distributions of the missing variables, hence making standard VI methods intractable. We address this gap by introducing variational Gibbs inference (VGI), a new general-purpose method to estimate the parameters of statistical models from incomplete data. We validate VGI on a set of synthetic and real-world estimation tasks, estimating important machine learning models, VAEs and normalising flows, from incomplete data. The proposed method, whilst general-purpose, achieves competitive or better performance than existing model-specific estimation methods.

Simulation-based inference with conditional neural density estimators is a powerful approach to solving inverse problems in science. However, these methods typically treat the underlying forward model as a black box, with no way to exploit geometric properties such as equivariances. Equivariances are common in scientific models, however integrating them directly into expressive inference networks (such as normalizing flows) is not straightforward. We here describe an alternative method to incorporate equivariances under joint transformations of parameters and data. Our method -- called group equivariant neural posterior estimation (GNPE) -- is based on self-consistently standardizing the "pose" of the data while estimating the posterior over parameters. It is architecture-independent, and applies both to exact and approximate equivariances. As a real-world application, we use GNPE for amortized inference of astrophysical binary black hole systems from gravitational-wave observations. We show that GNPE achieves state-of-the-art accuracy while reducing inference times by three orders of magnitude.

Inverse probability weighting (IPW) is widely used in many areas when data are subject to unrepresentativeness, missingness, or selection bias. An inevitable challenge with the use of IPW is that the IPW estimator can be remarkably unstable if some probabilities are very close to zero. To overcome this problem, at least three remedies have been developed in the literature: stabilizing, thresholding, and trimming. However, the final estimators are still IPW-type estimators, and inevitably inherit certain weaknesses of the naive IPW estimator: they may still be unstable or biased. We propose a biased-sample empirical likelihood weighting (ELW) method to serve the same general purpose as IPW, while completely overcoming the instability of IPW-type estimators by circumventing the use of inverse probabilities. The ELW weights are always well defined and easy to implement. We show theoretically that the ELW estimator is asymptotically normal and more efficient than the IPW estimator and its stabilized version for missing data problems and unequal probability sampling without replacement. Its asymptotic normality is also established under unequal probability sampling with replacement. Our simulation results and a real data analysis indicate that the ELW estimator is shift-equivariant, nearly unbiased, and usually outperforms the IPW-type estimators in terms of mean square error.
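For context, here is a minimal simulated sketch of the naive IPW (Horvitz-Thompson) estimator and its stabilized (Hajek) version, the baselines the abstract refers to; the data-generating process is illustrative and this is not the paper's ELW method.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
y = 2.0 + x + rng.normal(size=n)        # outcome; true mean E[y] = 2
prob = 1.0 / (1.0 + np.exp(-x))         # known observation probabilities
observed = rng.random(n) < prob         # missing-at-random mask

w = 1.0 / prob[observed]                # inverse probability weights
ipw_naive = np.sum(w * y[observed]) / n           # Horvitz-Thompson
ipw_stable = np.sum(w * y[observed]) / np.sum(w)  # stabilized (Hajek)
```

When some `prob` values approach zero the weights `w` explode, which is exactly the instability that motivates stabilizing, thresholding, trimming, and the proposed ELW alternative.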

This paper studies distributed binary testing of statistical independence under communication (information bits) constraints. While testing independence is very relevant in various applications, distributed independence testing is particularly useful for event detection in sensor networks, where data correlation often occurs among observations of devices in the presence of a signal of interest. By focusing on the case of two devices because of their tractability, we begin by investigating conditions on Type I error probability restrictions under which the minimum Type II error admits an exponential behavior with the sample size. Then, we study the finite sample-size regime of this problem. We derive new upper and lower bounds for the gap between the minimum Type II error and its exponential approximation under different setups, including restrictions imposed on the vanishing Type I error probability. Our theoretical results shed light on the sample-size regimes at which approximations of the Type II error probability via error exponents become informative enough, in the sense of predicting well the actual error probability. We finally discuss an application of our results where the gap is evaluated numerically, and we show that exponential approximations are not only tractable but also a valuable proxy for the Type II probability of error in the finite-length regime.
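As background for the error-exponent approximation discussed above: in the centralised, unconstrained setting, the Chernoff-Stein lemma gives the best Type II error exponent for testing a joint distribution against the product of its marginals as the mutual information $I(X;Y)$. A small sketch (the example distribution is illustrative):

```python
import numpy as np

def mutual_information(pxy):
    """Mutual information I(X;Y) in nats of a joint pmf given as a
    2-D array; equals D(P_XY || P_X x P_Y), the centralised Stein
    exponent for testing independence."""
    pxy = np.asarray(pxy, dtype=float)
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    mask = pxy > 0
    return np.sum(pxy[mask] * np.log(pxy[mask] / (px * py)[mask]))

pxy = np.array([[0.4, 0.1], [0.1, 0.4]])
mi = mutual_information(pxy)
# exponential approximation of the Type II error: beta_n ~ exp(-n * mi)
```

The paper's finite-length bounds quantify how quickly `exp(-n * mi)`-style approximations become accurate as the sample size grows.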

The Principle of Insufficient Reason (PIR) assigns equal probabilities to each alternative of a random experiment whenever there is no reason to prefer one over the other. The Maximum Entropy Principle (MaxEnt) generalizes PIR to the case where statistical information such as expectations is given. It is known that both principles result in paradoxical probability updates for joint distributions of cause and effect. This is because constraints on the conditional P(effect|cause) result in changes of P(cause) that assign higher probability to those values of the cause that offer more options for the effect, suggesting "intentional behaviour". Earlier work therefore suggested sequentially maximizing (conditional) entropy according to the causal order, but without further justification apart from plausibility on toy examples. We justify causal modifications of PIR and MaxEnt by separating constraints into restrictions for the cause and restrictions for the mechanism that generates the effect from the cause. We further sketch why Causal PIR also entails "Information Geometric Causal Inference". We briefly discuss problems of generalizing the causal version of MaxEnt to arbitrary causal DAGs.
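The non-causal MaxEnt principle the abstract builds on can be illustrated with the classic loaded-die example (illustrative, not from the paper): among all distributions on {1,...,6} with a prescribed mean, the entropy maximizer has the Gibbs form $p_i \propto e^{\lambda i}$, and with no constraint it reduces to PIR's uniform distribution.

```python
import numpy as np

def maxent_die(target_mean):
    """Maximum-entropy distribution on {1,...,6} with a given mean.
    The solution is p_i ~ exp(lam * i); solve for lam by bisection,
    since the mean is monotone increasing in lam."""
    faces = np.arange(1, 7)

    def mean_for(lam):
        w = np.exp(lam * faces)
        return (faces * w).sum() / w.sum()

    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if mean_for(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    w = np.exp(lo * faces)
    return w / w.sum()

p = maxent_die(4.5)
```

With `target_mean = 3.5` the constraint is vacuous and the result is uniform, recovering PIR as the special case of MaxEnt.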

Probabilistic linear discriminant analysis (PLDA) has been widely used in open-set verification tasks, such as speaker verification. A potential issue of this model is that the training set often contains a limited number of classes, which makes the estimation of the between-class variance unreliable. This unreliable estimation often leads to degraded generalization. In this paper, we present an MAP estimation for the between-class variance, by employing an Inverse-Wishart prior. A key problem is that with hierarchical models such as PLDA, the prior is placed on the variance of class means while the likelihood is based on class members, which makes the posterior inference intractable. We derive a simple MAP estimation for such a model, and test it in both PLDA scoring and length normalization. In both cases, the MAP-based estimation delivers interesting performance improvement.
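The flavour of MAP variance estimation under such a prior can be sketched in one dimension (the scalar Inverse-Gamma analogue of the Inverse-Wishart; this is a generic illustration, not the paper's PLDA derivation): the prior acts as pseudo-observations that regularize the variance estimate when data are scarce.

```python
import numpy as np

def map_variance(x, mu, alpha, beta):
    """MAP estimate of a Gaussian variance (mean mu known) under an
    Inverse-Gamma(alpha, beta) prior, the 1-D analogue of the
    Inverse-Wishart. The posterior is Inverse-Gamma(alpha + n/2,
    beta + ss/2), whose mode is (beta + ss/2) / (alpha + n/2 + 1)."""
    x = np.asarray(x, dtype=float)
    ss = np.sum((x - mu) ** 2)
    return (beta + 0.5 * ss) / (alpha + 0.5 * len(x) + 1.0)

rng = np.random.default_rng(0)
sample = rng.normal(0.0, 2.0, size=100_000)
est = map_variance(sample, 0.0, alpha=2.0, beta=1.0)
```

With abundant data the estimate approaches the sample variance; with few observations it is pulled toward the prior, which is the behaviour exploited for the between-class variance in PLDA.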

Implicit probabilistic models are models defined naturally in terms of a sampling procedure and often induce a likelihood function that cannot be expressed explicitly. We develop a simple method for estimating parameters in implicit models that does not require knowledge of the form of the likelihood function or any derived quantities, but can be shown to be equivalent to maximizing likelihood under some conditions. Our result holds in the non-asymptotic parametric setting, where both the capacity of the model and the number of data examples are finite. We also demonstrate encouraging experimental results.

We consider the task of learning the parameters of a {\em single} component of a mixture model, for the case when we are given {\em side information} about that component; we call this the "search problem" in mixture models. We would like to solve this with computational and sample complexity lower than solving the overall original problem, where one learns parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each one of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy and lower computational complexity than existing moment-based mixture model algorithms (e.g. tensor methods). We also illustrate several natural ways one can obtain such side information, for specific problem instances. Our experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms, showing significant improvement in runtime and accuracy.

We develop an approach to risk minimization and stochastic optimization that provides a convex surrogate for variance, allowing near-optimal and computationally efficient trading between approximation and estimation error. Our approach builds on techniques for distributionally robust optimization and Owen's empirical likelihood, and we provide a number of finite-sample and asymptotic results characterizing the theoretical performance of the estimator. In particular, we show that our procedure comes with certificates of optimality, achieving (in some scenarios) faster rates of convergence than empirical risk minimization by virtue of automatically balancing bias and variance. We give corroborating empirical evidence showing that in practice, the estimator indeed trades between variance and absolute performance on a training sample, improving out-of-sample (test) performance over standard empirical risk minimization for a number of classification problems.
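The bias-variance trading described above can be caricatured with a toy penalized objective (a sketch under simplifying assumptions, not the paper's convex surrogate): minimize the empirical mean of the loss plus a multiple of its standard error, so that parameters with erratic losses are penalized.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200)

def penalized_risk(theta, x, lam=1.0):
    """Empirical risk plus a standard-error-style variance penalty,
    a crude stand-in for variance-regularized risk minimization."""
    loss = (x - theta) ** 2
    return loss.mean() + lam * np.sqrt(loss.var() / len(x))

# Minimize over a grid of candidate location parameters.
grid = np.linspace(-1.0, 1.0, 201)
theta_hat = grid[np.argmin([penalized_risk(t, x) for t in grid])]
```

Unlike this grid sketch, the paper's formulation keeps the penalized objective convex, which is what makes the approach computationally efficient at scale.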
