
The Fisher information matrix is a quantity of fundamental importance for information geometry and asymptotic statistics. In practice, it is widely used to quickly estimate the expected information available in a data set and to guide experimental design choices. In many modern applications, it is intractable to analytically compute the Fisher information and Monte Carlo methods are used instead. The standard Monte Carlo method produces estimates of the Fisher information that can be biased when the Monte Carlo noise is non-negligible. Most problematic is noise in the derivatives, as this leads to an overestimation of the available constraining power, given by the inverse Fisher information. In this work we find another simple estimate that is oppositely biased and produces an underestimate of the constraining power. This estimator can either be used to give approximate bounds on the parameter constraints or be combined with the standard estimator to give improved, approximately unbiased estimates. Both the alternative and the combined estimators are asymptotically unbiased, so they can also be used as a convergence check of the standard approach. We discuss potential limitations of these estimators and provide methods to assess their reliability. These methods accelerate the convergence of Fisher forecasts, as unbiased estimates can be achieved with fewer Monte Carlo samples, and so can be used to reduce the simulated data set size by several orders of magnitude.
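The bias described above can be reproduced in a toy setting. The following is a minimal sketch, not the paper's estimator: it assumes a one-parameter Gaussian model with mean `theta * a` and identity covariance, and estimates the derivative of the mean by central finite differences of noisy simulations. The function name `standard_fisher_estimate` and all parameter values are illustrative assumptions.

```python
import numpy as np

def standard_fisher_estimate(rng, a, n_sims, h=0.1, theta0=1.0):
    """Standard Monte Carlo Fisher estimate for the toy model d ~ N(theta * a, I).

    The derivative of the mean is estimated by central finite differences of
    independently simulated data, so it carries Monte Carlo noise.
    """
    dim = a.size
    plus = (theta0 + h) * a + rng.standard_normal((n_sims, dim))
    minus = (theta0 - h) * a + rng.standard_normal((n_sims, dim))
    dmu = (plus - minus).mean(axis=0) / (2.0 * h)  # noisy derivative of the mean
    return float(dmu @ dmu)  # covariance is the identity, so F = dmu^T dmu

rng = np.random.default_rng(0)
a = np.ones(20)
true_fisher = float(a @ a)  # exact Fisher information of the toy model

# Average the estimator over repeated experiments to expose its bias.
few = np.mean([standard_fisher_estimate(rng, a, 10) for _ in range(200)])
many = np.mean([standard_fisher_estimate(rng, a, 2000) for _ in range(50)])
print(f"true: {true_fisher:.1f}  few sims: {few:.1f}  many sims: {many:.1f}")
```

Because the squared noise in the derivative adds a positive term to the expectation, the estimate with few simulations sits well above the true value, and the excess shrinks as the number of simulations grows.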

Related content

Estimation / Estimator


Estimation/Estimator · Overfitting · Biased · Maximum likelihood · MoDELS

July 3, 2023

Correction of overfitting bias in regression models

Emanuele Massa, Marianne Jonker, Kit Roes, Anthony Coolen

from arxiv, 6 figures, 38 pages including appendices

Regression analysis based on many covariates is becoming increasingly common. However, when the number of covariates $p$ is of the same order as the number of observations $n$, statistical protocols like maximum likelihood estimation of regression and nuisance parameters become unreliable due to overfitting. Overfitting typically leads to systematic estimation biases and to increased estimator variances. It is crucial to be able to correctly quantify these effects, for inference and prediction purposes. In the literature, several methods have been proposed to overcome overfitting bias or adjust estimates. The vast majority of these focus on the regression parameters only, either via empirical regularization methods or by expansion for small ratios $p/n$. Failing to also estimate the nuisance parameters correctly may lead to significant errors in outcome predictions. In this paper we use the leave-one-out method to derive the compact set of non-linear equations for the overfitting biases of maximum likelihood (ML) estimators in parametric regression models, as obtained previously using the replica method. We show that these equations enable one to correct regression and nuisance parameter estimators, and make them asymptotically unbiased. To illustrate the theory we performed simulation studies for multiple regression models. In all cases we find excellent agreement between theory and simulations.

Functional · Maximum a posteriori estimation · Noise · Maximum a posteriori · Estimation/Estimator

July 3, 2023

Are minimizers of the Onsager-Machlup functional strong posterior modes?

Remo Kretschmann

In this work we connect two notions: that of the nonparametric mode of a probability measure, defined by asymptotic small ball probabilities, and that of the Onsager-Machlup functional, a generalized density also defined via asymptotic small ball probabilities. We show that in a separable Hilbert space setting and under mild conditions on the likelihood, modes of a Bayesian posterior distribution based upon a Gaussian prior exist and agree with the minimizers of its Onsager-Machlup functional and thus also with weak posterior modes. We apply this result to inverse problems and derive conditions on the forward mapping under which this variational characterization of posterior modes holds. Our results show rigorously that in the limit case of infinite-dimensional data corrupted by additive Gaussian or Laplacian noise, nonparametric maximum a posteriori estimation is equivalent to Tikhonov-Phillips regularization. In comparison with the work of Dashti, Law, Stuart, and Voss (2013), the assumptions on the likelihood are relaxed so that they cover in particular the important case of white Gaussian process noise. We illustrate our results by applying them to a severely ill-posed linear problem with Laplacian noise, where we express the maximum a posteriori estimator analytically and study its rate of convergence in the small noise limit.

Estimation/Estimator · Functional · Correlation coefficient · Scenario · Probability density function

July 2, 2023

hermiter: R package for Sequential Nonparametric Estimation

Michael Stephanou, Melvin Varughese

from arxiv, 33 pages plus references, 11 figures. Incorporates journal reviewer and editor comments. As appears in Computational Statistics

This article introduces the R package hermiter which facilitates estimation of univariate and bivariate probability density functions and cumulative distribution functions along with full quantile functions (univariate) and nonparametric correlation coefficients (bivariate) using Hermite series based estimators. The algorithms implemented in the hermiter package are particularly useful in the sequential setting (both stationary and non-stationary) and one-pass batch estimation setting for large data sets. In addition, the Hermite series based estimators are approximately mergeable allowing parallel and distributed estimation.
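To make the idea of a Hermite series density estimator concrete, here is a minimal Python sketch. It is not the hermiter API: the function names are hypothetical, and the sketch omits the standardization and other refinements a real implementation would need. It assumes the textbook construction in which each coefficient is the sample mean of a normalized Hermite function, so that coefficients from two batches can be merged by a weighted average.

```python
import numpy as np

def hermite_functions(x, order):
    """Normalized Hermite functions h_0..h_order evaluated at x (stable recurrence)."""
    x = np.asarray(x, dtype=float)
    h = np.empty((order + 1,) + x.shape)
    h[0] = np.pi ** -0.25 * np.exp(-0.5 * x**2)
    if order >= 1:
        h[1] = np.sqrt(2.0) * x * h[0]
    for k in range(1, order):
        h[k + 1] = (np.sqrt(2.0 / (k + 1)) * x * h[k]
                    - np.sqrt(k / (k + 1)) * h[k - 1])
    return h

def fit_coefficients(samples, order):
    """Each coefficient is the sample mean of the matching Hermite function."""
    return hermite_functions(samples, order).mean(axis=1)

def merge(coef_a, n_a, coef_b, n_b):
    """Combine two coefficient vectors fitted on batches of size n_a and n_b."""
    return (n_a * coef_a + n_b * coef_b) / (n_a + n_b)

def density(coef, x):
    """Evaluate the truncated Hermite series density estimate at x."""
    return coef @ hermite_functions(x, coef.size - 1)

rng = np.random.default_rng(1)
samples = rng.standard_normal(5000)
coef_full = fit_coefficients(samples, order=10)
coef_merged = merge(fit_coefficients(samples[:2000], 10), 2000,
                    fit_coefficients(samples[2000:], 10), 3000)
print(density(coef_full, 0.0), density(coef_merged, 0.0))
```

In this simplified form the merge is exact, because the coefficients are plain sample means. The abstract describes the package's estimators as only approximately mergeable, which this sketch does not attempt to reproduce.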

Sample · Discretization · Biased · Approximation · Scenario

June 30, 2023

Proximal Langevin Sampling With Inexact Proximal Mapping

Matthias J. Ehrhardt, Lorenz Kuger, Carola-Bibiane Schönlieb

from arxiv, 24 pages, 6 figures

In order to solve tasks like uncertainty quantification or hypothesis tests in Bayesian imaging inverse problems, we often have to draw samples from the arising posterior distribution. For the usually log-concave but high-dimensional posteriors, Markov chain Monte Carlo methods based on time discretizations of Langevin diffusion are a popular tool. If the potential defining the distribution is non-smooth, these discretizations are usually of an implicit form leading to Langevin sampling algorithms that require the evaluation of proximal operators. For some of the potentials relevant in imaging problems this is only possible approximately using an iterative scheme. We investigate the behaviour of a proximal Langevin algorithm in the presence of errors in the evaluation of proximal mappings. We generalize existing non-asymptotic and asymptotic convergence results of the exact algorithm to our inexact setting and quantify the bias between the target and the algorithm's stationary distribution due to the errors. We show that the additional bias stays bounded for bounded errors and converges to zero for decaying errors in a strongly convex setting. We apply the inexact algorithm to sample numerically from the posterior of typical imaging inverse problems in which we can only approximate the proximal operator by an iterative scheme and validate our theoretical convergence results.
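The effect of an inexact proximal mapping can be illustrated with a one-dimensional toy sampler. This is a sketch under stated assumptions, not the algorithm analyzed in the paper: it assumes a proximal-gradient Langevin step for the target density proportional to exp(-x^2/2 - |x|), and it mimics the prox error with a constant offset `prox_error`. All names and parameter values are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Exact proximal mapping of t * |x|."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def proximal_langevin(n_steps, step, prox_error=0.0, seed=0):
    """Toy proximal-gradient Langevin chain for exp(-x**2 / 2 - |x|).

    prox_error is a constant perturbation added to the prox output; it stands
    in for the error of an iterative prox solver (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    x = 0.0
    chain = np.empty(n_steps)
    for k in range(n_steps):
        forward = x - step * x                          # gradient step on x**2 / 2
        x = soft_threshold(forward, step) + prox_error  # (perturbed) prox of |x|
        x += np.sqrt(2.0 * step) * rng.standard_normal()
        chain[k] = x
    return chain

step = 0.05
exact = proximal_langevin(20000, step)
inexact = proximal_langevin(20000, step, prox_error=0.02)
shift = inexact.mean() - exact.mean()
print(f"exact mean: {exact.mean():.3f}  shift from prox error: {shift:.3f}")
```

Both chains share the same noise, so the shift isolates the effect of the prox error. Since each update contracts differences by a factor of at most `1 - step`, the shift cannot exceed `prox_error / step`, a toy analogue of the bounded additional bias in the strongly convex setting.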

Model selection · MoDELS · Transfer learning · Learning · Class

June 30, 2023

Limits of Model Selection under Transfer Learning

Steve Hanneke, Samory Kpotufe, Yasaman Mahdaviyeh

from arxiv, Accepted for presentation at the Conference on Learning Theory (COLT) 2023

Theoretical studies on transfer learning or domain adaptation have so far focused on situations with a known hypothesis class or model; however in practice, some amount of model selection is usually involved, often appearing under the umbrella term of hyperparameter-tuning: for example, one may think of the problem of tuning for the right neural network architecture towards a target task, while leveraging data from a related source task. Now, in addition to the usual tradeoffs on approximation vs estimation errors involved in model selection, this problem brings in a new complexity term, namely, the transfer distance between source and target distributions, which is known to vary with the choice of hypothesis class. We present a first study of this problem, focusing on classification; in particular, the analysis reveals some remarkable phenomena: adaptive rates, i.e., those achievable with no distributional information, can be arbitrarily slower than oracle rates, i.e., when given knowledge on distances.

Estimation/Estimator · Networking · Learning · INFORMS · Deep learning

June 30, 2023

Uncertainty Estimation by Fisher Information-based Evidential Deep Learning

Danruo Deng, Guangyong Chen, Yang Yu, Furui Liu, Pheng-Ann Heng

from arxiv, ICML2023

Uncertainty estimation is a key factor that makes deep learning reliable in practical applications. Recently proposed evidential neural networks explicitly account for different uncertainties by treating the network's outputs as evidence to parameterize the Dirichlet distribution, and achieve impressive performance in uncertainty estimation. However, for samples with high data uncertainty that are nonetheless annotated with a one-hot label, the evidence-learning process for the mislabeled classes is over-penalized and remains hindered. To address this problem, we propose a novel method, Fisher Information-based Evidential Deep Learning ($\mathcal{I}$-EDL). In particular, we introduce the Fisher Information Matrix (FIM) to measure the informativeness of the evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network focus more on the representation learning of uncertain classes. The generalization ability of our network is further improved by optimizing the PAC-Bayesian bound. As demonstrated empirically, our proposed method consistently outperforms traditional EDL-related algorithms in multiple uncertainty estimation tasks, especially in the more challenging few-shot classification settings.

Statistic · Divergence · Approximation · UniFormer · CASES

June 29, 2023

Poisson and Gaussian approximations of the power divergence family of statistics

Fraser Daly

from arxiv, 14 pages

Consider the family of power divergence statistics based on $n$ trials, each leading to one of $r$ possible outcomes. This includes the log-likelihood ratio and Pearson's statistic as important special cases. It is known that in certain regimes (e.g., when $r$ is of order $n^2$ and the allocation is asymptotically uniform as $n\to\infty$) the power divergence statistic converges in distribution to a linear transformation of a Poisson random variable. We establish explicit error bounds in the Kolmogorov (or uniform) metric to complement this convergence result, which may be applied for any values of $n$, $r$ and the index parameter $\lambda$ for which such a finite-sample bound is meaningful. We further use this Poisson approximation result to derive error bounds in Gaussian approximation of the power divergence statistics.
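For readers who want to see the family in concrete form, the sketch below computes a power divergence statistic using the standard Cressie-Read definition, which the abstract does not spell out. The function name and the example counts are illustrative assumptions. The index $\lambda = 1$ gives Pearson's statistic and the limit $\lambda \to 0$ gives the log-likelihood ratio statistic.

```python
import numpy as np

def power_divergence(observed, expected, lam):
    """Cressie-Read power divergence statistic with index lam.

    Assumes observed and expected counts have the same total.
    The limiting case lam == -1 is not handled in this sketch.
    """
    o = np.asarray(observed, dtype=float)
    e = np.asarray(expected, dtype=float)
    if lam == -1:
        raise ValueError("lam = -1 requires a separate limit, not implemented here")
    if lam == 0:
        # Limit lam -> 0: log-likelihood ratio statistic (empty cells contribute 0).
        nz = o > 0
        return 2.0 * np.sum(o[nz] * np.log(o[nz] / e[nz]))
    return 2.0 / (lam * (lam + 1.0)) * np.sum(o * ((o / e) ** lam - 1.0))

observed = np.array([18, 22, 30, 30])   # n = 100 trials, r = 4 outcomes
expected = np.full(4, 25.0)             # uniform allocation

pearson = np.sum((observed - expected) ** 2 / expected)
print(power_divergence(observed, expected, 1.0), pearson)
print(power_divergence(observed, expected, 0.0))
```

The first printed pair should agree, since the $\lambda = 1$ member reduces algebraically to Pearson's statistic when the observed and expected totals match.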

Approximation · Reducible · Mutually independent · Performer · Linear

June 29, 2023

Parallel approximation of the exponential of Hermitian matrices

Frédéric Hecht, Sidi-Mahmoud Kaber, Lucas Perrin, Alain Plagne, Julien Salomon

In this work, we consider a rational approximation of the exponential function to design an algorithm for computing the matrix exponential in the Hermitian case. Using partial fraction decomposition, we obtain a parallelizable method, where the computation reduces to independently solving linear systems. We analyze the effects of rounding errors on the accuracy of our algorithm. We complete this work with numerical tests showing the efficiency of our method and a comparison of its performance with Krylov algorithms.

INFORMS · Estimation/Estimator · Mean · Variance · Smoothing

June 28, 2023

Finite-Sample Symmetric Mean Estimation with Fisher Information Rate

Shivam Gupta, Jasper C. H. Lee, Eric Price

from arxiv, COLT 2023

The mean of an unknown variance-$\sigma^2$ distribution $f$ can be estimated from $n$ samples with variance $\frac{\sigma^2}{n}$ and nearly corresponding subgaussian rate. When $f$ is known up to translation, this can be improved asymptotically to $\frac{1}{n\mathcal I}$, where $\mathcal I$ is the Fisher information of the distribution. Such an improvement is not possible for general unknown $f$, but [Stone, 1975] showed that this asymptotic convergence $\textit{is}$ possible if $f$ is $\textit{symmetric}$ about its mean. Stone's bound is asymptotic, however: the $n$ required for convergence depends in an unspecified way on the distribution $f$ and failure probability $\delta$. In this paper we give finite-sample guarantees for symmetric mean estimation in terms of Fisher information. For every $f, n, \delta$ with $n > \log \frac{1}{\delta}$, we get convergence close to a subgaussian with variance $\frac{1}{n \mathcal I_r}$, where $\mathcal I_r$ is the $r$-$\textit{smoothed}$ Fisher information with smoothing radius $r$ that decays polynomially in $n$. Such a bound essentially matches the finite-sample guarantees in the known-$f$ setting.
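A short worked example may help show how large the gap between the two rates can be. The Laplace distribution is chosen here purely for illustration and is not taken from the abstract. For location parameter $\mu$ and scale $b$, the score is $\pm 1/b$ almost everywhere, which gives:

```latex
f(x) = \frac{1}{2b}\exp\!\left(-\frac{|x-\mu|}{b}\right), \qquad
\sigma^2 = 2b^2, \qquad
\mathcal I = \int \frac{f'(x)^2}{f(x)}\,dx = \frac{1}{b^2}
\qquad\Longrightarrow\qquad
\frac{\sigma^2}{n} = \frac{2b^2}{n}
\quad\text{versus}\quad
\frac{1}{n\mathcal I} = \frac{b^2}{n}.
```

In this case the Fisher information rate is a factor of two better than the variance of the sample mean.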

Estimation/Estimator · 3D · Full · Shape · Ground truth

March 3, 2019

3D Hand Shape and Pose Estimation from a Single RGB Image

Liuhao Ge, Zhou Ren, Yuncheng Li, Zehao Xue, Yingying Wang, Jianfei Cai, Junsong Yuan

from arxiv, CVPR 2019 (Oral), //sites.google.com/site/geliuhaontu/home/cvpr2019

This work addresses a novel and challenging problem of estimating the full 3D hand shape and pose from a single RGB image. Most current methods in 3D hand analysis from monocular RGB images only focus on estimating the 3D locations of hand keypoints, which cannot fully express the 3D shape of the hand. In contrast, we propose a Graph Convolutional Neural Network (Graph CNN) based method to reconstruct a full 3D mesh of the hand surface that contains richer information about both 3D hand shape and pose. To train networks with full supervision, we create a large-scale synthetic dataset containing both ground truth 3D meshes and 3D poses. When fine-tuning the networks on real-world datasets without 3D ground truth, we propose a weakly-supervised approach that leverages the depth map as weak supervision in training. Through extensive evaluations on our proposed new datasets and two public datasets, we show that our proposed method can produce accurate and reasonable 3D hand meshes, and can achieve superior 3D hand pose estimation accuracy when compared with state-of-the-art methods.