We study the distribution of the maximum likelihood estimate (MLE) in high-dimensional logistic models, extending the recent results from Sur (2019) to the case where the Gaussian covariates may have an arbitrary covariance structure. We prove that in the limit of large problems holding the ratio between the number $p$ of covariates and the sample size $n$ constant, every finite list of MLE coordinates follows a multivariate normal distribution. Concretely, the $j$th coordinate $\hat {\beta}_j$ of the MLE is asymptotically normally distributed with mean $\alpha_\star \beta_j$ and standard deviation $\sigma_\star/\tau_j$; here, $\beta_j$ is the value of the true regression coefficient, and $\tau_j$ the standard deviation of the $j$th predictor conditional on all the others. The numerical parameters $\alpha_\star > 1$ and $\sigma_\star$ only depend upon the problem dimensionality $p/n$ and the overall signal strength, and can be accurately estimated. Our results imply that the MLE's magnitude is biased upwards and that the MLE's standard deviation is greater than that predicted by classical theory. We present a series of experiments on simulated and real data showing excellent agreement with the theory.

Related content

Maximum likelihood estimation

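Maximum likelihood estimation (MLE) is a method for obtaining parameter estimates. Maximum likelihood was first proposed by the German mathematician C. F. Gauss in 1821, but the method is usually credited to the British statistician R. A. Fisher. It is a statistical method built on the maximum likelihood principle. The intuition behind the principle is as follows: suppose a random experiment has several possible outcomes A, B, C, ...; if outcome A occurs in a single trial, we may conclude that the experimental conditions favor A, that is, the probability P(A) is relatively large. An example illustrates the idea. Box 1 contains 99 white balls and 1 black ball; box 2 contains 1 white ball and 99 black balls. One box is chosen at random, and a ball is drawn at random from it; the ball is black. A black ball is far more likely to come from box 2 than from box 1, so we naturally believe that this ball was drawn from box 2. In general, the probability of an event A depends on an unknown parameter $\theta$, and different values of $\theta$ give different probabilities $P(A \mid \theta)$. When A occurs in a single trial, we take $\theta$ to be the value, among all its possible values, that maximizes $P(A \mid \theta)$. Maximum likelihood estimation selects this value as the estimate of $\theta$, so that the observed sample has the greatest possible probability of occurring in the chosen population.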
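The two-box example can be checked with a few lines of Python. This is a minimal sketch: the names `likelihood_black` and `maximum_likelihood_choice` are made up for illustration, and the two boxes play the role of the candidate parameter values.

```python
from fractions import Fraction

# Probability of drawing a black ball from each box, taken from the example.
likelihood_black = {
    "box 1": Fraction(1, 100),   # 99 white balls, 1 black ball
    "box 2": Fraction(99, 100),  # 1 white ball, 99 black balls
}


def maximum_likelihood_choice(likelihoods):
    """Return the candidate under which the observed outcome is most probable."""
    return max(likelihoods, key=likelihoods.get)


best = maximum_likelihood_choice(likelihood_black)
ratio = likelihood_black["box 2"] / likelihood_black["box 1"]
print(best, ratio)
```

The ratio of the two likelihoods shows how strongly the observation favors one box over the other.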

Estimation/estimator · motivation · error rate · functional · sample

March 3, 2023

Rate adaptive estimation of the center of a symmetric distribution

Yu-Chun Kao, Min Xu, Cun-Hui Zhang

from arxiv, 32 pages; 7 figures

Given univariate random variables $Y_1, \ldots, Y_n$ with the $\text{Uniform}(\theta_0 - 1, \theta_0 + 1)$ distribution, the sample midrange $\frac{Y_{(n)}+Y_{(1)}}{2}$ is the MLE for $\theta_0$ and estimates $\theta_0$ with error of order $1/n$, which is much smaller than the $1/\sqrt{n}$ error rate of the usual sample mean estimator. However, the sample midrange performs poorly when the data has, say, the Gaussian $N(\theta_0, 1)$ distribution, with an error rate of $1/\sqrt{\log n}$. In this paper, we propose an estimator of the location $\theta_0$ with a rate of convergence that can, in many settings, adapt to the underlying distribution, which we assume to be symmetric around $\theta_0$ but otherwise unknown. When the underlying distribution is compactly supported, we show that our estimator attains a rate of convergence of $n^{-\frac{1}{\alpha}}$ up to polylog factors, where the rate parameter $\alpha$ can take on any value in $(0, 2]$ and depends on the moments of the underlying distribution. Our estimator is formed by the $\ell^\gamma$-center of the data, for a $\gamma\geq2$ chosen in a data-driven way, namely by minimizing a criterion motivated by the asymptotic variance. Our approach can be directly applied to the regression setting where $\theta_0$ is a function of observed features and motivates the use of the $\ell^\gamma$ loss function for $\gamma > 2$ in certain settings.
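The gap between the $1/n$ and $1/\sqrt{n}$ rates on uniform data can be seen in a small simulation. The sketch below compares only the sample midrange with the sample mean; it does not implement the paper's $\ell^\gamma$-center estimator. The values of `theta0`, `n`, `trials`, and the seed are arbitrary choices for illustration.

```python
import random
import statistics


def midrange(sample):
    """Average of the smallest and largest observation."""
    return (min(sample) + max(sample)) / 2


def rmse(estimator, theta0, n, trials, rng):
    """Root mean squared error of `estimator` on Uniform(theta0 - 1, theta0 + 1) samples."""
    squared_errors = []
    for _ in range(trials):
        sample = [rng.uniform(theta0 - 1, theta0 + 1) for _ in range(n)]
        squared_errors.append((estimator(sample) - theta0) ** 2)
    return statistics.fmean(squared_errors) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible
theta0, n, trials = 3.0, 1000, 200

rmse_midrange = rmse(midrange, theta0, n, trials, rng)
rmse_mean = rmse(statistics.fmean, theta0, n, trials, rng)
print(f"midrange RMSE: {rmse_midrange:.5f}")
print(f"mean RMSE:     {rmse_mean:.5f}")
```

At this sample size the midrange error should come out roughly an order of magnitude smaller than the mean error, consistent with the two rates.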

Functional · state-of-the-art · sample · commentator · Analysis

March 3, 2023

VRA: Out-of-Distribution Detection with variational rectified activations

Mingyu Xu, Zheng Lian

Detecting out-of-distribution (OOD) data is critical to building reliable machine learning systems in the open world. Among the existing OOD detection methods, ReAct is famous for its simplicity and efficiency, and has good theoretical analysis. The gap between ID data and OOD data is enlarged by clipping the larger activation values. But is this operation optimal? Is there a better way to expand the spacing between ID samples and OOD samples in theory? Driven by these questions, we view the optimal activation function modification from the perspective of functional extremum and propose the Variational Rectified Activations (VRA) method. To make our method easy to apply in practice, we further propose several VRA variants. To verify the effectiveness of our method, we conduct experiments on many benchmark datasets. Experimental results demonstrate that our method outperforms existing state-of-the-art approaches. Meanwhile, our method is easy to implement and does not require additional OOD data or a fine-tuning process. We can realize OOD detection in only one forward pass.
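The clipping step attributed to ReAct can be sketched as follows. This shows only the baseline clipping, not VRA, whose exact form the abstract does not give. The feature values, the threshold, and the use of a plain sum as a detection score are all hypothetical and chosen only to make the effect visible.

```python
def react_clip(activations, threshold):
    """Clip each activation from above at `threshold` (ReAct-style rectification)."""
    return [min(a, threshold) for a in activations]


def score(activations):
    """Hypothetical stand-in for a detection score: the sum of activations."""
    return sum(activations)


# Toy feature vectors (assumed): ID features are moderate everywhere,
# the OOD features contain one abnormally large activation.
id_features = [0.8, 1.1, 0.9, 1.0]
ood_features = [0.1, 4.0, 0.2, 0.1]
threshold = 1.0

gap_before = score(id_features) - score(ood_features)
gap_after = (score(react_clip(id_features, threshold))
             - score(react_clip(ood_features, threshold)))
print(gap_before, gap_after)
```

In this toy case, clipping removes the large OOD spike while barely changing the ID features, so the ID score ends up above the OOD score.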

Markov · MoDELS · estimation/estimator · inference · sparsity

March 3, 2023

Sparse Markov Models for High-dimensional Inference

Guilherme Ost, Daniel Takahashi

Finite order Markov models are theoretically well-studied models for dependent discrete data. Despite their generality, they are rarely applied in empirical work when the order is large. Practitioners avoid using higher order Markov models because (1) the number of parameters grows exponentially with the order and (2) the interpretation is often difficult. Mixture of transition distribution (MTD) models were introduced to overcome both limitations. MTD models represent higher order Markov models as a convex mixture of single step Markov chains, reducing the number of parameters and increasing the interpretability. Nevertheless, in practice, estimation of MTD models with large orders is still limited because of the curse of dimensionality and high algorithmic complexity. Here, we prove that if only a few lags are relevant, we can consistently and efficiently recover the lags and estimate the transition probabilities of high-dimensional MTD models. The key innovation is a recursive procedure for the selection of the relevant lags of the model. Our results are based on (1) a new structural result of the MTD and (2) an improved martingale concentration inequality. We illustrate our method using simulations and weather data.
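The convex-mixture structure of an MTD model can be sketched in a few lines. The sketch assumes one transition matrix per lag, and the weights, matrices, and helper names are made up for illustration. It evaluates the conditional probability and compares parameter counts; it does not implement the lag-selection procedure from the paper.

```python
def mtd_probability(next_state, history, weights, transitions):
    """P(next_state | history) as a convex mixture of single-step chains.

    history[j] is the state observed j + 1 steps ago, weights[j] is the
    mixture weight of lag j + 1, and transitions[j] is that lag's
    row-stochastic transition matrix.
    """
    return sum(
        w * transitions[j][history[j]][next_state]
        for j, w in enumerate(weights)
    )


def full_markov_parameters(num_states, order):
    """Free parameters of an unrestricted Markov chain of the given order."""
    return num_states ** order * (num_states - 1)


def mtd_parameters(num_states, order):
    """Free parameters of an MTD model with one transition matrix per lag."""
    return order * num_states * (num_states - 1) + (order - 1)


# Binary chain of order 3 in which lag 2 is irrelevant (weight 0).
weights = [0.7, 0.0, 0.3]
transitions = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.5, 0.5]],
    [[0.5, 0.5], [0.1, 0.9]],
]
history = [0, 1, 1]  # most recent state first

p_one = mtd_probability(1, history, weights, transitions)
p_zero = mtd_probability(0, history, weights, transitions)
print(p_zero, p_one)
print(full_markov_parameters(2, 10), mtd_parameters(2, 10))
```

A zero weight marks a lag as irrelevant, which is the kind of sparsity the abstract refers to.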

Analysis · linear regression · linear · ridge regression · Performer

March 2, 2023

High-dimensional analysis of double descent for linear regression with random projections

Francis Bach

We consider linear regression problems with a varying number of random projections, where we provably exhibit a double descent curve for a fixed prediction problem, with a high-dimensional analysis based on random matrix theory. We first consider the ridge regression estimator and re-interpret earlier results using classical notions from non-parametric statistics, namely degrees of freedom, also known as effective dimensionality. In particular, we show that the random design performance of ridge regression with a specific regularization parameter matches the classical bias and variance expressions coming from the easier fixed design analysis but for another larger implicit regularization parameter. We then compute asymptotic equivalents of the generalization performance (in terms of bias and variance) of the minimum norm least-squares fit with random projections, providing simple expressions for the double descent phenomenon.
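The degrees of freedom mentioned above have a simple closed form in the classical ridge setting. The sketch below uses the common definition $\mathrm{df}(\lambda) = \sum_i s_i / (s_i + \lambda)$, where $s_i$ are the eigenvalues of the covariance matrix. The paper's exact normalization may differ, and the eigenvalues here are made up.

```python
def degrees_of_freedom(eigenvalues, lam):
    """Effective dimensionality of ridge regression: sum of s / (s + lam)."""
    return sum(s / (s + lam) for s in eigenvalues)


eigenvalues = [4.0, 1.0, 0.25]  # hypothetical covariance spectrum

df_none = degrees_of_freedom(eigenvalues, 0.0)    # no regularization
df_mid = degrees_of_freedom(eigenvalues, 1.0)
df_heavy = degrees_of_freedom(eigenvalues, 100.0)
print(df_none, df_mid, df_heavy)
```

Without regularization the degrees of freedom equal the number of nonzero eigenvalues, and they shrink toward zero as the regularization parameter grows.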

Noise distribution · noise · statistic · functional · probability density function

March 2, 2023

Pitfalls of Gaussians as a noise distribution in NCE

Holden Lee, Chirag Pabbaraju, Anish Sevekari, Andrej Risteski

from arxiv, 14 pages, 1 figure

Noise Contrastive Estimation (NCE) is a popular approach for learning probability density functions parameterized up to a constant of proportionality. The main idea is to design a classification problem for distinguishing training data from samples from an easy-to-sample noise distribution $q$, in a manner that avoids having to calculate a partition function. It is well-known that the choice of $q$ can severely impact the computational and statistical efficiency of NCE. In practice, a common choice for $q$ is a Gaussian which matches the mean and covariance of the data. In this paper, we show that such a choice can result in an exponentially bad (in the ambient dimension) conditioning of the Hessian of the loss, even for very simple data distributions. As a consequence, both the statistical and algorithmic complexity for such a choice of $q$ will be problematic in practice, suggesting that more complex noise distributions are essential to the success of NCE.
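The classification problem behind NCE can be sketched as a logistic loss whose logit is the difference between the model's log-density and the noise log-density. This is a minimal sketch assuming equal numbers of data and noise samples. The function names and the one-dimensional Gaussian example are hypothetical, and in practice `log_model` would include a learned log-normalization constant.

```python
import math


def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def nce_loss(data, noise, log_model, log_noise):
    """Binary classification loss: data points are labeled 1, noise points 0."""
    data_term = sum(
        log_sigmoid(log_model(x) - log_noise(x)) for x in data
    ) / len(data)
    # log(1 - sigmoid(z)) equals log_sigmoid(-z)
    noise_term = sum(
        log_sigmoid(log_noise(y) - log_model(y)) for y in noise
    ) / len(noise)
    return -(data_term + noise_term)


def gaussian_log_density(mean, std):
    """Return the log-density function of a one-dimensional Gaussian."""
    def log_density(x):
        z = (x - mean) / std
        return -0.5 * z * z - math.log(std) - 0.5 * math.log(2 * math.pi)
    return log_density


data = [0.3, -1.2, 0.8, 2.1]
noise = [-0.5, 1.7, 0.0, -2.2]
log_q = gaussian_log_density(0.0, 1.0)

# If the model equals the noise distribution, every logit is zero and the
# classifier can do no better than chance.
chance_loss = nce_loss(data, noise, log_q, log_q)
print(chance_loss, 2 * math.log(2))
```

No partition function is computed anywhere: the loss only evaluates log-densities at sample points.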

Optimizer · functional · Performer · state-of-the-art · episode

March 2, 2023

Comparison of High-Dimensional Bayesian Optimization Algorithms on BBOB

Maria Laura Santoni, Elena Raponi, Renato De Leone, Carola Doerr

Bayesian Optimization (BO) is a class of black-box, surrogate-based heuristics that can efficiently optimize problems that are expensive to evaluate, and hence admit only small evaluation budgets. BO is particularly popular for solving numerical optimization problems in industry, where the evaluation of objective functions often relies on time-consuming simulations or physical experiments. However, many industrial problems depend on a large number of parameters. This poses a challenge for BO algorithms, whose performance is often reported to suffer when the dimension grows beyond 15 variables. Although many new algorithms have been proposed to address this problem, it is not well understood which one is the best for which optimization scenario. In this work, we compare five state-of-the-art high-dimensional BO algorithms, with vanilla BO and CMA-ES on the 24 BBOB functions of the COCO environment at increasing dimensionality, ranging from 10 to 60 variables. Our results confirm the superiority of BO over CMA-ES for limited evaluation budgets and suggest that the most promising approach to improve BO is the use of trust regions. However, we also observe significant performance differences for different function landscapes and budget exploitation phases, indicating improvement potential, e.g., through hybridization of algorithmic components.

Latent variable/hidden variable · variational autoencoder · latent · approximation · forward

March 1, 2023

Dimension-reduced KRnet maps for high-dimensional inverse problems

Yani Feng, Kejun Tang, Xiaoliang Wan, Qifeng Liao

We present a dimension-reduced KRnet map approach (DR-KRnet) for high-dimensional inverse problems, which is based on an explicit construction of a map that pushes forward the prior measure to the posterior measure in the latent space. Our approach consists of two main components: data-driven VAE prior and density approximation of the posterior of the latent variable. In reality, it may not be trivial to initialize a prior distribution that is consistent with available prior data; in other words, the complex prior information is often beyond simple hand-crafted priors. We employ variational autoencoder (VAE) to approximate the underlying distribution of the prior dataset, which is achieved through a latent variable and a decoder. Using the decoder provided by the VAE prior, we reformulate the problem in a low-dimensional latent space. In particular, we seek an invertible transport map given by KRnet to approximate the posterior distribution of the latent variable. Moreover, an efficient physics-constrained surrogate model without any labeled data is constructed to reduce the computational cost of solving both forward and adjoint problems involved in likelihood computation. Numerical experiments are implemented to demonstrate the validity, accuracy, and efficiency of DR-KRnet.

Projection · MoDELS · estimation/estimator · cluster · vectorization

February 28, 2023

A Projection Approach to Local Regression with Variable-Dimension Covariates

Matthew J. Heiner, Garritt L. Page, Fernando Andrés Quintana

Incomplete covariate vectors are known to be problematic for estimation and inferences on model parameters, but their impact on prediction performance is less understood. We develop an imputation-free method that builds on a random partition model admitting variable-dimension covariates. Cluster-specific response models further incorporate covariates via linear predictors, facilitating estimation of smooth prediction surfaces with relatively few clusters. We exploit marginalization techniques of Gaussian kernels to analytically project response distributions according to any pattern of missing covariates, yielding a local regression with internally consistent uncertainty propagation that utilizes only one set of coefficients per cluster. Aggressive shrinkage of these coefficients regulates uncertainty due to missing covariates. The method allows in- and out-of-sample prediction for any missingness pattern, even if the pattern in a new subject's incomplete covariate vector was not seen in the training data. We develop an MCMC algorithm for posterior sampling that improves a computationally expensive update for latent cluster allocation. Finally, we demonstrate the model's effectiveness for nonlinear point and density prediction under various circumstances by comparing with other recent methods for regression of variable dimensions on synthetic and real data.

Estimation/estimator · noise · normalized · Integration · approximation

February 28, 2023

Parameter estimation for the stochastic heat equation with multiplicative noise from local measurements

Josef Janák, Markus Reiß

from arxiv, 32 pages, 3 figures

For the stochastic heat equation with multiplicative noise we consider the problem of estimating the diffusivity parameter in front of the Laplace operator. Based on local observations in space, we first study an estimator that was derived for additive noise. A stable central limit theorem shows that this estimator is consistent and asymptotically mixed normal. By taking into account the quadratic variation, we propose two new estimators. Their limiting distributions exhibit a smaller (conditional) variance and the last estimator also works for vanishing noise levels. The proofs are based on local approximation results to overcome the intricate nonlinearities and on a stable central limit theorem for stochastic integrals with respect to cylindrical Brownian motion. Simulation results illustrate the theoretical findings.

Estimation/estimator · MoDELS · Performer · square matrix · directed

February 28, 2023

Direct Estimation of Parameters in ODE Models Using WENDy: Weak-form Estimation of Nonlinear Dynamics

David M. Bortz, Daniel A. Messenger, Vanja Dukic

from arxiv, 25 pages, 13 figures

We introduce the Weak-form Estimation of Nonlinear Dynamics (WENDy) method for estimating model parameters for non-linear systems of ODEs. The core mathematical idea involves an efficient conversion of the strong form representation of a model to its weak form, and then solving a regression problem to perform parameter inference. The core statistical idea rests on the Errors-In-Variables framework, which necessitates the use of the iteratively reweighted least squares algorithm. Further improvements are obtained by using orthonormal test functions, created from a set of $C^{\infty}$ bump functions of varying support sizes. We demonstrate that WENDy is a highly robust and efficient method for parameter inference in differential equations. Without relying on any numerical differential equation solvers, WENDy computes accurate estimates and is robust to large (biologically relevant) levels of measurement noise. For low dimensional systems with modest amounts of data, WENDy is competitive with conventional forward solver-based nonlinear least squares methods in terms of speed and accuracy. For both higher dimensional systems and stiff systems, WENDy is typically both faster (often by orders of magnitude) and more accurate than forward solver-based approaches. We illustrate the method and its performance in some common population and neuroscience models, including logistic growth, Lotka-Volterra, FitzHugh-Nagumo, Hindmarsh-Rose, and a Protein Transduction Benchmark model. Software and code for reproducing the examples is available at (//github.com/MathBioCU/WENDy).