
We study a class of weakly identifiable location-scale mixture models for which the maximum likelihood estimates based on $n$ i.i.d. samples are known to have lower accuracy than the classical $n^{-\frac{1}{2}}$ error. We investigate whether the Expectation-Maximization (EM) algorithm also converges slowly for these models. We provide a rigorous characterization of EM for fitting a weakly identifiable Gaussian mixture in a univariate setting, where we prove that the EM algorithm converges in order $n^{\frac{3}{4}}$ steps and returns estimates that are at a Euclidean distance of order $n^{-\frac{1}{8}}$ and $n^{-\frac{1}{4}}$ from the true location and scale parameters, respectively. Establishing the slow rates in the univariate setting requires a novel two-stage localization argument, with each stage involving an epoch-based argument applied to a different population-level surrogate EM operator. We demonstrate several multivariate ($d \geq 2$) examples that exhibit the same slow rates as the univariate case. We also prove slow statistical rates in higher dimensions in a special case, when the fitted covariance is constrained to be a multiple of the identity.
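As a concrete illustration of the setting, the sketch below runs vanilla EM on a symmetric two-component location-scale fit $\frac{1}{2}N(\theta,\sigma^2)+\frac{1}{2}N(-\theta,\sigma^2)$ applied to data from the degenerate model $N(0,1)$; the specific fitted form, the initialization, and the sample size are illustrative assumptions rather than the exact construction analyzed in the paper.

```python
import numpy as np

def em_symmetric_fit(x, theta0=1.0, sigma0=1.5, max_iter=5000, tol=1e-10):
    """EM for fitting 0.5*N(theta, sigma^2) + 0.5*N(-theta, sigma^2) to data x.

    A minimal univariate sketch: in the weakly identifiable regime the data come
    from the degenerate model (e.g. N(0, 1)), and the iterates drift toward
    theta = 0 much more slowly than in well-separated mixtures."""
    theta, sig2 = float(theta0), float(sigma0) ** 2
    for _ in range(max_iter):
        # E-step: responsibility of the +theta component.
        a = np.exp(-(x - theta) ** 2 / (2.0 * sig2))
        b = np.exp(-(x + theta) ** 2 / (2.0 * sig2))
        w = a / (a + b)
        # M-step: closed-form updates for the symmetric location-scale fit.
        m1 = np.mean((2.0 * w - 1.0) * x)
        theta_new = m1
        sig2_new = np.mean(x ** 2) + theta_new ** 2 - 2.0 * theta_new * m1
        if abs(theta_new - theta) + abs(sig2_new - sig2) < tol:
            theta, sig2 = theta_new, sig2_new
            break
        theta, sig2 = theta_new, sig2_new
    return theta, np.sqrt(sig2)

rng = np.random.default_rng(0)
x = rng.normal(size=50_000)        # samples from the true model N(0, 1)
print(em_symmetric_fit(x))         # theta drifts slowly toward 0
```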

Related content

The recently proposed statistical finite element (statFEM) approach synthesises measurement data with finite element models and allows for making predictions about the true system response. We provide a probabilistic error analysis for a prototypical statFEM setup based on a Gaussian process prior under the assumption that the noisy measurement data are generated by a deterministic true system response function that satisfies a second-order elliptic partial differential equation for an unknown true source term. In certain cases, properties such as the smoothness of the source term may be misspecified by the Gaussian process model. The error estimates we derive are for the expectation with respect to the measurement noise of the $L^2$-norm of the difference between the true system response and the mean of the statFEM posterior. The estimates imply polynomial rates of convergence in the numbers of measurement points and finite element basis functions and depend on the Sobolev smoothness of the true source term and the Gaussian process model. A numerical example for Poisson's equation is used to illustrate these theoretical results.
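For intuition, the following is a minimal sketch of the Gaussian conditioning step that yields a statFEM-style posterior mean from a Gaussian prior and noisy point observations; the prior mean and covariance, the observation operator `P`, and the noise variance are placeholders rather than an actual finite element discretization.

```python
import numpy as np

def statfem_posterior_mean(m_prior, K_prior, P, y, noise_var):
    """Gaussian conditioning step behind a statFEM-style posterior mean (a sketch).

    Prior: u ~ N(m_prior, K_prior) on the discretized field (both placeholders,
    not an actual FEM discretization).  Data: y = P u + e with e ~ N(0, noise_var*I),
    where P maps the discretized field to the measurement locations."""
    S = P @ K_prior @ P.T + noise_var * np.eye(len(y))   # marginal covariance of y
    resid = y - P @ m_prior
    return m_prior + K_prior @ P.T @ np.linalg.solve(S, resid)
```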

We consider performance enhancement of asymmetrically-clipped optical orthogonal frequency division multiplexing (ACO-OFDM) and related optical OFDM schemes, which are variations of OFDM used in intensity-modulated optical wireless communications. Unlike most existing studies, which focus on specific designs of improved receivers, this paper investigates information theoretic limits over all possible receivers. For independent and identically distributed complex Gaussian inputs, we obtain an exact characterization of the information rate of ACO-OFDM with improved receivers at all SNRs. It is proved that the high-SNR gain of improved receivers asymptotically achieves 1/4 bit per channel use, which is equivalent to 3 dB in electrical SNR or 1.5 dB in optical SNR; as the SNR decreases, the maximum achievable SNR gain of improved receivers decreases monotonically to a non-zero low-SNR limit, corresponding to an information rate gain of 36.3%. For practically used constellations, we derive an upper bound on the gain of improved receivers. Numerical results demonstrate that the upper bound can be approached to within 1 dB in optical SNR by combining existing improved receivers and coded modulation. We also show that our information theoretic analyses extend to Flip-OFDM and PAM-DMT. Our results imply that, for the considered schemes, improved receivers may significantly reduce the gap to channel capacity at low-to-moderate SNR.
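For reference, a minimal sketch of the ACO-OFDM transmit processing that this analysis concerns: data are placed on the odd subcarriers with Hermitian symmetry, and the real IFFT output is clipped at zero, which confines the clipping distortion to the even subcarriers that improved receivers can exploit. The FFT size and QPSK symbols below are arbitrary illustrative choices.

```python
import numpy as np

def aco_ofdm_frame(sym, n_fft=64):
    """Build one ACO-OFDM time-domain frame (a minimal sketch).

    `sym` holds complex data symbols for the odd subcarriers 1, 3, ..., n_fft/2 - 1.
    Hermitian symmetry makes the IFFT output real; clipping negatives at zero
    yields a nonnegative intensity signal, and the clipping distortion falls
    only on the even subcarriers."""
    X = np.zeros(n_fft, dtype=complex)
    odd = np.arange(1, n_fft // 2, 2)          # odd subcarrier indices
    X[odd] = sym
    X[n_fft - odd] = np.conj(sym)              # Hermitian symmetry -> real IFFT
    x = np.fft.ifft(X).real
    return np.maximum(x, 0.0)                  # asymmetric clipping at zero

rng = np.random.default_rng(0)
qpsk = (rng.choice([-1, 1], 16) + 1j * rng.choice([-1, 1], 16)) / np.sqrt(2)
frame = aco_ofdm_frame(qpsk)                   # one nonnegative intensity frame
```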

Linear minimum mean square error (LMMSE) estimation is often ill-conditioned, suggesting that unconstrained minimization of the mean square error is an inadequate principle for filter design. To address this, we first develop a unifying framework for studying constrained LMMSE estimation problems. Using this framework, we expose an important structural property of constrained LMMSE filters: they generally involve an inherent preconditioning step, so that all such filters can be parameterized solely by their preconditioners. Moreover, each filter is invariant to invertible linear transformations of its preconditioner. We then clarify that merely constraining the rank of the filter does not suitably address the problem of ill-conditioning. Instead, we adopt a constraint that explicitly requires solutions to be well-conditioned in a certain specific sense. We introduce two well-conditioned filters and show that they converge to the unconstrained LMMSE filter as their truncated-power loss goes to zero, at the same rate as the low-rank Wiener filter. We also extend the framework to the weighted trace and the determinant of the error covariance as objective functions. Finally, we present quantitative results with historical VIX data to demonstrate that our two well-conditioned filters have stable performance while the standard LMMSE filter deteriorates with increasing condition number.
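The sketch below shows the unconstrained LMMSE filter $W = R_{xy} R_{yy}^{-1}$, whose computation becomes unreliable when $R_{yy}$ is ill-conditioned, together with the classical low-rank (truncated-eigendecomposition) Wiener filter used as a baseline; the paper's two well-conditioned filters are not reproduced here.

```python
import numpy as np

def lmmse_filter(R_xy, R_yy):
    """Unconstrained LMMSE filter W = R_xy R_yy^{-1}.  When np.linalg.cond(R_yy)
    is large, this solve is the ill-conditioned step the paper is concerned with."""
    return np.linalg.solve(R_yy, R_xy.T).T

def low_rank_wiener(R_xy, R_yy, r):
    """Rank-r Wiener filter from the truncated eigendecomposition of R_yy.
    This is the classical low-rank baseline, not the paper's well-conditioned
    filters."""
    vals, vecs = np.linalg.eigh(R_yy)          # ascending eigenvalues
    V, d = vecs[:, -r:], vals[-r:]             # keep the r largest eigenpairs
    return (R_xy @ V) @ (V / d).T              # R_xy V diag(1/d) V^T
```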

This paper introduces a new Bayesian changepoint approach, called the decoupled approach, that separates the modeling step from the changepoint analysis. The approach uses a Bayesian dynamic linear model (DLM) for the modeling step and a weighted penalized likelihood estimator applied to the posterior of the Bayesian DLM to identify changepoints. A Bayesian DLM with shrinkage priors can provide smooth estimates of the underlying trend in the presence of complex noise components; however, its inability to shrink exactly to zero makes changepoint analysis difficult. Penalized likelihood estimators can be effective in estimating the locations of changepoints; however, they require a relatively smooth estimate of the data. The decoupled approach combines the flexibility of the Bayesian DLM with the hard-thresholding property of the penalized likelihood estimator to extend the applicability of changepoint analysis. It provides a robust framework that allows changepoints to be identified in highly complex Bayesian models and can detect changes in the mean, in higher-order trends, and in regression coefficients. We illustrate the approach's flexibility and robustness by comparing it against several alternative methods in a wide range of simulations and two real-world examples.
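As a rough sketch of the second stage, the dynamic program below computes an $\ell_0$-penalized, weighted least squares, piecewise-constant segmentation of a smoothed series such as a DLM posterior mean; the exact weighting and penalized likelihood used in the decoupled approach are not reproduced here, and the penalty value is an assumption.

```python
import numpy as np

def l0_segmentation(z, penalty, weights=None):
    """Optimal partitioning (an O(n^2) dynamic program) minimizing the weighted
    squared error of a piecewise-constant fit plus `penalty` per changepoint.
    A minimal stand-in for the penalized likelihood step applied to the smoothed
    posterior mean `z` from the Bayesian DLM."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    # Prefix sums for O(1) weighted segment costs.
    cw, cwz, cwz2 = np.cumsum(w), np.cumsum(w * z), np.cumsum(w * z ** 2)

    def seg_cost(i, j):                       # cost of one constant fit to z[i:j]
        W = cw[j - 1] - (cw[i - 1] if i else 0.0)
        S = cwz[j - 1] - (cwz[i - 1] if i else 0.0)
        Q = cwz2[j - 1] - (cwz2[i - 1] if i else 0.0)
        return Q - S * S / W

    F = np.full(n + 1, np.inf)
    F[0] = -penalty                           # so the first segment is not penalized
    last = np.zeros(n + 1, dtype=int)
    for t in range(1, n + 1):
        cands = [F[s] + seg_cost(s, t) + penalty for s in range(t)]
        s_best = int(np.argmin(cands))
        F[t], last[t] = cands[s_best], s_best
    # Backtrack the changepoint locations (segment boundaries).
    cps, t = [], n
    while t > 0:
        if last[t] > 0:
            cps.append(last[t])
        t = last[t]
    return sorted(cps)
```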

In this paper, we study a non-local approximation of the time-dependent (local) Eikonal equation with Dirichlet-type boundary conditions, where the kernel in the non-local problem is properly scaled. Based on the theory of viscosity solutions, we prove existence and uniqueness of the viscosity solutions of both the local and non-local problems, as well as regularity properties of these solutions in time and space. We then derive error bounds between the solution of the non-local problem and that of the local one, both in continuous time and under a backward Euler time discretization. We then turn to studying continuum limits of non-local problems defined on random weighted graphs with $n$ vertices. In particular, we establish that if the kernel scale parameter decreases at an appropriate rate as $n$ grows, then almost surely the solution of the problem on graphs converges uniformly to the viscosity solution of the local problem as the time step vanishes and the number of vertices $n$ grows large.

In this paper, we revisit the problem of Differentially Private Stochastic Convex Optimization (DP-SCO) and provide excess population risk bounds for some special classes of functions that are faster than the previous results for general convex and strongly convex functions. In the first part of the paper, we study the case where the population risk function satisfies the Tsybakov Noise Condition (TNC) with some parameter $\theta>1$. Specifically, we first show that under some mild assumptions on the loss functions, there is an algorithm whose output achieves an upper bound of $\tilde{O}((\frac{1}{\sqrt{n}}+\frac{\sqrt{d\log \frac{1}{\delta}}}{n\epsilon})^\frac{\theta}{\theta-1})$ for $(\epsilon, \delta)$-DP when $\theta\geq 2$, where $n$ is the sample size and $d$ is the dimension of the space. We then address the inefficiency issue, improve the upper bounds by $\text{Poly}(\log n)$ factors, and extend the result to the case where $\theta\geq \bar{\theta}>1$ for some known $\bar{\theta}$. Next we show that the excess population risk of population functions satisfying TNC with parameter $\theta\geq 2$ is always lower bounded by $\Omega((\frac{d}{n\epsilon})^\frac{\theta}{\theta-1})$ and $\Omega((\frac{\sqrt{d\log \frac{1}{\delta}}}{n\epsilon})^\frac{\theta}{\theta-1})$ for $\epsilon$-DP and $(\epsilon, \delta)$-DP, respectively. In the second part, we focus on a special case where the population risk function is strongly convex. Unlike previous studies, here we assume the loss function is {\em non-negative} and {\em the optimal value of the population risk is sufficiently small}. With these additional assumptions, we propose a new method whose output achieves an upper bound of $O(\frac{d\log\frac{1}{\delta}}{n^2\epsilon^2}+\frac{1}{n^{\tau}})$ for any $\tau\geq 1$ in the $(\epsilon,\delta)$-DP model, provided the sample size $n$ is sufficiently large.
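For context, here is a generic sketch of differentially private (full-batch) gradient descent, the kind of noisy first-order method typically used for DP-SCO. It is not one of the specific algorithms of this paper, and the noise calibration below is a standard advanced-composition heuristic whose constants vary across analyses; in practice a privacy accountant would be used.

```python
import numpy as np

def dp_gradient_descent(grad_fn, w0, n, steps, lr, clip, eps, delta, rng):
    """Noisy full-batch gradient descent: a generic DP-SCO baseline (a sketch).

    `grad_fn(w)` returns per-sample gradients of shape (n, d).  Each per-sample
    gradient is clipped to norm `clip`, averaged, and perturbed with Gaussian
    noise; the scale `sigma` follows an advanced-composition style calibration."""
    w = np.array(w0, dtype=float)
    sens = 2.0 * clip / n                        # L2 sensitivity of the averaged gradient
    sigma = sens * np.sqrt(2.0 * steps * np.log(1.25 / delta)) / eps
    for _ in range(steps):
        g = grad_fn(w)                           # per-sample gradients, shape (n, d)
        norms = np.linalg.norm(g, axis=1, keepdims=True)
        g = g * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
        noisy = g.mean(axis=0) + sigma * rng.normal(size=w.shape)
        w = w - lr * noisy
    return w
```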

(Gradient) Expectation Maximization (EM) is a widely used algorithm for estimating the maximum likelihood of mixture models or incomplete data problems. A major challenge facing this popular technique is how to effectively preserve the privacy of sensitive data. Previous research on this problem has already led to the discovery of some Differentially Private (DP) algorithms for (Gradient) EM. However, unlike in the non-private case, existing techniques are not yet able to provide finite sample statistical guarantees. To address this issue, we propose in this paper the first DP version of the (Gradient) EM algorithm with statistical guarantees. Moreover, we apply our general framework to three canonical models: the Gaussian Mixture Model (GMM), the Mixture of Regressions Model (MRM), and Linear Regression with Missing Covariates (RMC). Specifically, for GMM in the DP model, our estimation error is near optimal in some cases. For the other two models, we provide the first finite sample statistical guarantees. Our theory is supported by thorough numerical experiments.
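To make the idea concrete, here is one noisy gradient-EM update for a symmetric two-component GMM with unit variance, a hedged sketch of the clip-and-perturb mechanism rather than the paper's exact algorithm; the clipping level, step size, and noise scale `sigma` are assumptions and must be calibrated to the desired $(\epsilon,\delta)$ budget across iterations.

```python
import numpy as np

def dp_gradient_em_step(x, theta, lr, clip, sigma, rng):
    """One noisy gradient-EM update for the fit 0.5*N(theta, 1) + 0.5*N(-theta, 1).

    E-step responsibilities are computed exactly; each sample's contribution to
    the Q-function gradient is clipped to bound sensitivity, and Gaussian noise
    (std `sigma` on the sum, i.e. sigma/n on the mean) is added before the step."""
    x = np.asarray(x, dtype=float)
    w = 1.0 / (1.0 + np.exp(-2.0 * theta * x))   # responsibility of the +theta component
    g = (2.0 * w - 1.0) * x - theta              # per-sample gradient of Q at theta
    g = np.clip(g, -clip, clip)                  # bound per-sample sensitivity
    noisy_grad = g.mean() + sigma * rng.normal() / len(x)
    return theta + lr * noisy_grad
```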

To characterize the location (mean, median) of a set of graphs, one needs a notion of centrality that is adapted to metric spaces, since graph sets are not Euclidean spaces. A standard approach is to consider the Frechet mean. In this work, we equip a set of graphs with the pseudometric defined by the norm of the difference between the eigenvalues of their respective adjacency matrices. Unlike the edit distance, this pseudometric reveals structural changes at multiple scales, and it is well adapted to studying various statistical problems for graph-valued data. We describe an algorithm to compute an approximation to the sample Frechet mean of a set of undirected unweighted graphs of a fixed size under this pseudometric.
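A minimal sketch of the ingredients: the spectral pseudometric between two graphs of the same size, and the entrywise mean of the sorted spectra, which is the unconstrained minimizer of the Frechet objective under this pseudometric. Recovering an actual unweighted graph whose spectrum approximates this mean is the step the paper's algorithm addresses and is not sketched here.

```python
import numpy as np

def spectral_distance(A1, A2):
    """Pseudometric between two graphs of the same size: the Euclidean norm of
    the difference between their sorted adjacency eigenvalues."""
    ev1 = np.sort(np.linalg.eigvalsh(A1))
    ev2 = np.sort(np.linalg.eigvalsh(A2))
    return np.linalg.norm(ev1 - ev2)

def mean_spectrum(adjacency_list):
    """Entrywise mean of the sorted adjacency spectra: the unconstrained
    minimizer of sum_k d(G, G_k)^2 in this pseudometric.  The constraint that
    the spectrum be realizable by an unweighted graph is ignored here."""
    spectra = np.stack([np.sort(np.linalg.eigvalsh(A)) for A in adjacency_list])
    return spectra.mean(axis=0)
```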

A core capability of intelligent systems is the ability to quickly learn new tasks by drawing on prior experience. Gradient (or optimization) based meta-learning has recently emerged as an effective approach for few-shot learning. In this formulation, meta-parameters are learned in the outer loop, while task-specific models are learned in the inner loop using only a small amount of data from the current task. A key challenge in scaling these approaches is the need to differentiate through the inner loop learning process, which can impose considerable computational and memory burdens. By drawing upon implicit differentiation, we develop the implicit MAML algorithm, which depends only on the solution to the inner-level optimization and not the path taken by the inner loop optimizer. This effectively decouples the meta-gradient computation from the choice of inner loop optimizer. As a result, our approach is agnostic to the choice of inner loop optimizer and can gracefully handle many gradient steps without vanishing gradients or memory constraints. Theoretically, we prove that implicit MAML can compute accurate meta-gradients with a memory footprint that is, up to small constant factors, no more than that required to compute a single inner loop gradient, and with no overall increase in the total computational cost. Experimentally, we show that these benefits of implicit MAML translate into empirical gains on few-shot image recognition benchmarks.
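A small dense-matrix sketch of the implicit meta-gradient: with the inner problem regularized by $\frac{\lambda}{2}\|\phi-\theta\|^2$, the implicit function theorem gives the meta-gradient as $(I + \frac{1}{\lambda}\nabla^2 \hat{L}(\phi^*))^{-1}\nabla_\phi L_{\text{test}}(\phi^*)$. In practice this solve is done matrix-free with conjugate gradients and Hessian-vector products rather than by forming the Hessian, as the explicit matrix below does only for illustration.

```python
import numpy as np

def imaml_meta_gradient(hess_inner, grad_outer, lam):
    """Implicit MAML meta-gradient (a minimal dense-matrix sketch).

    With phi* = argmin_phi L_in(phi) + (lam/2)||phi - theta||^2, the implicit
    function theorem gives d(phi*)/d(theta) = (I + H/lam)^{-1}, where H is the
    Hessian of L_in at phi*, so the meta-gradient of the outer (test) loss is
    (I + H/lam)^{-1} grad_outer."""
    d = grad_outer.shape[0]
    return np.linalg.solve(np.eye(d) + hess_inner / lam, grad_outer)
```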

Implicit probabilistic models are models defined naturally in terms of a sampling procedure, and they often induce likelihood functions that cannot be expressed explicitly. We develop a simple method for estimating parameters in implicit models that does not require knowledge of the form of the likelihood function or any derived quantities, but that can be shown to be equivalent to maximizing the likelihood under some conditions. Our result holds in the non-asymptotic parametric setting, where both the capacity of the model and the number of data examples are finite. We also demonstrate encouraging experimental results.
