
The prior distribution on parameters of a sampling distribution is the usual starting point for Bayesian uncertainty quantification. In this paper, we present a different perspective that focuses on missing observations as the source of statistical uncertainty, with the parameter of interest being known precisely given the entire population. We argue that the foundation of Bayesian inference is to assign a distribution on missing observations conditional on what has been observed. In the conditionally i.i.d. setting with an observed sample of size $n$, the Bayesian would thus assign a predictive distribution on the missing $Y_{n+1:\infty}$ conditional on $Y_{1:n}$, which then induces a distribution on the parameter. In a classical application of martingales, Doob showed that choosing the Bayesian predictive distribution returns the conventional posterior as the distribution of the parameter. Taking this as our cue, we relax the predictive machine, avoiding the need for the predictive to be derived solely from the usual prior-to-posterior-to-predictive density formula. We introduce the \textit{martingale posterior distribution}, which returns Bayesian uncertainty directly on any statistic of interest without the need for a likelihood and prior, and this distribution can be sampled through a computational scheme we name \textit{predictive resampling}. To that end, we introduce new predictive methodologies for multivariate density estimation, regression and classification that build upon recent work on bivariate copulas.
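As a concrete illustration of predictive resampling, here is a minimal sketch using a simple Pólya-urn / Bayesian-bootstrap predictive (each missing observation is drawn uniformly from all values seen so far) rather than the copula-based predictives developed in the paper; function and parameter names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def predictive_resample(y_obs, n_forward=2000, n_draws=200, stat=np.mean):
    """Draws from the martingale posterior of `stat` under a Polya-urn-type
    predictive: each missing observation is sampled uniformly from all
    values seen so far.  Illustrative sketch only."""
    draws = []
    for _ in range(n_draws):
        y = list(y_obs)
        for _ in range(n_forward):                # impute the missing Y_{n+1:N}
            y.append(y[rng.integers(len(y))])     # sample from the current predictive
        draws.append(stat(np.asarray(y)))         # statistic of the completed population
    return np.asarray(draws)

y_obs = rng.normal(size=50)
post = predictive_resample(y_obs)
print(post.mean(), post.std())                    # uncertainty about the population mean
```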

Related content

Classification approaches based on the direct estimation and analysis of posterior probabilities will degrade if the original class priors change. We prove that the data likelihoods for a test example can be recovered, uniquely up to scale, from its original class posteriors and the dataset priors. Given the recovered likelihoods and a set of new priors, the posteriors can be re-computed using Bayes' rule to reflect the influence of the new priors. The method is simple to compute and allows a dynamic update of the original posteriors.
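A minimal sketch of this update (hypothetical names; the per-example scale in the recovered likelihoods cancels after renormalisation, which is why the recovery only needs to hold up to scale):

```python
import numpy as np

def adjust_posteriors(posteriors, old_priors, new_priors):
    """Re-weight class posteriors for new class priors via Bayes' rule.
    posteriors: (n, K) original p(class | x); old_priors, new_priors: (K,)."""
    likelihoods = posteriors / old_priors          # proportional to p(x | class)
    unnorm = likelihoods * new_priors              # Bayes' rule with the new priors
    return unnorm / unnorm.sum(axis=1, keepdims=True)

# toy example: two classes, priors change from (0.5, 0.5) to (0.9, 0.1)
post = np.array([[0.7, 0.3], [0.2, 0.8]])
print(adjust_posteriors(post, np.array([0.5, 0.5]), np.array([0.9, 0.1])))
```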

This work studies anomaly detection under differential privacy with Gaussian perturbation, using both statistical and information-theoretic tools. In our setting, the adversary aims to modify the differentially private information of a statistical dataset by inserting additional data without being detected, exploiting differential privacy to their own benefit. To this end, first, via hypothesis testing, we characterize a statistical threshold for the adversary that balances the privacy budget and the induced bias (the impact of the attack) in order to remain undetected. In addition, we establish the privacy-distortion tradeoff, in the sense of the well-known rate-distortion function, for the Gaussian mechanism using an information-theoretic approach, and present an upper bound on the variance of the attacker's additional data as a function of the sensitivity and the original data's second-order statistics. Lastly, we introduce a new privacy metric based on Chernoff information for classifying adversaries under differential privacy, as a stronger alternative for the Gaussian mechanism. Analytical results are supported by numerical evaluations.
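For orientation, the Gaussian mechanism referred to above perturbs a released statistic with noise calibrated to its L2 sensitivity; the sketch below uses the classical analytic calibration and is not the paper's attack or detection procedure:

```python
import numpy as np

def gaussian_mechanism(value, sensitivity, epsilon, delta, rng):
    """Release a statistic under (epsilon, delta)-DP with the standard
    Gaussian mechanism; classical calibration, valid for epsilon < 1."""
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return value + rng.normal(scale=sigma)

rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=1.0, size=1000)
# mean query on data clipped to [-5, 5] has L2 sensitivity 10 / n
noisy_mean = gaussian_mechanism(np.clip(data, -5, 5).mean(), 10 / len(data), 0.5, 1e-5, rng)
print(noisy_mean)
```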

We seek an entropy estimator for discrete distributions with fully empirical accuracy bounds. As stated, this goal is infeasible without some prior assumptions on the distribution. We discover that a certain information moment assumption renders the problem feasible. We argue that the moment assumption is natural and, in some sense, {\em minimalistic} -- weaker than finite support or tail decay conditions. Under the moment assumption, we provide the first finite-sample entropy estimates for infinite alphabets, nearly recovering the known minimax rates. Moreover, we demonstrate that our empirical bounds are significantly sharper than the state-of-the-art bounds, for various natural distributions and non-trivial sample regimes. Along the way, we give a dimension-free analogue of the Cover-Thomas result on entropy continuity (with respect to total variation distance) for finite alphabets, which may be of independent interest. Additionally, we resolve all of the open problems posed by J\"urgensen and Matthews, 2010.
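For orientation, the naive plug-in (maximum-likelihood) entropy estimate, the usual baseline in this literature, can be computed as follows; the paper's estimator and its empirical accuracy bounds are more refined than this sketch:

```python
import numpy as np
from collections import Counter

def plugin_entropy(samples):
    """Naive plug-in (maximum-likelihood) entropy estimate in nats."""
    counts = np.array(list(Counter(samples).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

rng = np.random.default_rng(0)
# geometric-type distribution on an (effectively) infinite alphabet
samples = rng.geometric(p=0.3, size=10_000)
print(plugin_entropy(samples))
```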

A well-known identifiability issue in factor analytic models is the invariance with respect to orthogonal transformations. This problem burdens inference under a Bayesian setup, where Markov chain Monte Carlo (MCMC) methods are used to generate samples from the posterior distribution. We introduce a post-processing scheme to deal with rotation, sign and permutation invariance of the MCMC sample. The exact version of the contributed algorithm requires solving $2^q$ assignment problems per (retained) MCMC iteration, where $q$ denotes the number of factors of the fitted model. For large numbers of factors, two approximate schemes based on simulated annealing are also discussed. We demonstrate that the proposed method leads to interpretable posterior distributions using synthetic and publicly available data from typical factor analytic models as well as mixtures of factor analyzers. An R package is available online from CRAN.
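A rough sketch of the sign-and-permutation part of such a post-processing step (hypothetical names; the paper's exact algorithm additionally handles orthogonal rotations and is applied across all retained MCMC draws):

```python
import itertools
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_loadings(lam, ref):
    """Align one draw `lam` (p x q loading matrix) to a reference by
    searching all 2^q column sign flips and, for each, solving an
    assignment problem over column permutations (Hungarian algorithm)."""
    p, q = lam.shape
    best = None
    for signs in itertools.product([1.0, -1.0], repeat=q):
        flipped = lam * np.array(signs)
        # cost[i, j] = squared distance between flipped column i and ref column j
        cost = ((flipped[:, :, None] - ref[:, None, :]) ** 2).sum(axis=0)
        rows, cols = linear_sum_assignment(cost)
        aligned = flipped[:, np.argsort(cols)]    # place matched columns into ref order
        loss = cost[rows, cols].sum()
        if best is None or loss < best[0]:
            best = (loss, aligned)
    return best[1]

rng = np.random.default_rng(0)
ref = rng.normal(size=(10, 3))
draw = -ref[:, [2, 0, 1]]                         # sign-flipped, permuted copy of ref
print(np.allclose(align_loadings(draw, ref), ref))   # True
```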

The hierarchical and recursive expressive capability of rooted trees makes them suitable for representing statistical models in various areas, such as data compression, image processing, and machine learning. On the other hand, this expressive capability makes tree selection to avoid overfitting difficult. One unified way to solve this is a Bayesian approach, in which the rooted tree is regarded as a random variable and a loss function can be assumed directly on the selected model or on the predicted value for a new data point. However, to the best of our knowledge, all previous studies of this approach are based on probability distributions over full trees. In this paper, we propose a generalized probability distribution over arbitrary rooted trees in which only the maximum number of child nodes and the maximum depth are fixed. Furthermore, we derive recursive methods to evaluate the characteristics of the probability distribution without any approximations.

The recursive and hierarchical structure of full rooted trees is applicable to representing statistical models in various areas, such as data compression, image processing, and machine learning. In most of these cases, the full rooted tree is not a random variable; as such, model selection to avoid overfitting becomes problematic. A method to solve this problem is to assume a prior distribution on the full rooted trees. This enables optimal model selection based on Bayes decision theory. For example, by assigning a low prior probability to a complex model, the maximum a posteriori estimator prevents the selection of the complex one. Furthermore, we can average over all models, weighted by their posterior probabilities. In this paper, we propose a probability distribution on a set of full rooted trees. Its parametric representation is suitable for calculating properties of the distribution, such as the mode, expectation, and posterior distribution, using recursive functions. Although such distributions have been proposed in previous studies, they are applicable only to specific applications. Therefore, we extract their mathematically essential components and derive new generalized methods to calculate the expectation, posterior distribution, etc.
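A toy sketch of a prior of this kind, assuming for illustration a single depth-independent splitting probability g and a fixed number of children k_max; the distributions in these papers use node-wise parameters and also provide recursions for quantities such as the mode and the posterior:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_tree(depth=0, g=0.5, k_max=2, d_max=4):
    """Sample a full rooted tree: each node at depth < d_max is internal
    with probability g (spawning k_max children), otherwise it is a leaf.
    Trees are returned as nested lists, with None marking a leaf."""
    if depth < d_max and rng.random() < g:
        return [sample_tree(depth + 1, g, k_max, d_max) for _ in range(k_max)]
    return None

def tree_prob(tree, depth=0, g=0.5, d_max=4):
    """Recursively evaluate the prior probability of a given tree."""
    if tree is None:                              # leaf (forced at d_max)
        return 1.0 if depth == d_max else 1.0 - g
    if depth >= d_max:                            # internal node beyond d_max is impossible
        return 0.0
    p = g
    for child in tree:
        p *= tree_prob(child, depth + 1, g, d_max)
    return p

t = sample_tree()
print(tree_prob(t))
```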

There is a need for new models for characterizing dependence in multivariate data. The multivariate Gaussian distribution is routinely used, but cannot characterize nonlinear relationships in the data. Most nonlinear extensions tend to be highly complex, for example involving the estimation of a nonlinear regression model in latent variables. In this article, we propose a relatively simple class of Ellipsoid-Gaussian multivariate distributions, which are derived from a Gaussian linear factor model whose latent variables have a von Mises-Fisher distribution on a unit hypersphere. We show that the Ellipsoid-Gaussian distribution can flexibly model curved relationships among variables with lower-dimensional structure. Taking a Bayesian approach, we propose a hybrid of gradient-based geodesic Monte Carlo and adaptive Metropolis for posterior sampling. We derive basic properties and illustrate the utility of the Ellipsoid-Gaussian distribution on a variety of simulated and real data applications.
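A small simulation sketch of that generative construction: latent factors drawn from a von Mises-Fisher distribution on the unit hypersphere, pushed through a linear map and perturbed by Gaussian noise. Names and parameter values are illustrative, and scipy.stats.vonmises_fisher requires SciPy >= 1.11:

```python
import numpy as np
from scipy.stats import vonmises_fisher   # SciPy >= 1.11

rng = np.random.default_rng(0)

def sample_ellipsoid_gaussian(n, centre, Lambda, mu, kappa, sigma):
    """x = centre + Lambda @ eta + eps, with eta ~ vMF(mu, kappa) on the
    unit hypersphere and eps ~ N(0, sigma^2 I).  Illustrative sketch of the
    construction described in the abstract."""
    eta = vonmises_fisher(mu, kappa).rvs(n, random_state=rng)   # (n, q) on the sphere
    eps = rng.normal(scale=sigma, size=(n, Lambda.shape[0]))
    return centre + eta @ Lambda.T + eps

q, p = 2, 3
Lambda = rng.normal(size=(p, q))
mu = np.array([1.0, 0.0])                 # vMF mean direction (unit vector)
x = sample_ellipsoid_gaussian(500, np.zeros(p), Lambda, mu, kappa=5.0, sigma=0.1)
print(x.shape)
```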

The Variational Auto-Encoder (VAE) is one of the most widely used unsupervised machine learning models. Although the default choice of a Gaussian distribution for both the prior and the posterior is mathematically convenient and often leads to competitive results, we show that this parameterization fails to model data with a latent hyperspherical structure. To address this issue, we propose using a von Mises-Fisher (vMF) distribution instead, leading to a hyperspherical latent space. Through a series of experiments, we show how such a hyperspherical VAE, or $\mathcal{S}$-VAE, is more suitable for capturing data with a hyperspherical latent structure, while outperforming a normal, $\mathcal{N}$-VAE, in low dimensions on other data types.

Methods that align distributions by minimizing an adversarial distance between them have recently achieved impressive results. However, these approaches are difficult to optimize with gradient descent and they often do not converge well without careful hyperparameter tuning and proper initialization. We investigate whether turning the adversarial min-max problem into an optimization problem by replacing the maximization part with its dual improves the quality of the resulting alignment and explore its connections to Maximum Mean Discrepancy. Our empirical results suggest that using the dual formulation for the restricted family of linear discriminators results in a more stable convergence to a desirable solution when compared with the performance of a primal min-max GAN-like objective and an MMD objective under the same restrictions. We test our hypothesis on the problem of aligning two synthetic point clouds on a plane and on a real-image domain adaptation problem on digits. In both cases, the dual formulation yields an iterative procedure that gives more stable and monotonic improvement over time.
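For reference, the (biased, V-statistic) estimate of the squared Maximum Mean Discrepancy that the dual formulation is connected to can be computed as follows; this is a generic RBF-kernel sketch, not the paper's implementation or its restricted linear-discriminator setting:

```python
import numpy as np

def mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples x and y
    under an RBF kernel."""
    def k(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * bandwidth ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
y = rng.normal(loc=0.5, size=(200, 2))
print(mmd2(x, y))
```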

Discrete random structures are important tools in Bayesian nonparametrics, and the resulting models have proven effective in density estimation, clustering, topic modeling and prediction, among others. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and then normalising to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes, and develop a Markov chain Monte Carlo sampler for Bayesian inference. A test for distributional homogeneity across groups is obtained as a by-product. The results and their inferential implications are showcased on synthetic and real data.
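A rough finite-dimensional sketch of the "add common and group-specific completely random measures, then normalise" construction, using a truncated gamma-CRM approximation with hypothetical settings; the actual latent nested processes embed this within a further level of nesting:

```python
import numpy as np

rng = np.random.default_rng(0)

def finite_crm(n_atoms, total_mass):
    """Finite-dimensional approximation of a gamma completely random
    measure: atoms with independent Gamma weights."""
    weights = rng.gamma(total_mass / n_atoms, 1.0, size=n_atoms)
    atoms = rng.normal(size=n_atoms)
    return atoms, weights

# common and group-specific CRMs, then normalisation (illustrative only)
atoms0, w0 = finite_crm(200, total_mass=1.0)
group_measures = []
for _ in range(2):                                # two groups
    atoms_l, w_l = finite_crm(200, total_mass=1.0)
    atoms = np.concatenate([atoms0, atoms_l])
    w = np.concatenate([w0, w_l])
    group_measures.append((atoms, w / w.sum()))   # dependent random probability measure

# draw observations for each group; shared atoms induce dependence across groups
for atoms, probs in group_measures:
    print(rng.choice(atoms, size=5, p=probs))
```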
