
Variational Bayes (VB) is a popular tool for Bayesian inference in statistical modeling. Recently, several VB algorithms have been proposed to handle intractable likelihoods, with applications such as approximate Bayesian computation. In this paper, we propose several unbiased estimators, based on multilevel Monte Carlo (MLMC), for the gradient of the Kullback-Leibler divergence between the posterior distribution and the variational distribution when the likelihood is intractable but can be estimated unbiasedly. The new VB algorithm differs from existing VB algorithms in the literature, which usually yield biased gradient estimators. Moreover, we incorporate randomized quasi-Monte Carlo (RQMC) sampling within the MLMC-based gradient estimators, which is known to provide a favorable rate of convergence in numerical integration. Theoretical guarantees for RQMC are provided in this new setting. Numerical experiments show that using RQMC in MLMC greatly speeds up the VB algorithm and finds better parameter values than some existing competitors.
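To make the construction concrete, here is a minimal sketch of a single-term (Rhee-Glynn style) MLMC gradient estimator with scrambled Sobol' points supplying the RQMC sampling inside each level. The integrand and all function names are toy stand-ins rather than the paper's estimator, and the level sum is truncated at `max_level` for simplicity.

```python
import numpy as np
from scipy.stats import qmc  # scrambled Sobol' points = randomized QMC

def grad_level(theta, level, points):
    """Hypothetical level-l gradient approximation: averages a toy
    integrand (gradient of E[(x - theta)^2], x ~ U(0,1)) over the first
    2**level RQMC points.  Its bias vanishes as the level grows, standing
    in for a KL-gradient built from increasingly accurate unbiased
    likelihood estimates."""
    x = points[: 2 ** level, 0]
    return np.mean(2.0 * (theta - x))

def single_term_mlmc_grad(theta, p=0.5, max_level=12, seed=0):
    """Rhee-Glynn single-term MLMC estimator: draw a random level L with
    geometric probabilities, return the level-0 term plus the level
    difference divided by P(L = l).  Unbiased for the limiting gradient
    up to the truncation at max_level (kept finite in this sketch)."""
    rng = np.random.default_rng(seed)
    probs = (1 - p) * p ** np.arange(max_level)
    probs /= probs.sum()
    L = rng.choice(max_level, p=probs)
    # One shared set of RQMC points couples the fine and coarse levels,
    # which is what keeps the level differences small.
    points = qmc.Sobol(d=1, scramble=True, seed=seed).random(2 ** max_level)
    delta = grad_level(theta, L + 1, points) - grad_level(theta, L, points)
    return grad_level(theta, 0, points) + delta / probs[L]
```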

Related content

In Bayesian accelerated life testing, the most commonly used tool for model comparison is the deviance information criterion. An alternative and more formal approach is to use Bayes factors to compare models. However, Bayesian accelerated life testing models with more than one stressor often have mathematically intractable posterior distributions, and Markov chain Monte Carlo methods are employed to obtain posterior samples on which to base inference. The computation of the marginal likelihood is challenging when working with such complex models. In this paper, methods for approximating the marginal likelihood, and their application in the accelerated life testing paradigm, are explored for dual-stress models.
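As one illustration of the kind of marginal-likelihood approximation explored in this setting, the following sketch implements a basic importance-sampling estimator with a Gaussian proposal fitted to posterior draws. The `log_lik` and `log_prior` callables are assumed interfaces, and this is not the specific method of the paper.

```python
import numpy as np
from scipy import stats

def log_marginal_likelihood_is(log_lik, log_prior, posterior_draws,
                               n_is=10_000, seed=0):
    """Importance-sampling approximation of the log marginal likelihood
    m = integral of likelihood * prior.  A Gaussian proposal is fitted
    to MCMC posterior draws; the log-sum-exp trick keeps the weighted
    average numerically stable."""
    rng = np.random.default_rng(seed)
    mu = posterior_draws.mean(axis=0)
    cov = np.cov(posterior_draws, rowvar=False)
    proposal = stats.multivariate_normal(mu, cov)
    z = proposal.rvs(size=n_is, random_state=rng)
    log_w = np.array([log_lik(t) + log_prior(t) for t in z]) - proposal.logpdf(z)
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))

# Bayes factor of model 1 over model 2: np.exp(logml_1 - logml_2)
```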

Key to effective generic, or "black-box", variational inference is the selection of an approximation to the target density that balances accuracy and calibration speed. Copula models are promising options, but calibration of the approximation can be slow for some choices. Smith et al. (2020) suggest using "implicit copula" models that are formed by element-wise transformation of the target parameters. We show here why these are a tractable and scalable choice, and propose adjustments to increase their accuracy. We also show how a sub-class of elliptical copulas has a generative representation that allows easy application of the re-parameterization trick and efficient first-order optimization methods. We demonstrate the estimation methodology using two statistical models as examples. The first is a mixed effects logistic regression, and the second is a regularized correlation matrix. For the latter, standard Markov chain Monte Carlo estimation methods can be slow or difficult to implement, yet our proposed variational approach provides an effective and scalable estimator. We illustrate by estimating a regularized Gaussian copula model for income inequality in U.S. states between 1917 and 2018.
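The generative representation behind the re-parameterization trick can be sketched as follows: a Gaussian draw followed by an element-wise transform, so that samples are differentiable functions of the variational parameters for a fixed noise draw. The shift/Cholesky parameterization and the tanh transform below are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def sample_implicit_copula(mu, L, eps, transform=np.tanh):
    """Generative form of an implicit-copula approximation: a Gaussian
    draw z = mu + L @ eps followed by an element-wise map of each
    coordinate.  Because the output is differentiable in (mu, L) for
    fixed eps, pathwise gradients of an ELBO estimate flow through,
    enabling first-order optimizers."""
    z = mu + L @ eps
    return transform(z)

rng = np.random.default_rng(1)
eps = rng.standard_normal(3)          # common random numbers across steps
theta = sample_implicit_copula(np.zeros(3), np.eye(3), eps)
```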

Shape constraints yield flexible middle grounds between fully nonparametric and fully parametric approaches to modeling distributions of data. The specific assumption of log-concavity is motivated by applications across economics, survival modeling, and reliability theory. However, no valid tests currently exist for whether the underlying density of given data is log-concave; the recent universal likelihood ratio test provides one. The universal test relies on maximum likelihood estimation (MLE), and efficient methods already exist for finding the log-concave MLE. This yields the first test of log-concavity that is provably valid in finite samples in any dimension, for which we also establish asymptotic consistency results. Empirically, we find that the highest power is obtained by using random projections to convert the d-dimensional testing problem into many one-dimensional problems, leading to a simple procedure that is statistically and computationally efficient.
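A minimal one-dimensional sketch of the universal (split) likelihood ratio test follows. It assumes a hypothetical `fit_logconcave_mle` routine that fits the null log-concave MLE and returns its log-density; the KDE used for the alternative is likewise only illustrative.

```python
import numpy as np
from scipy.stats import gaussian_kde

def universal_logconcavity_test(x, fit_logconcave_mle, alpha=0.05, seed=0):
    """Split (universal) likelihood-ratio test, sketched in 1-D.  The
    alternative density is fitted on one half of the data and both
    densities are evaluated on the other half, where the null
    log-concave MLE is fitted.  Rejecting when the likelihood ratio
    exceeds 1/alpha is finite-sample valid by Markov's inequality."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    d1, d0 = x[idx[: len(x) // 2]], x[idx[len(x) // 2:]]
    alt_logpdf = gaussian_kde(d1).logpdf       # fitted on the other half
    null_logpdf = fit_logconcave_mle(d0)       # fitted and evaluated on d0
    log_T = alt_logpdf(d0).sum() - null_logpdf(d0).sum()
    return log_T >= np.log(1.0 / alpha)        # True = reject log-concavity
```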

We propose a general optimization-based framework for computing differentially private M-estimators and a new method for constructing differentially private confidence regions. Firstly, we show that robust statistics can be used in conjunction with noisy gradient descent or noisy Newton methods in order to obtain optimal private estimators with global linear or quadratic convergence, respectively. We establish local and global convergence guarantees, under both local strong convexity and self-concordance, showing that our private estimators converge with high probability to a nearly optimal neighborhood of the non-private M-estimators. Secondly, we tackle the problem of parametric inference by constructing differentially private estimators of the asymptotic variance of our private M-estimators. This naturally leads to approximate pivotal statistics for constructing confidence regions and conducting hypothesis testing. We demonstrate the effectiveness of a bias correction that leads to enhanced small-sample empirical performance in simulations. We illustrate the benefits of our methods in several numerical examples.
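As a sketch of the noisy-gradient-descent ingredient, the following clips per-example gradients (the role robust statistics play in bounding sensitivity) and adds Gaussian noise at each step. The per-example gradient interface and constants are assumptions, and the privacy accounting is omitted.

```python
import numpy as np

def dp_noisy_gd(per_example_grads, theta0, steps=100, lr=0.1,
                clip=1.0, noise_mult=1.0, seed=0):
    """Noisy gradient descent for a private M-estimator, sketched with
    the Gaussian mechanism: clip per-example gradients so the average
    has bounded sensitivity, then perturb it with noise of scale
    clip * noise_mult / n.  `per_example_grads(theta)` returning an
    (n, d) array is an assumed interface; calibrating `noise_mult` to a
    target privacy budget is not shown."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    for _ in range(steps):
        g = per_example_grads(theta)                      # shape (n, d)
        n = g.shape[0]
        norms = np.linalg.norm(g, axis=1, keepdims=True)
        g = g * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
        noisy_mean = g.mean(axis=0) + rng.normal(
            scale=clip * noise_mult / n, size=theta.shape)
        theta = theta - lr * noisy_mean
    return theta
```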

We introduce a novel procedure to perform Bayesian non-parametric inference with right-censored data, the \emph{beta-Stacy bootstrap}. This approximates the posterior law of summaries of the survival distribution (e.g. the mean survival time). More precisely, our procedure approximates the joint posterior law of functionals of the beta-Stacy process, a non-parametric process prior that generalizes the Dirichlet process and that is widely used in survival analysis. The beta-Stacy bootstrap generalizes and unifies other common Bayesian bootstraps for complete or censored data based on non-parametric priors. It is defined by an exact sampling algorithm that does not require tuning of Markov chain Monte Carlo steps. We illustrate the beta-Stacy bootstrap by analyzing survival data from a real clinical trial.
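For the complete-data special case mentioned above, the procedure reduces to Rubin's Bayesian bootstrap, which the following sketch illustrates for the mean survival time. The handling of censoring, i.e. the actual beta-Stacy machinery, is not shown.

```python
import numpy as np

def bayesian_bootstrap_mean_survival(times, n_draws=2000, seed=0):
    """Rubin's Bayesian bootstrap for complete (uncensored) data, one of
    the special cases the beta-Stacy bootstrap unifies: each draw of
    Dirichlet(1, ..., 1) weights yields one posterior draw of the mean
    survival time, with no MCMC steps to tune."""
    rng = np.random.default_rng(seed)
    w = rng.dirichlet(np.ones(len(times)), size=n_draws)
    return w @ np.asarray(times, dtype=float)

post = bayesian_bootstrap_mean_survival([2.1, 3.5, 0.8, 5.0, 1.2])
lo, hi = np.percentile(post, [2.5, 97.5])   # 95% posterior interval
```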

Bayesian methods estimate a measure of uncertainty by using the posterior distribution. One source of difficulty in these methods is the computation of the normalizing constant. Calculating the exact posterior is generally intractable, so it is usually approximated. Variational Inference (VI) methods approximate the posterior with a distribution, usually chosen from a simple family, using optimization. The main contribution of this work is a set of update rules for natural-gradient variational inference with mixtures of Gaussians, which can be run independently for each of the mixture components, potentially in parallel.
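As a hedged sketch of one common form such a per-component update can take (in the "variational online Newton" style, not necessarily this work's exact rules): the precision of a Gaussian component tracks a stochastic negative-Hessian estimate of the log joint, and the mean takes a Newton-like step.

```python
import numpy as np

def ngvi_gaussian_step(m, prec, grad, neg_hess, lr=0.1):
    """One natural-gradient step for a Gaussian q = N(m, prec^{-1}):
    the precision is an exponential moving average of a (stochastic)
    negative-Hessian estimate of the log joint, and the mean moves
    along a Newton-like direction.  `grad` and `neg_hess` are assumed
    to be estimates evaluated at a draw from q; in practice `neg_hess`
    is kept positive definite (e.g. via a Gauss-Newton surrogate)."""
    prec_new = (1.0 - lr) * prec + lr * neg_hess
    m_new = m + lr * np.linalg.solve(prec_new, grad)
    return m_new, prec_new
```

For a mixture of Gaussians, each component would receive its own responsibility-weighted version of this step, which is what makes the independent, potentially parallel per-component scheme possible.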

To drive purchases in online advertising, it is of great interest to the advertiser to optimize the sequential advertising strategy, whose performance and interpretability are both important. The lack of interpretability in existing deep reinforcement learning methods makes it difficult to understand, diagnose, and further optimize the strategy. In this paper, we propose our Deep Intents Sequential Advertising (DISA) method to address these issues. The key to interpretability is understanding a consumer's purchase intent, which is, however, unobservable (a hidden state). We model this intent as a latent variable and formulate the problem as a Partially Observable Markov Decision Process (POMDP) in which the underlying intents are inferred from observable behaviors. Large-scale industrial offline and online experiments demonstrate our method's superior performance over several baselines. The inferred hidden states are analyzed, and the results support the rationality of our inference.
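Inferring hidden intents from observable behaviors amounts to a Bayes-filter belief update over the latent states, sketched below with illustrative transition and observation tensors standing in for the learned POMDP components.

```python
import numpy as np

def belief_update(b, action, obs, T, O):
    """Bayes-filter update of the belief over hidden purchase intents.
    b: belief over S latent states; T[a]: (S, S) state-transition matrix
    for action a; O[a]: (S, n_obs) observation-likelihood matrix.  All
    tensors are illustrative stand-ins for the learned model."""
    pred = b @ T[action]                # predict the next hidden state
    post = pred * O[action][:, obs]     # weight by observation likelihood
    return post / post.sum()            # renormalize to a distribution
```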

We consider the exploration-exploitation trade-off in reinforcement learning and we show that an agent imbued with a risk-seeking utility function is able to explore efficiently, as measured by regret. The parameter that controls how risk-seeking the agent is can be optimized exactly, or annealed according to a schedule. We call the resulting algorithm K-learning and show that the corresponding K-values are optimistic for the expected Q-values at each state-action pair. The K-values induce a natural Boltzmann exploration policy for which the `temperature' parameter is equal to the risk-seeking parameter. This policy achieves an expected regret bound of $\tilde O(L^{3/2} \sqrt{S A T})$, where $L$ is the time horizon, $S$ is the number of states, $A$ is the number of actions, and $T$ is the total number of elapsed time-steps. This bound is only a factor of $L$ larger than the established lower bound. K-learning can be interpreted as mirror descent in the policy space, and it is similar to other well-known methods in the literature, including Q-learning, soft Q-learning, and maximum entropy policy gradient, and is closely related to optimism and count-based exploration methods. K-learning is simple to implement, as it only requires adding a bonus to the reward at each state-action pair and then solving a Bellman equation. We conclude with a numerical example demonstrating that K-learning is competitive with other state-of-the-art algorithms in practice.
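A tabular sketch of the Bellman recursion just described: add a bonus to the reward, back up with a log-sum-exp ("soft") value whose temperature is the risk-seeking parameter, and read off the Boltzmann policy. The bonus and temperature schedules from the paper are not reproduced; the inputs are illustrative.

```python
import numpy as np
from scipy.special import logsumexp

def k_values(R, P, bonus, tau, L):
    """Finite-horizon backward recursion for the K-values:
    K_l(s,a) = r(s,a) + bonus(s,a)
               + E_{s'~P}[ tau * logsumexp_{a'} K_{l+1}(s',a') / tau ].
    R and bonus are (S, A) arrays; P is an (A, S, S) transition tensor."""
    S, A = R.shape
    K = np.zeros((L + 1, S, A))
    for l in range(L - 1, -1, -1):
        V = tau * logsumexp(K[l + 1] / tau, axis=1)   # soft value, shape (S,)
        K[l] = R + bonus + np.einsum('ast,t->sa', P, V)
    return K

# Boltzmann exploration policy with temperature tau (the risk-seeking
# parameter): pi(a | s) proportional to exp(K[0][s, a] / tau).
```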

Topic models have been widely explored as probabilistic generative models of documents. Traditional inference methods have sought closed-form derivations for updating the models; however, as the expressiveness of these models grows, so does the difficulty of performing fast and accurate inference over their parameters. This paper presents alternative neural approaches to topic modelling by providing parameterisable distributions over topics which permit training by backpropagation in the framework of neural variational inference. In addition, with the help of a stick-breaking construction, we propose a recurrent network that is able to discover a notionally unbounded number of topics, analogous to Bayesian non-parametric topic models. Experimental results on the MXM Song Lyrics, 20NewsGroups and Reuters News datasets demonstrate the effectiveness and efficiency of these neural topic models.
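The stick-breaking construction at the heart of the unbounded-topic idea is easy to state in code: Beta-distributed fractions are peeled off a unit-length stick to yield topic proportions. The Beta(1, 5) fractions below are illustrative; in the model they would be emitted by the recurrent network.

```python
import numpy as np

def stick_breaking(v):
    """Break a unit-length stick: pi_k = v_k * prod_{j<k} (1 - v_j).
    Earlier fractions claim larger pieces, so the effective number of
    topics grows only as the data warrant it."""
    return v * np.cumprod(np.concatenate(([1.0], 1.0 - v[:-1])))

rng = np.random.default_rng(0)
v = rng.beta(1.0, 5.0, size=20)   # illustrative Beta fractions
pi = stick_breaking(v)            # topic proportions, sum(pi) < 1
```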

Owing to recent advances in "Big Data" modeling and prediction tasks, variational Bayesian estimation has gained popularity for its ability to provide tractable approximations to the posterior. One key technique for approximate inference is stochastic variational inference (SVI). SVI poses variational inference as a stochastic optimization problem and solves it iteratively using noisy gradient estimates. It aims to handle massive data for predictive and classification tasks by applying complex Bayesian models that have observed as well as latent variables. This paper aims to decentralize SVI, allowing parallel computation and bringing secure-learning and robustness benefits. We use the Alternating Direction Method of Multipliers (ADMM) in a top-down setting to develop a distributed SVI algorithm such that independent learners running inference algorithms need only share their estimated model parameters instead of their private datasets. Our work extends the distributed SVI-ADMM algorithm that we first propose to an ADMM-based networked SVI algorithm in which the learners not only work distributively but also share information according to the rules of a graph by which they form a network. This kind of work lies under the umbrella of `deep learning over networks', and we verify our algorithm on a topic-modeling problem for a corpus of Wikipedia articles. We illustrate the results on a latent Dirichlet allocation (LDA) topic model for large document classification, compare performance with the centralized algorithm, and use numerical experiments to corroborate the analytical results.
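One consensus-ADMM exchange, in which learners share only parameter estimates (never raw data), can be sketched as follows. The local SVI step that produces each learner's estimate is omitted and the variable names are illustrative.

```python
import numpy as np

def admm_consensus_round(local_params, duals, rho=1.0):
    """One consensus-ADMM round (Boyd et al. style): the coordinator
    averages the learners' shared estimates into a global variable z,
    and each learner moves its dual variable toward consensus.
    `local_params` holds each learner's current SVI parameter estimate;
    private datasets never leave the learners."""
    z = np.mean([x + u / rho for x, u in zip(local_params, duals)], axis=0)
    duals = [u + rho * (x - z) for x, u in zip(local_params, duals)]
    return z, duals
```

In the networked variant described above, the global averaging step would be replaced by averaging over each learner's graph neighbors.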
