
Markov chain Monte Carlo (MCMC) is an all-purpose tool that allows one to generate dependent replicates from a posterior distribution for effectively any Bayesian hierarchical model. As such, MCMC has become a standard in Bayesian statistics. However, convergence issues, tuning, and the effective sample size of the MCMC are nontrivial considerations that are often overlooked or can be difficult to assess. Moreover, these practical issues can produce a significant computational burden. This motivates us to consider finding closed-form expressions of the posterior distribution that are computationally straightforward to sample from directly. We focus on a broad class of Bayesian generalized linear mixed-effects models (GLMMs) that allow one to jointly model data of different types (e.g., Gaussian, Poisson, and binomial distributed observations). Exact sampling from the posterior distribution for Bayesian GLMMs is such a difficult problem that it is now arguably overlooked as a possible problem to solve. To solve this problem, we derive a new class of distributions that gives one the flexibility to specify the prior on fixed and random effects to be any conjugate multivariate distribution. We refer to this new distribution as the generalized conjugate multivariate (GCM) distribution, and several technical results are provided. The expression of the exact posterior distribution is given along with the steps to obtain direct independent simulations from the posterior distribution. These direct simulations have an efficient projection/regression form, and hence, we refer to our method as Exact Posterior Regression (EPR). Several theoretical results are developed that create the foundation for EPR. Illustrative examples are provided including a simulation study and an analysis of estimates from the U.S. Census Bureau's American Community Survey (ACS).
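
As a point of reference for what "direct, independent posterior draws" mean in a conjugate setting, the sketch below samples exactly from the posterior of a Gaussian linear model with a Gaussian prior. It is only a minimal illustration of direct conjugate sampling, not the paper's GCM distribution or the EPR algorithm; the design matrix, prior variances, and noise variance are invented for the example.

```python
# Minimal sketch (not the paper's GCM/EPR): direct, independent draws from the
# exact posterior of a Gaussian linear model with a conjugate Gaussian prior,
# beta ~ N(0, tau^2 I), y | beta ~ N(X beta, sigma^2 I), sigma^2 and tau^2 known.
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
beta_true = rng.normal(size=p)
sigma2, tau2 = 1.0, 10.0
y = X @ beta_true + rng.normal(scale=np.sqrt(sigma2), size=n)

# Conjugacy gives beta | y ~ N(mu_post, Sigma_post) in closed form,
# so no MCMC (and no convergence or tuning diagnostics) is needed.
Sigma_post = np.linalg.inv(X.T @ X / sigma2 + np.eye(p) / tau2)
mu_post = Sigma_post @ (X.T @ y / sigma2)

# Independent posterior replicates via a single Cholesky factor ("regression" form).
L = np.linalg.cholesky(Sigma_post)
draws = mu_post + rng.normal(size=(1000, p)) @ L.T
print(draws.mean(axis=0), beta_true)
```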

Related content

Artificial Neural Networks (ANNs) can be viewed as nonlinear sieves that can approximate complex functions of high dimensional variables more effectively than linear sieves. We investigate the performance of various ANNs in nonparametric instrumental variables (NPIV) models of moderately high dimensional covariates that are relevant to empirical economics. We present two efficient procedures for estimation and inference on a weighted average derivative (WAD): an orthogonalized plug-in with optimally-weighted sieve minimum distance (OP-OSMD) procedure and a sieve efficient score (ES) procedure. Both estimators for WAD use ANN sieves to approximate the unknown NPIV function and are root-n asymptotically normal and first-order equivalent. We provide a detailed practitioner's recipe for implementing both efficient procedures. We compare their finite-sample performances in various simulation designs that involve smooth NPIV functions of up to 13 continuous covariates, different nonlinearities and covariate correlations. Some Monte Carlo findings include: 1) tuning and optimization are more delicate in ANN estimation; 2) given proper tuning, both ANN estimators with various architectures can perform well; 3) ANN OP-OSMD estimators are easier to tune than ANN ES estimators; 4) stable inferences are more difficult to achieve with ANN (than spline) estimators; 5) there are gaps between current implementations and approximation theories. Finally, we apply ANN NPIV to estimate average partial derivatives in two empirical demand examples with multivariate covariates.
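
The sketch below is a minimal linear-sieve NPIV estimator in NumPy, intended only to illustrate the sieve minimum distance idea that the ANN estimators build on; it is not the OP-OSMD or ES procedure, and the data-generating process, polynomial bases, and sieve dimensions are arbitrary assumptions.

```python
# Minimal linear-sieve NPIV sketch (a simplification, not the paper's ANN procedures):
# approximate h in E[y - h(X) | W] = 0 by a polynomial sieve psi(X)'c and project
# onto an instrument sieve b(W).
import numpy as np

rng = np.random.default_rng(1)
n = 2000
w = rng.uniform(-1, 1, size=n)                        # instrument
u = rng.normal(size=n)                                # unobserved confounder
x = 0.8 * w + 0.5 * u + 0.3 * rng.normal(size=n)      # endogenous regressor
y = np.sin(np.pi * x) + u + 0.3 * rng.normal(size=n)

def poly(v, k):
    return np.column_stack([v ** j for j in range(k + 1)])

Psi = poly(x, 4)                                      # sieve for the structural function h
B = poly(w, 6)                                        # sieve for the instrument space
P = B @ np.linalg.pinv(B.T @ B) @ B.T                 # projection onto span(B)
c_hat = np.linalg.solve(Psi.T @ P @ Psi, Psi.T @ P @ y)

grid = np.linspace(-1, 1, 5)
print(poly(grid, 4) @ c_hat)                          # estimated h on a grid
print(np.sin(np.pi * grid))                           # truth, for a rough comparison
```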

The conditional moment problem is a powerful formulation for describing structural causal parameters in terms of observables, a prominent example being instrumental variable regression. A standard approach reduces the problem to a finite set of marginal moment conditions and applies the optimally weighted generalized method of moments (OWGMM), but this requires that we know a finite set of identifying moments, can still be inefficient even if identifying, or can be theoretically efficient but practically unwieldy if we use a growing sieve of moment conditions. Motivated by a variational minimax reformulation of OWGMM, we define a very general class of estimators for the conditional moment problem, which we term the variational method of moments (VMM) and which naturally enables controlling infinitely many moments. We provide a detailed theoretical analysis of multiple VMM estimators, including ones based on kernel methods and neural nets, and provide conditions under which these are consistent, asymptotically normal, and semiparametrically efficient in the full conditional moment model. We additionally provide algorithms for valid statistical inference based on the same kind of variational reformulations, both for kernel- and neural-net-based varieties. Finally, we demonstrate the strong performance of our proposed estimation and inference algorithms in a detailed series of synthetic experiments.
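
For orientation, the sketch below implements the classical two-step optimally weighted GMM (OWGMM) with a finite set of moments for a linear IV model, i.e., the baseline that VMM generalizes. It is not a kernel- or neural-net-based VMM estimator, and the simulated design is purely illustrative.

```python
# Two-step optimally weighted GMM (OWGMM) sketch for a linear IV model.
import numpy as np

rng = np.random.default_rng(2)
n = 5000
z = rng.normal(size=(n, 3))                           # instruments
u = rng.normal(size=n)
x = z @ np.array([1.0, 0.5, -0.5]) + u + rng.normal(size=n)
beta_true = 2.0
y = beta_true * x + u + rng.normal(size=n)

def gmm(W):
    # Moments g_i(b) = z_i (y_i - x_i b) are linear in b, so the minimizer of
    # g-bar(b)' W g-bar(b) has a closed form.
    A = z.T @ x / n                                   # -d g-bar / d b
    b = z.T @ y / n                                   # g-bar(0)
    return (A @ W @ b) / (A @ W @ A)

b1 = gmm(np.eye(3))                                   # step 1: identity weighting
g = z * (y - x * b1)[:, None]                         # moments evaluated at b1
W_opt = np.linalg.inv(g.T @ g / n)                    # step 2: optimal weighting matrix
b2 = gmm(W_opt)
print(b1, b2, beta_true)
```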

Accurate power and sample size estimation are crucial to the design and analysis of genetic association studies. When analyzing a binary trait via logistic regression, important covariates such as age and sex are typically included in the model. However, their effects are rarely properly considered in power or sample size computation during study planning. Unlike when analyzing a continuous trait, the power of association testing between a binary trait and a genetic variant depends, explicitly, on covariate effects, even under the assumption of gene-environment independence. Earlier work recognized this hidden factor, but the implemented methods are not flexible. We thus propose and implement a generalized method for estimating power and sample size for (discovery or replication) association studies of binary traits that a) accommodates different types of non-genetic covariates E, b) deals with different types of G-E relationships, and c) is computationally efficient. Extensive simulation studies show that the proposed method is accurate and computationally efficient for both prospective and retrospective sampling designs with various covariate structures. A proof-of-principle application focused on the understudied African sample in the UK Biobank data. Results show that, in contrast to studying the continuous blood pressure trait, ignoring the covariate effects of age and sex when analyzing the binary hypertension trait leads to overestimated power and an underestimated replication sample size.
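
A simulation-based (rather than analytical) way to approximate such power calculations is sketched below, assuming statsmodels is available. The allele frequency, effect sizes, covariate model, and significance threshold are placeholder values, and this is not the paper's proposed method.

```python
# Hedged sketch: Monte Carlo power estimate for testing a genetic variant G on a
# binary trait with a non-genetic covariate E included in the logistic model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)

def simulated_power(n, maf=0.2, beta_g=0.2, beta_e=0.5, alpha=5e-8, reps=50):
    rejections = 0
    for _ in range(reps):
        g = rng.binomial(2, maf, size=n)               # additive genotype coding
        e = rng.normal(size=n)                         # e.g. standardized age
        eta = -1.0 + beta_g * g + beta_e * e
        y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))
        X = sm.add_constant(np.column_stack([g, e]))
        fit = sm.Logit(y, X).fit(disp=0)
        rejections += fit.pvalues[1] < alpha            # Wald p-value for G
    return rejections / reps

print(simulated_power(n=20000))
```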

Methods such as low-rank approximations, covariance tapering and composite likelihoods are useful for speeding up inference in settings where the true likelihood is unknown or intractable to compute. However, such methods can lead to considerable model misspecification, which is particularly challenging to deal with and can provide misleading results when performing Bayesian inference. Adjustment methods for improving properties of posteriors based on composite and/or misspecified likelihoods have been developed, but these cannot be used in conjunction with certain computationally efficient inference software, such as the \texttt{R-INLA} package. We develop a novel adjustment method aimed at postprocessing posterior distributions to improve the properties of their credible intervals. The adjustment method can be used in conjunction with any software for Bayesian inference that allows for sampling from the posterior. Several simulation studies demonstrate that the adjustment method performs well in a variety of different settings. The method is also applied to improve the modelling of extreme hourly precipitation from high-resolution radar data in Norway, using the spatial conditional extremes model and \texttt{R-INLA}. The results show a clear improvement in performance for all model fits after applying the posterior adjustment method.
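
As a loose illustration of what postprocessing posterior draws can look like, the sketch below applies a generic affine adjustment that rescales samples about their mean so that their covariance matches a user-supplied target (e.g., a sandwich estimate). This is not the paper's adjustment method, and the target covariance here is made up.

```python
# Generic affine posterior adjustment sketch: map posterior draws so their
# covariance matches a target, widening or narrowing credible intervals.
import numpy as np

def adjust_posterior(samples, target_cov):
    """samples: (S, d) posterior draws; target_cov: (d, d) desired covariance."""
    mean = samples.mean(axis=0)
    centered = samples - mean
    L_post = np.linalg.cholesky(np.cov(samples, rowvar=False))
    L_target = np.linalg.cholesky(target_cov)
    M = L_target @ np.linalg.inv(L_post)        # maps posterior spread to target spread
    return mean + centered @ M.T

rng = np.random.default_rng(4)
draws = rng.multivariate_normal([0.0, 1.0], [[1.0, 0.3], [0.3, 0.5]], size=4000)
adjusted = adjust_posterior(draws, np.array([[2.0, 0.2], [0.2, 1.0]]))
print(np.cov(adjusted, rowvar=False))           # approximately the target covariance
```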

In statistical applications, it is common to encounter parameters supported on a varying or unknown dimensional space. Examples include fused lasso regression, matrix recovery under an unknown low rank, etc. Despite the ease of obtaining a point estimate via optimization, it is much more challenging to quantify their uncertainty -- in the Bayesian framework, a major difficulty is that if one assigns a prior associated with a $p$-dimensional measure, then there is zero posterior probability on any lower-dimensional subset with dimension $d<p$; to avoid this caveat, one needs to choose another dimension-selection prior on $d$, which often involves a highly combinatorial problem. To significantly reduce the modeling burden, we propose a new generative process for the prior: starting from a continuous random variable such as a multivariate Gaussian, we transform it into a varying-dimensional space using the proximal mapping. This leads to a large class of new Bayesian models that can directly exploit the popular frequentist regularizations and their algorithms, such as the nuclear norm penalty and the alternating direction method of multipliers, while providing a principled and probabilistic uncertainty estimation. We show that this framework is well justified in geometric measure theory, and enjoys a convenient posterior computation via the standard Hamiltonian Monte Carlo. We demonstrate its use in the analysis of dynamic flow network data.
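
The generative idea is easy to prototype: the sketch below pushes Gaussian draws through soft-thresholding, the proximal mapping of the l1 penalty, so that prior draws have positive probability of exact zeros, i.e., of landing on lower-dimensional subsets. The threshold value is an arbitrary choice for illustration.

```python
# Minimal sketch of the generative prior: continuous Gaussian draws transformed
# by a proximal map (here soft-thresholding, the prox of lam * ||.||_1).
import numpy as np

def soft_threshold(z, lam):
    """Proximal mapping of lam * ||.||_1, applied elementwise."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

rng = np.random.default_rng(5)
z = rng.normal(size=(10000, 8))                # continuous latent variable
theta = soft_threshold(z, lam=0.7)             # prior draws on unions of subspaces

# Empirical prior probability that a coordinate is exactly zero.
print((theta == 0.0).mean())
```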

The behavior of many Bayesian models used in machine learning critically depends on the choice of prior distributions, controlled by some hyperparameters that are typically selected by Bayesian optimization or cross-validation. This requires repeated, costly, posterior inference. We provide an alternative for selecting good priors without carrying out posterior inference, building on the prior predictive distribution that marginalizes out the model parameters. We estimate virtual statistics for data generated by the prior predictive distribution and then optimize over the hyperparameters to learn ones for which these virtual statistics match target values provided by the user or estimated from (a subset of) the observed data. We apply the principle for probabilistic matrix factorization, for which good solutions for prior selection have been missing. We show that for Poisson factorization models we can analytically determine the hyperparameters, including the number of factors, that best replicate the target statistics, and we study empirically the sensitivity of the approach to model mismatch. We also present a model-independent procedure that determines the hyperparameters for general models by stochastic optimization, and demonstrate this extension in the context of hierarchical matrix factorization models.
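
A crude, model-independent version of prior predictive matching can be sketched as follows for a toy Poisson factorization model: simulate prior predictive data over a grid of hyperparameter values and keep the value whose virtual statistic (here the mean count) is closest to a target. This grid search is a stand-in for the paper's analytical and stochastic-optimization procedures, and all model dimensions are invented.

```python
# Prior predictive matching sketch for a small Poisson factorization model with
# Gamma(shape, rate) priors on the factor entries.
import numpy as np

rng = np.random.default_rng(6)

def prior_predictive_mean(rate, shape=1.0, K=5, n=200, m=100, reps=20):
    vals = []
    for _ in range(reps):
        U = rng.gamma(shape, 1.0 / rate, size=(n, K))   # user factors
        V = rng.gamma(shape, 1.0 / rate, size=(m, K))   # item factors
        Y = rng.poisson(U @ V.T)                        # prior predictive data
        vals.append(Y.mean())
    return np.mean(vals)

target_mean = 2.0
rates = np.linspace(0.5, 5.0, 20)
stats = np.array([prior_predictive_mean(r) for r in rates])
best = rates[np.argmin(np.abs(stats - target_mean))]
print(best, prior_predictive_mean(best))
# For this toy model E[Y_ij] = K * (shape / rate)^2, so the match also has a closed form.
```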

To validate the safety of automated vehicles (AV), scenario-based testing aims to systematically describe driving scenarios an AV might encounter. In this process, continuous inputs such as velocities result in an infinite number of possible variations of a scenario. Thus, metamodels are used to perform analyses or to select specific variations for examination. However, despite the safety criticality of AV testing, metamodels are usually seen as a part of an overall approach, and their predictions are not questioned. This paper analyzes the predictive performance of Gaussian processes (GP), deep Gaussian processes, extra-trees, and Bayesian neural networks (BNN), considering four scenarios with 5 to 20 inputs. Building on this, an iterative approach is introduced and evaluated, which allows test cases for common analysis tasks to be selected efficiently. The results show that regarding predictive performance, the appropriate selection of test cases is more important than the choice of metamodels. However, the choice of metamodels remains crucial: Their great flexibility allows BNNs to benefit from large amounts of data and to model even the most complex scenarios. In contrast, less flexible models such as GPs offer higher reliability. Hence, relevant test cases are best explored using scalable virtual test setups and flexible models. Subsequently, more realistic test setups and more reliable models can be used for targeted testing and validation.
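
The iterative selection idea can be illustrated with a standard uncertainty-based acquisition rule: fit a GP metamodel and repeatedly add the candidate scenario with the largest predictive standard deviation. The criticality function, kernel, and budget below are assumptions for the sketch, not the paper's setup.

```python
# Uncertainty-driven test-case selection with a GP metamodel (illustrative only).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(7)

def criticality(x):                            # stand-in for a costly scenario simulation
    return np.sin(3 * x[:, 0]) * np.exp(-x[:, 1]) + 0.05 * rng.normal(size=len(x))

candidates = rng.uniform(0, 1, size=(500, 2))  # scenario parameters, e.g. velocities
idx = list(rng.choice(500, size=5, replace=False))
X, y = candidates[idx], criticality(candidates[idx])

for _ in range(20):
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-3).fit(X, y)
    _, std = gp.predict(candidates, return_std=True)
    std[idx] = -np.inf                         # do not re-select simulated cases
    new = int(np.argmax(std))                  # most uncertain scenario next
    idx.append(new)
    X = np.vstack([X, candidates[new]])
    y = np.append(y, criticality(candidates[new:new + 1]))

print(len(idx), "test cases selected")
```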

To provide rigorous uncertainty quantification for online learning models, we develop a framework for constructing uncertainty sets that provably control risk -- such as coverage of confidence intervals, false negative rate, or F1 score -- in the online setting. This extends conformal prediction to apply to a larger class of online learning problems. Our method guarantees risk control at any user-specified level even when the underlying data distribution shifts drastically, even adversarially, over time in an unknown fashion. The technique we propose is highly flexible as it can be applied with any base online learning algorithm (e.g., a deep neural network trained online), requiring minimal implementation effort and essentially zero additional computational cost. We further extend our approach to control multiple risks simultaneously, so the prediction sets we generate are valid for all given risks. To demonstrate the utility of our method, we conduct experiments on real-world tabular time-series data sets showing that the proposed method rigorously controls various natural risks. Furthermore, we show how to construct valid intervals for an online image-depth estimation problem that previous sequential calibration schemes cannot handle.
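
A stylized version of online risk control for interval coverage is sketched below: after each observation, the interval width is nudged up on a miss and down on a hit, so the long-run miscoverage tracks the target level. This is a generic adaptive-calibration update in the same spirit, not the paper's algorithm, and the base predictor and data stream are placeholders.

```python
# Online calibration sketch: control miscoverage of prediction intervals on a stream.
import numpy as np

rng = np.random.default_rng(8)
alpha, gamma = 0.1, 0.05           # target miscoverage and step size
q = 1.0                            # interval half-width (the calibrated parameter)
misses = []

for t in range(5000):
    y_pred = 0.0                                   # any base online predictor goes here
    y_true = rng.standard_t(df=3)                  # heavy-tailed, possibly shifting stream
    err = float(abs(y_true - y_pred) > q)          # 1 if the interval missed
    q = max(q + gamma * (err - alpha), 1e-6)       # grow after a miss, shrink after a hit
    misses.append(err)

print(np.mean(misses))                             # long-run miscoverage near alpha
```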

Working in reproducing kernel Hilbert spaces (RKHS), we consider penalized least-squares estimation of partially functional linear models (PFLM), whose predictor contains both functional and traditional multivariate parts, and the multivariate part allows a divergent number of parameters. From the non-asymptotic point of view, we focus on the rate-optimal upper and lower bounds of the prediction error. An exact upper bound for the excess prediction risk is shown in a non-asymptotic form under a general assumption on the effective dimension of the model, by which we also show the prediction consistency when the number of multivariate covariates $p$ slightly increases with the sample size $n$. Our new finding implies a trade-off between the number of non-functional predictors and the effective dimension of the kernel principal components to ensure prediction consistency in the increasing-dimensional setting. The analysis in our proof hinges on a spectral condition for the sandwich operator formed by the covariance operator and the reproducing kernel, and on sub-Gaussian and Bernstein concentration inequalities for random elements in Hilbert space. Finally, we derive the non-asymptotic minimax lower bound under the regularity assumption of the Kullback-Leibler divergence of the models.
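
A small numerical sketch of penalized least squares for a PFLM is given below, discretizing the functional covariate on a grid and penalizing the slope function through a Gaussian kernel matrix. The kernel, penalty level, and data-generating process are illustrative choices, and the rate analysis described in the abstract is not reproduced.

```python
# PFLM sketch: y_i = Z_i' beta + \int X_i(t) b(t) dt + eps_i with a kernel penalty on b.
import numpy as np

rng = np.random.default_rng(9)
n, p, m = 300, 3, 50
t = np.linspace(0, 1, m)
dt = t[1] - t[0]

Z = rng.normal(size=(n, p))                                    # multivariate covariates
Xc = np.cumsum(rng.normal(size=(n, m)), axis=1) / np.sqrt(m)   # rough functional covariates
beta_true = np.array([1.0, -0.5, 0.25])
b_true = np.sin(2 * np.pi * t)
y = Z @ beta_true + dt * Xc @ b_true + 0.2 * rng.normal(size=n)

K = np.exp(-(t[:, None] - t[None, :]) ** 2 / 0.05)             # kernel matrix on the grid
F = dt * Xc @ K                                                # design for b(t) = (K c)(t)
lam = 1e-3
P_reg = F.T @ F + lam * K + 1e-8 * np.eye(m)                   # small jitter for stability

# Profile out the functional part, then recover it (partial-spline style solution).
A = F @ np.linalg.solve(P_reg, F.T)                            # smoother for the functional term
M = np.eye(n) - A
beta_hat = np.linalg.solve(Z.T @ M @ Z, Z.T @ M @ y)
c_hat = np.linalg.solve(P_reg, F.T @ (y - Z @ beta_hat))
print(beta_hat)                                                # compare with beta_true
print(np.corrcoef(K @ c_hat, b_true)[0, 1])                    # shape recovery of b(t)
```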

Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference. We study the impact of model size in this setting, focusing on Transformer models for NLP tasks that are limited by compute: self-supervised pretraining and high-resource machine translation. We first show that even though smaller Transformer models execute faster per iteration, wider and deeper models converge in significantly fewer steps. Moreover, this acceleration in convergence typically outpaces the additional computational overhead of using larger models. Therefore, the most compute-efficient training strategy is, counterintuitively, to train extremely large models but stop after a small number of iterations. This leads to an apparent trade-off between the training efficiency of large Transformer models and the inference efficiency of small Transformer models. However, we show that large models are more robust to compression techniques such as quantization and pruning than small models. Consequently, one can get the best of both worlds: heavily compressed, large models achieve higher accuracy than lightly compressed, small models.
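
To make the compression step concrete, the toy sketch below applies magnitude pruning and uniform int8 quantization to a single weight matrix; it is a schematic stand-in for compressing a large pretrained Transformer layer, with an arbitrary sparsity level and matrix size.

```python
# Schematic illustration of the two compression techniques mentioned above.
import numpy as np

rng = np.random.default_rng(10)
W = rng.normal(scale=0.02, size=(1024, 1024)).astype(np.float32)

# Magnitude pruning: zero out the 90% of weights with the smallest absolute value.
threshold = np.quantile(np.abs(W), 0.9)
W_pruned = np.where(np.abs(W) >= threshold, W, 0.0)

# Uniform int8 quantization: store weights as 8-bit integers plus one scale factor.
scale = np.abs(W).max() / 127.0
W_int8 = np.round(W / scale).astype(np.int8)
W_dequant = W_int8.astype(np.float32) * scale

print("sparsity:", (W_pruned == 0).mean())
print("quantization MSE:", float(np.mean((W - W_dequant) ** 2)))
```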
