
Discrete data are abundant and often arise as counts or rounded data. However, even for linear regression models, conjugate priors and closed-form posteriors are typically unavailable, thereby necessitating approximations or Markov chain Monte Carlo for posterior inference. For a broad class of count and rounded data regression models, we introduce conjugate priors that enable closed-form posterior inference. Key posterior and predictive functionals are computable analytically or via direct Monte Carlo simulation. Crucially, the predictive distributions are discrete to match the support of the data and can be evaluated or simulated jointly across multiple covariate values. These tools are broadly useful for linear regression, nonlinear models via basis expansions, and model and variable selection. Multiple simulation studies demonstrate significant advantages in computing, predictive modeling, and selection relative to existing alternatives.
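
A minimal sketch of the general idea, assuming a latent Gaussian linear regression with a conjugate normal-inverse-gamma prior whose closed-form posterior is sampled directly, and a rounding operator that maps latent predictive draws onto the nonnegative integers so that the predictive distribution matches the support of the data. The plug-in surrogate for the latent response and all hyperparameters below are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated count data: y is a rounded latent Gaussian response (illustrative).
n, p = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 0.5, -0.3])
z = X @ beta_true + rng.normal(scale=0.7, size=n)
y = np.maximum(np.floor(z), 0)          # rounded / count-valued observations

# Latent model z ~ N(X beta, sigma^2 I) with conjugate prior
# (beta, sigma^2) ~ NIG(m0, V0, a0, b0).
m0, V0 = np.zeros(p), 100 * np.eye(p)
a0, b0 = 2.0, 2.0

# Closed-form NIG posterior given a latent response (here a simple surrogate
# z* = y + 0.5 is plugged in for illustration; the paper handles the rounded-
# data likelihood exactly).
z_star = y + 0.5
V0_inv = np.linalg.inv(V0)
Vn = np.linalg.inv(V0_inv + X.T @ X)
mn = Vn @ (V0_inv @ m0 + X.T @ z_star)
an = a0 + n / 2
bn = b0 + 0.5 * (z_star @ z_star + m0 @ V0_inv @ m0 - mn @ np.linalg.inv(Vn) @ mn)

# Direct Monte Carlo from the posterior and the *discrete* predictive.
S = 5000
sigma2 = stats.invgamma.rvs(an, scale=bn, size=S, random_state=rng)
beta = mn + (rng.standard_normal((S, p)) @ np.linalg.cholesky(Vn).T) * np.sqrt(sigma2)[:, None]
x_new = np.array([1.0, 0.2, -1.0])
z_pred = beta @ x_new + rng.normal(size=S) * np.sqrt(sigma2)
y_pred = np.maximum(np.floor(z_pred), 0)   # predictive draws live on the data's support
print("posterior mean of beta:", mn.round(2))
print("predictive P(y_new = 0):", np.mean(y_pred == 0).round(3))
```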

Related content

Large-scale datasets are important for the development of deep learning models. Such datasets usually require a heavy annotation workload, which is extremely time-consuming and expensive. To accelerate the annotation procedure, multiple annotators may be employed to label different subsets of the data. However, inconsistency and bias among annotators are harmful to model training, especially for qualitative and subjective tasks. To address this challenge, in this paper we propose a novel contrastive regression framework for the disjoint-annotation problem, in which each sample is labeled by only one annotator and the annotators work on disjoint subsets of the data. To account for both intra-annotator consistency and inter-annotator inconsistency, two strategies are employed. First, a contrastive ranking loss is applied to learn the relative ranking among samples labeled by the same annotator, under the assumption that rankings within an annotator are consistent. Second, a gradient reversal layer is applied to learn robust representations that are invariant to the annotator. Experiments on a facial expression prediction task and an image quality assessment task verify the effectiveness of the proposed framework.
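
A minimal PyTorch sketch of the two ingredients, assuming a shared encoder, a score head, and an annotator classifier placed behind a gradient reversal layer, with a pairwise logistic ranking loss computed only on pairs labeled by the same annotator. Module names, architecture sizes, and loss weighting are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, reversed (scaled) gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class DisjointAnnotatorModel(nn.Module):
    def __init__(self, in_dim, n_annotators, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.score_head = nn.Linear(hidden, 1)              # regression / ranking score
        self.annot_head = nn.Linear(hidden, n_annotators)   # adversarial annotator classifier

    def forward(self, x, lambd=1.0):
        h = self.encoder(x)
        score = self.score_head(h).squeeze(-1)
        annot_logits = self.annot_head(GradReverse.apply(h, lambd))
        return score, annot_logits

def intra_annotator_ranking_loss(scores, labels, annotators):
    """Pairwise logistic loss restricted to pairs labeled by the same annotator."""
    i, j = torch.combinations(torch.arange(len(scores)), r=2).unbind(1)
    same = annotators[i] == annotators[j]
    if same.sum() == 0:
        return scores.new_tensor(0.0)
    sign = torch.sign(labels[i[same]] - labels[j[same]])
    diff = scores[i[same]] - scores[j[same]]
    return F.softplus(-sign * diff).mean()

# One illustrative training step on random data.
model = DisjointAnnotatorModel(in_dim=16, n_annotators=3)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(32, 16)
labels = torch.rand(32)
annotators = torch.randint(0, 3, (32,))
scores, annot_logits = model(x)
loss = intra_annotator_ranking_loss(scores, labels, annotators) \
       + F.cross_entropy(annot_logits, annotators)   # adversarial term via gradient reversal
opt.zero_grad(); loss.backward(); opt.step()
```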

We consider a high-dimensional model in which variables are observed over time and space. The model consists of a spatio-temporal regression containing a time lag and a spatial lag of the dependent variable. Unlike classical spatial autoregressive models, we do not rely on a predetermined spatial interaction matrix but infer all spatial interactions from the data. Assuming sparsity, we estimate the spatial and temporal dependence in a fully data-driven way by penalizing a set of Yule-Walker equations. This regularization can be left unstructured, but we also propose customized shrinkage procedures when observations originate from spatial grids (e.g. satellite images). Finite-sample error bounds are derived and estimation consistency is established in an asymptotic framework in which the sample size and the number of spatial units diverge jointly. Exogenous variables can be included as well. A simulation exercise shows strong finite-sample performance compared to competing procedures. As an empirical application, we model satellite-measured NO2 concentrations in London. Our approach delivers forecast improvements over a competitive benchmark, and we find evidence of strong spatial interactions.
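
A minimal sketch of lasso-penalized Yule-Walker estimation for a spatio-temporal VAR(1)-type model, assuming an unstructured L1 penalty and using scikit-learn's Lasso as the solver; tuning-parameter selection, exogenous variables, and the grid-specific shrinkage schemes are omitted.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)

# Simulate a sparse spatial VAR(1): y_t = A y_{t-1} + e_t.
N, T = 20, 400                                     # spatial units, time points
A_true = np.zeros((N, N))
A_true[np.arange(N), np.arange(N)] = 0.4           # own lag
A_true[np.arange(N - 1), np.arange(1, N)] = 0.2    # neighbour lag
Y = np.zeros((T, N))
for t in range(1, T):
    Y[t] = Y[t - 1] @ A_true.T + rng.normal(scale=0.5, size=N)

# Sample autocovariances entering the Yule-Walker equations Gamma(1) = A Gamma(0).
Yc = Y - Y.mean(axis=0)
Gamma0 = Yc[:-1].T @ Yc[:-1] / (T - 1)
Gamma1 = Yc[1:].T @ Yc[:-1] / (T - 1)

# Penalize the Yule-Walker equations row by row:
#   A[i, :] = argmin_a || Gamma1[i, :] - Gamma0 a ||^2 + lambda * ||a||_1.
A_hat = np.zeros((N, N))
for i in range(N):
    fit = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000).fit(Gamma0, Gamma1[i])
    A_hat[i] = fit.coef_

print("max absolute estimation error:", np.abs(A_hat - A_true).max().round(3))
```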

This article extends the widely used synthetic control estimator for evaluating causal effects of policy changes to quantile functions. The proposed method provides a geometrically faithful estimate of the entire counterfactual quantile function of the treated unit. Its appeal stems from an efficient implementation via a constrained quantile-on-quantile regression, which constitutes a novel concept of independent interest. The method provides a unique counterfactual quantile function in any scenario: for continuous, discrete, or mixed distributions. It operates in both repeated cross-sections and panel data with as little as a single pre-treatment period. The article also provides abstract identification results by showing that any synthetic control method, classical or our generalization, provides the correct counterfactual for causal models that preserve distances between the outcome distributions. Working with whole quantile functions instead of aggregate values allows for tests of equality and stochastic dominance between the counterfactual and the observed distribution. It can provide causal inference on standard estimands such as average or quantile treatment effects, but also on more general concepts such as counterfactual Lorenz curves or interquartile ranges.
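
A minimal sketch of the distributional synthetic control step, assuming donor and treated quantile functions evaluated on a common probability grid and weights constrained to the simplex, solved with scipy's SLSQP; the constrained quantile-on-quantile regression in the article is more general than this plain least-squares version.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)

# Pre-treatment outcomes for one treated unit and J donor units (illustrative).
J, n = 5, 500
donors = [rng.normal(loc=mu, scale=1.0, size=n) for mu in np.linspace(-1, 1, J)]
treated = 0.6 * donors[1] + 0.4 * donors[3] + rng.normal(scale=0.05, size=n)

# Empirical quantile functions on a common grid of probabilities.
u = np.linspace(0.01, 0.99, 99)
Q_donors = np.column_stack([np.quantile(d, u) for d in donors])   # shape (99, J)
Q_treated = np.quantile(treated, u)

# Simplex-constrained least squares between quantile functions.
def objective(w):
    return np.sum((Q_treated - Q_donors @ w) ** 2)

res = minimize(objective, x0=np.full(J, 1 / J), method="SLSQP",
               bounds=[(0, 1)] * J,
               constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1}])
w_hat = res.x
print("synthetic control weights:", w_hat.round(2))

# Counterfactual quantile function of the treated unit (post-treatment donor
# quantiles would be plugged in here); sorting keeps it a valid quantile function.
Q_counterfactual = np.sort(Q_donors @ w_hat)
```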

We introduce a nonparametric graphical model for discrete node variables based on additive conditional independence. Additive conditional independence is a three-way statistical relation that shares similar properties with conditional independence by satisfying the semi-graphoid axioms. Based on this relation, we build an additive graphical model for discrete variables that does not suffer from the restrictions of a parametric model such as the Ising model. We develop an estimator of the new graphical model via penalized estimation of the discrete version of the additive precision operator and establish consistency of the estimator in the ultrahigh-dimensional setting. Along with these methodological developments, we also exploit the properties of discrete random variables to uncover a deeper relation between additive conditional independence and conditional independence than previously known. The new graphical model reduces to a conditional independence graphical model under certain sparsity conditions. We conduct simulation experiments and an analysis of an HIV antiretroviral therapy data set to compare the new method with existing ones.

The cost of both generalized least squares (GLS) and Gibbs sampling in a crossed random effects model can easily grow faster than $N^{3/2}$ for $N$ observations. Ghosh et al. (2020) develop a backfitting algorithm that reduces the cost to $O(N)$. Here we extend that method to a generalized linear mixed model for logistic regression. We use backfitting within an iteratively reweighted penalized least squares algorithm. The specific approach is a version of penalized quasi-likelihood due to Schall (1991). A straightforward version of Schall's algorithm would also cost more than $N^{3/2}$ because it requires the trace of the inverse of a large matrix. We approximate that quantity at cost $O(N)$ and prove that this substitution makes an asymptotically negligible difference. Our backfitting algorithm also collapses the fixed effect with one random effect at a time, in a way that is analogous to the collapsed Gibbs sampler of Papaspiliopoulos et al. (2020). We use a symmetric operator that facilitates efficient covariance computation. We illustrate our method on a real dataset from Stitch Fix. By properly accounting for crossed random effects, we show that a naive logistic regression could underestimate sampling variances by several hundredfold.
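
A minimal sketch of one $O(N)$ backfitting sweep for a linear crossed random effects model $y = \mu + a_{\text{row}} + b_{\text{col}} + \varepsilon$, alternating shrunken group-mean updates with the variance components assumed known. The paper's logistic-regression version wraps such sweeps inside iteratively reweighted penalized least squares and adds the collapsing and trace approximations, none of which is shown here.

```python
import numpy as np

rng = np.random.default_rng(3)

# Crossed design: each observation has a "row" factor (e.g. client) and a
# "column" factor (e.g. item).
N, R, C = 5000, 200, 100
row = rng.integers(0, R, N)
col = rng.integers(0, C, N)
a_true = rng.normal(scale=0.8, size=R)
b_true = rng.normal(scale=0.5, size=C)
y = 1.0 + a_true[row] + b_true[col] + rng.normal(scale=1.0, size=N)

sigma2_e, sigma2_a, sigma2_b = 1.0, 0.64, 0.25   # assumed known variance components
a_hat, b_hat = np.zeros(R), np.zeros(C)
mu_hat = y.mean()

for sweep in range(20):                           # each sweep costs O(N)
    # Update row effects given column effects: shrunken group means of residuals.
    resid = y - mu_hat - b_hat[col]
    sums = np.bincount(row, weights=resid, minlength=R)
    counts = np.bincount(row, minlength=R)
    a_hat = sums / (counts + sigma2_e / sigma2_a)
    # Update column effects given row effects.
    resid = y - mu_hat - a_hat[row]
    sums = np.bincount(col, weights=resid, minlength=C)
    counts = np.bincount(col, minlength=C)
    b_hat = sums / (counts + sigma2_e / sigma2_b)
    mu_hat = np.mean(y - a_hat[row] - b_hat[col])

print("corr(a_hat, a_true):", np.corrcoef(a_hat, a_true)[0, 1].round(3))
```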

We propose an efficient, reliable, and interpretable global solution method, $\textit{Deep learning-based algorithm for Heterogeneous Agent Models (DeepHAM)}$, for solving high-dimensional heterogeneous agent models with aggregate shocks. The state distribution is approximately represented by a set of optimal generalized moments. Deep neural networks are used to approximate the value and policy functions, and the objective is optimized over directly simulated paths. Besides being an accurate global solver, this method has three additional features. First, it is computationally efficient for solving complex heterogeneous agent models and does not suffer from the curse of dimensionality. Second, it provides a general and interpretable representation of the distribution over individual states, which is important for addressing the classical question of whether and how heterogeneity matters in macroeconomics. Third, it solves the constrained efficiency problem as easily as the competitive equilibrium, which opens up new possibilities for studying optimal monetary and fiscal policies in heterogeneous agent models with aggregate shocks.
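
A heavily simplified PyTorch sketch of the "optimize the objective over directly simulated paths" idea, applied to a toy single-agent consumption-savings problem with a neural-network policy. DeepHAM's generalized-moment representation of the cross-sectional distribution, its value-function approximation, and the aggregate shocks are not shown, and all parameter values are illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy problem: maximize E[sum_t beta^t log(c_t)] with wealth dynamics
# w_{t+1} = R (w_t - c_t) + income * shock, and c_t = policy(w_t) * w_t.
beta, R, income, T_horizon, n_paths = 0.95, 1.03, 1.0, 60, 512

policy = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1), nn.Sigmoid())
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for step in range(300):
    w = torch.ones(n_paths, 1) * 2.0                       # initial wealth
    utility = torch.zeros(n_paths, 1)
    discount = 1.0
    for t in range(T_horizon):
        frac = policy(w)                                   # consumed fraction of wealth
        c = frac * w + 1e-6
        utility = utility + discount * torch.log(c)
        shock = torch.exp(0.1 * torch.randn(n_paths, 1))   # idiosyncratic income shock
        w = R * (w - c) + income * shock                   # differentiable transition
        discount *= beta
    loss = -utility.mean()                                 # expected discounted utility
    opt.zero_grad(); loss.backward(); opt.step()

print("consumed fraction at w = 2:", policy(torch.tensor([[2.0]])).item())
```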

Variable selection is an important statistical problem. This problem becomes more challenging when the candidate predictors are of mixed type (e.g. continuous and binary) and impact the response variable in nonlinear and/or non-additive ways. In this paper, we review existing variable selection approaches for the Bayesian additive regression trees (BART) model, a nonparametric regression model, which is flexible enough to capture the interactions between predictors and nonlinear relationships with the response. An emphasis of this review is on the capability of identifying relevant predictors. We also propose two variable importance measures which can be used in a permutation-based variable selection approach, and a backward variable selection procedure for BART. We present simulations demonstrating that our approaches exhibit improved performance in terms of the ability to recover all the relevant predictors in a variety of data settings, compared to existing BART-based variable selection methods.
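
A minimal sketch of the permutation-based selection idea, using importance scores from a stand-in tree ensemble (scikit-learn's GradientBoostingRegressor) in place of a BART fit, since no BART implementation is assumed here; the threshold rule and the number of permutations are illustrative assumptions rather than the paper's recommendations.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(4)

# Nonlinear signal in x0 and x1; x2..x9 are irrelevant noise predictors.
n, p = 500, 10
X = rng.normal(size=(n, p))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(scale=0.3, size=n)

def importances(X, y):
    model = GradientBoostingRegressor(n_estimators=200, max_depth=3, random_state=0)
    return model.fit(X, y).feature_importances_

observed = importances(X, y)

# Null distribution: refit on permuted responses, breaking any X-y relation.
n_perm = 30
null = np.vstack([importances(X, rng.permutation(y)) for _ in range(n_perm)])
threshold = np.quantile(null.max(axis=1), 0.95)   # "global max" null cutoff

selected = np.where(observed > threshold)[0]
print("selected predictors:", selected)           # expected: [0 1]
```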

Modern-day problems in statistics often face the challenge of exploring and analyzing complex non-Euclidean object data that do not conform to vector space structures or operations. Examples of such data objects include covariance matrices, graph Laplacians of networks, and univariate probability distribution functions. In the current contribution, a new concurrent regression model is proposed to characterize the time-varying relation between an object in a general metric space (as a response) and a vector in $\mathbb{R}^p$ (as a predictor), where concepts from Fr\'echet regression are employed. Concurrent regression is a well-developed area of research for Euclidean predictors and responses, with many important applications in longitudinal studies and functional data; however, no such model is available so far for general object data as responses. We develop generalized versions of both global least squares regression and locally weighted least squares smoothing in the context of concurrent regression for responses situated in general metric spaces, and propose estimators that can accommodate sparse and/or irregular designs. Consistency results are demonstrated for sample estimates of appropriate population targets, along with the corresponding rates of convergence. The proposed models are illustrated with human mortality data and resting-state functional magnetic resonance imaging (fMRI) data as responses.
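
A minimal sketch of global Fréchet regression at a single time point for probability-distribution responses in the 2-Wasserstein metric, where the weighted Fréchet mean is a weighted average of quantile functions (sorted to remain monotone). The time-varying concurrent version and the smoothing for sparse or irregular designs are not shown; the projection step and simulation setup are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

# Each response is a probability distribution, stored as a quantile function on
# a fixed grid; the scalar predictor shifts the distribution's mean.
n, grid = 100, np.linspace(0.01, 0.99, 99)
x = rng.uniform(-1, 1, n)
Q = np.array([np.quantile(rng.normal(loc=2 * xi, scale=1.0, size=300), grid)
              for xi in x])                        # (n, 99) quantile functions

def global_frechet_fit(x, Q, x_new):
    """Weighted Frechet mean in Wasserstein space at predictor value x_new."""
    mu, var = x.mean(), x.var()
    s = 1 + (x - mu) * (x_new - mu) / var          # global Frechet regression weights
    Q_fit = (s[:, None] * Q).sum(axis=0) / s.sum()
    return np.sort(Q_fit)                          # crude projection onto monotone functions

Q_hat = global_frechet_fit(x, Q, x_new=0.5)
print("fitted mean of the distribution at x = 0.5:", Q_hat.mean().round(2))  # approx 1.0
```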

Existing work in counterfactual learning to rank (LTR) has focused on optimizing feature-based models that predict the optimal ranking based on document features. LTR methods based on bandit algorithms often optimize tabular models that memorize the optimal ranking per query. These types of models have their own advantages and disadvantages. Feature-based models provide very robust performance across many queries, including previously unseen ones; however, the available features often limit the rankings the model can predict. In contrast, tabular models can converge on any possible ranking through memorization. However, memorization is extremely prone to noise, which makes tabular models reliable only when large numbers of user interactions are available. Can we develop a robust counterfactual LTR method that pursues memorization-based optimization whenever it is safe to do so? We introduce the Generalization and Specialization (GENSPEC) algorithm, a robust feature-based counterfactual LTR method that pursues per-query memorization when it is safe to do so. GENSPEC optimizes a single feature-based model for generalization (robust performance across all queries) and many tabular models for specialization (each optimized for high performance on a single query). GENSPEC uses novel relative high-confidence bounds to choose which model to deploy per query. By doing so, GENSPEC enjoys the high performance of successfully specialized tabular models with the robustness of a generalized feature-based model. Our results show that GENSPEC leads to optimal performance on queries with sufficient click data, while behaving robustly on queries with little or noisy data.
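
A minimal sketch of the per-query deployment decision only, assuming hypothetical performance estimates and high-confidence half-widths for the specialized and generalized models; GENSPEC's actual relative bounds, which are derived from logged click data, are not reproduced here.

```python
import numpy as np

def choose_model_per_query(spec_lower, gen_upper):
    """
    Deploy the specialized (memorization-based) model for a query only when its
    high-confidence lower performance bound beats the generalized model's upper
    bound; otherwise fall back to the robust feature-based model.
    """
    return np.where(spec_lower > gen_upper, "specialized", "generalized")

# Illustrative bounds for five queries: wide intervals reflect little click data.
spec_estimate  = np.array([0.82, 0.75, 0.90, 0.60, 0.88])
spec_halfwidth = np.array([0.02, 0.20, 0.03, 0.25, 0.01])   # shrinks with more clicks
gen_estimate   = np.full(5, 0.78)
gen_halfwidth  = np.full(5, 0.01)

choice = choose_model_per_query(spec_estimate - spec_halfwidth,
                                gen_estimate + gen_halfwidth)
print(choice)   # specialization only where the click data make the bound tight enough
```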

Discrete random structures are important tools in Bayesian nonparametrics, and the resulting models have proven effective in density estimation, clustering, topic modeling, and prediction, among other tasks. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, which is inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and then normalising to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes and develop a Markov chain Monte Carlo sampler for Bayesian inference. A test for distributional homogeneity across groups is obtained as a by-product. The results and their inferential implications are showcased on synthetic and real data.
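
A minimal finite-dimensional sketch of the construction: group-specific random probability measures obtained by adding a common and an idiosyncratic (truncated) gamma completely random measure and normalising. Using a single shared atom set, the truncation level, and the gamma parameters are all illustrative simplifications of the actual latent nested process.

```python
import numpy as np

rng = np.random.default_rng(6)

K = 50                        # truncation level: number of shared atoms
atoms = rng.normal(size=K)    # common atom locations

def latent_nested_measures(n_groups, c_common=1.0, c_group=1.0):
    """Finite approximation: normalize (common + group-specific) gamma CRM weights."""
    w_common = rng.gamma(c_common / K, 1.0, size=K)       # shared CRM weights
    measures = []
    for _ in range(n_groups):
        w_group = rng.gamma(c_group / K, 1.0, size=K)     # idiosyncratic CRM weights
        w = w_common + w_group
        measures.append(w / w.sum())                      # dependent random probability
    return measures

P1, P2 = latent_nested_measures(n_groups=2)
# Dependence across groups comes from the shared component; it ranges between
# full exchangeability (c_group -> 0) and independence (c_common -> 0).
print("correlation of weights across groups:", np.corrcoef(P1, P2)[0, 1].round(2))
```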
