
For predictive evaluation based on quasi-posterior distributions, we develop a new information criterion, the posterior covariance information criterion (PCIC). PCIC generalises the widely applicable information criterion (WAIC) so as to effectively handle predictive scenarios where the likelihoods used for estimation and for evaluation of the model may differ. A typical example of such scenarios is weighted likelihood inference, including prediction under covariate shift and counterfactual prediction. The proposed criterion takes a posterior covariance form and is computed using only one Markov chain Monte Carlo run. Through numerical examples, we demonstrate how PCIC can be applied in practice. Further, we show that PCIC is an asymptotically unbiased estimator of the quasi-Bayesian generalization error under mild conditions in weighted inference with both regular and singular statistical models.
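As a rough illustration of how such a criterion can be computed from a single MCMC run, the sketch below evaluates a WAIC-style criterion with a posterior-covariance bias correction on a toy weighted-likelihood problem. The specific form used here (posterior mean of the evaluation log-likelihood plus a per-observation posterior covariance between the weighted estimation log-likelihood and the evaluation log-likelihood) is an assumption inferred from the abstract, not the paper's exact definition.

```python
# Minimal sketch: a WAIC-style criterion with a posterior-covariance
# bias correction, computed from a single set of MCMC draws.
# The exact PCIC formula is given in the paper; the form below is an
# illustrative assumption based on the abstract (weights w_i mimic
# weighted likelihood inference under covariate shift).
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: normal model, S posterior draws of the mean, n observations.
n, S = 100, 4000
y = rng.normal(0.5, 1.0, size=n)
w = rng.uniform(0.5, 2.0, size=n)                    # importance weights
theta = rng.normal(y.mean(), 1/np.sqrt(n), size=S)   # stand-in for MCMC draws

# log p(y_i | theta_s): shape (S, n)
logp = -0.5*np.log(2*np.pi) - 0.5*(y[None, :] - theta[:, None])**2

logp_est  = w[None, :] * logp   # log-likelihood used for estimation (weighted)
logp_eval = logp                # log-likelihood used for evaluation

# Training-loss term: posterior mean of the evaluation log-likelihood.
train = logp_eval.mean(axis=0)

# Bias correction: posterior covariance between estimation and
# evaluation log-likelihoods, one covariance per observation.
cov = ((logp_est - logp_est.mean(0)) * (logp_eval - logp_eval.mean(0))).mean(axis=0)

pcic_like = -(train.mean()) + cov.mean()
print(f"posterior-covariance criterion (illustrative): {pcic_like:.4f}")
```

With all weights equal to one, the covariance term reduces to the posterior variance of the log-likelihood, which is the familiar WAIC bias correction.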

Related Content

The journal 《計算機信息》 publishes high-quality papers that expand the scope of operations research and computing, seeking original research papers on theory, methodology, experiments, systems, and applications, as well as novel survey and tutorial papers and papers describing new and useful software tools. Official website link:
17 March 2023

This paper considers the problem of inference in cluster randomized experiments when cluster sizes are non-ignorable. Here, by a cluster randomized experiment, we mean one in which treatment is assigned at the level of the cluster; by non-ignorable cluster sizes we mean that "large" clusters and "small" clusters may be heterogeneous, and, in particular, the effects of the treatment may vary across clusters of differing sizes. In order to permit this sort of flexibility, we consider a sampling framework in which cluster sizes themselves are random. In this way, our analysis departs from earlier analyses of cluster randomized experiments in which cluster sizes are treated as non-random. We distinguish between two different parameters of interest: the equally-weighted cluster-level average treatment effect, and the size-weighted cluster-level average treatment effect. For each parameter, we provide methods for inference in an asymptotic framework where the number of clusters tends to infinity and treatment is assigned using a covariate-adaptive stratified randomization procedure. We additionally permit the experimenter to sample only a subset of the units within each cluster rather than the entire cluster and demonstrate the implications of such sampling for some commonly used estimators. A small simulation study and empirical demonstration show the practical relevance of our theoretical results.
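The sketch below contrasts the two estimands named in the abstract on simulated data where the treatment effect varies with cluster size. The difference-in-means estimators are illustrative; the paper's inference procedure under covariate-adaptive stratification is not reproduced.

```python
# Minimal sketch: equally-weighted vs size-weighted cluster-level ATE.
# When effects vary with cluster size, the two parameters differ.
import numpy as np

rng = np.random.default_rng(1)
G = 500                                   # number of clusters
sizes = rng.integers(5, 200, size=G)      # random, non-ignorable cluster sizes
treat = rng.binomial(1, 0.5, size=G)      # treatment assigned at cluster level

# Treatment effect varies with cluster size: larger clusters benefit more.
cluster_effect = 0.5 + 0.01 * sizes
cluster_mean = rng.normal(0, 1, size=G) + treat * cluster_effect

# Equally-weighted cluster-level ATE: every cluster counts once.
ate_eq = cluster_mean[treat == 1].mean() - cluster_mean[treat == 0].mean()

# Size-weighted cluster-level ATE: clusters count in proportion to size.
def wmean(x, w):
    return np.sum(w * x) / np.sum(w)

ate_sw = (wmean(cluster_mean[treat == 1], sizes[treat == 1])
          - wmean(cluster_mean[treat == 0], sizes[treat == 0]))

print(f"equally-weighted ATE: {ate_eq:.3f}, size-weighted ATE: {ate_sw:.3f}")
```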

A new nonparametric estimator for Toeplitz covariance matrices is proposed. This estimator is based on a data transformation that translates the problem of Toeplitz covariance matrix estimation to the problem of mean estimation in an approximate Gaussian regression. The resulting Toeplitz covariance matrix estimator is positive definite by construction, fully data-driven and computationally very fast. Moreover, this estimator is shown to be minimax optimal under the spectral norm for a large class of Toeplitz matrices. These results are readily extended to estimation of inverses of Toeplitz covariance matrices. Also, an alternative version of the Whittle likelihood for the spectral density based on the Discrete Cosine Transform (DCT) is proposed. The method is implemented in the R package vstdct that accompanies the paper.
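To illustrate why routing the estimate through the spectral domain yields positive definiteness by construction, here is a plain smoothed-periodogram version of the idea. It is not the paper's variance-stabilizing DCT method (see the vstdct R package for that); it only shows the Herglotz-type mechanism: autocovariances obtained from a nonnegative spectral estimate always assemble into a positive semidefinite Toeplitz matrix.

```python
# Minimal sketch: a nonnegative spectral estimate guarantees a PSD
# Toeplitz covariance matrix. Not the paper's DCT-based estimator.
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(2)
n = 2048
# Toy stationary series: AR(1) with coefficient 0.6.
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.6 * x[t-1] + rng.normal()

# Periodogram, then a simple moving-average smoother (stays nonnegative).
freqs = 2*np.pi*np.arange(n)/n
per = np.abs(np.fft.fft(x - x.mean()))**2 / n
kernel = np.ones(31) / 31
f_hat = np.convolve(np.r_[per[-15:], per, per[:15]], kernel, mode='valid')

# Autocovariances from the nonnegative spectral estimate (Herglotz form):
# gamma_k = mean_j f(omega_j) cos(omega_j * k)  ->  PSD Toeplitz matrix.
p = 50
k = np.arange(p)
gamma = (f_hat[None, :] * np.cos(np.outer(k, freqs))).mean(axis=1)
Sigma = toeplitz(gamma)

print("min eigenvalue:", np.linalg.eigvalsh(Sigma).min())  # >= 0 up to rounding
```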

Selection of covariates is crucial in the estimation of average treatment effects given observational data with high or even ultra-high dimensional pretreatment variables. Existing methods for this problem typically assume sparse linear models for both outcome and univariate treatment, and cannot handle situations with ultra-high dimensional covariates. In this paper, we propose a new covariate selection strategy called double screening prior adaptive lasso (DSPAL) to select confounders and predictors of the outcome for multivariate treatments, which combines the adaptive lasso method with marginal conditional (in)dependence prior information to select target covariates, in order to eliminate confounding bias and improve statistical efficiency. The distinctive features of our proposal are that it can be applied to high-dimensional or even ultra-high dimensional covariates for multivariate treatments, and can deal with both parametric and nonparametric outcome models, which makes it more robust than other methods. Our theoretical analyses show that the proposed procedure enjoys the sure screening property, the ranking consistency property and the variable selection consistency. Through a simulation study, we demonstrate that the proposed approach selects all confounders and predictors consistently and estimates the multivariate treatment effects with smaller bias and mean squared error compared to several alternatives under various scenarios. In real data analysis, the method is applied to estimate the causal effect of a three-dimensional continuous environmental treatment on cholesterol level, yielding informative results.
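A generic screen-then-adaptive-lasso pipeline in the spirit of the abstract is sketched below. The DSPAL prior construction itself is not reproduced; this shows only a marginal screen followed by an adaptive lasso implemented via the usual feature-rescaling trick.

```python
# Minimal sketch: marginal screening + adaptive lasso (not DSPAL itself).
import numpy as np
from sklearn.linear_model import LassoCV, RidgeCV

rng = np.random.default_rng(3)
n, p = 200, 2000                       # ultra-high dimensional covariates
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:5] = [2, -1.5, 1, 0.8, -0.7]
y = X @ beta + rng.normal(size=n)

# Step 1: marginal screening -- keep covariates with the largest
# absolute marginal correlation with the outcome.
corr = np.abs((X - X.mean(0)).T @ (y - y.mean())) / n
keep = np.argsort(corr)[-int(n / np.log(n)):]      # sure-screening-style cut
Xs = X[:, keep]

# Step 2: adaptive lasso -- an initial ridge fit gives per-feature weights;
# rescaling columns by |beta_init| turns a plain lasso into an adaptive lasso.
ridge = RidgeCV().fit(Xs, y)
w = np.abs(ridge.coef_) + 1e-6
lasso = LassoCV(cv=5).fit(Xs * w, y)
beta_hat = lasso.coef_ * w                         # back to the original scale

selected = keep[np.nonzero(beta_hat)[0]]
print("selected covariates:", np.sort(selected))
```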

The linear regression model is widely used in the biomedical and social sciences as well as in policy and business research to adjust for covariates and estimate the average effects of treatments. Behind every causal inference endeavor there is at least a notion of a randomized experiment. However, in routine regression analyses in observational studies, it is unclear how well the adjustments made by regression approximate key features of randomization experiments, such as covariate balance, study representativeness, sample boundedness, and unweighted sampling. In this paper, we provide software to empirically address this question. In the new lmw package for R, we compute the implied linear model weights for average treatment effects and provide diagnostics for them. The weights are obtained as part of the design stage of the study; that is, without using outcome information. The implementation is general and applicable, for instance, in settings with instrumental variables and multi-valued treatments; in essence, in any situation where the linear model is the vehicle for adjustment and estimation of average treatment effects with discrete-valued interventions.
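The core fact behind such diagnostics is that the OLS coefficient on a binary treatment is an exact weighted sum of the outcomes, with weights determined by the design matrix alone. The sketch below demonstrates this principle in plain NumPy; the lmw R package computes these implied weights together with balance and representativeness diagnostics.

```python
# Minimal sketch: implied unit-level weights of the OLS treatment
# coefficient, computed from the design matrix only (no outcome data).
import numpy as np

rng = np.random.default_rng(4)
n = 500
X = rng.normal(size=(n, 3))                     # covariates
A = rng.binomial(1, 0.4, size=n).astype(float)  # binary treatment
Y = 1.0 * A + X @ np.array([0.5, -0.3, 0.2]) + rng.normal(size=n)

Z = np.column_stack([np.ones(n), A, X])         # design matrix [1, A, X]
# Row of the OLS coefficient map that produces the treatment coefficient:
W = np.linalg.solve(Z.T @ Z, Z.T)               # shape (p, n)
w = W[1]                                        # implied unit-level weights

tau_ols = np.linalg.lstsq(Z, Y, rcond=None)[0][1]
tau_w = w @ Y
print(f"OLS coefficient: {tau_ols:.4f}, weighted sum of outcomes: {tau_w:.4f}")
print(f"weights sum (should be ~0): {w.sum():.2e}")
# Design-stage diagnostic: w was computed without ever touching Y.
```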

Genito-Pelvic Pain/Penetration Disorder (GPPPD) is a common disorder but rarely treated in routine care. Previous research documents that GPPPD symptoms can be treated effectively using internet-based psychological interventions. However, non-response remains common for all state-of-the-art treatments and it is unclear which patient groups are expected to benefit most from an internet-based intervention. Multivariable prediction models are increasingly used to identify predictors of heterogeneous treatment effects, and to allocate treatments with the greatest expected benefits. In this study, we developed and internally validated a multivariable decision tree model that predicts effects of an internet-based treatment on a multidimensional composite score of GPPPD symptoms. Data of a randomized controlled trial comparing the internet-based intervention to a waitlist control group ($N$ = 200) was used to develop a decision tree model using model-based recursive partitioning. Model performance was assessed by examining the apparent and bootstrap bias-corrected performance. The final pruned decision tree consisted of one splitting variable, joint dyadic coping, based on which two response clusters emerged. No effect was found for patients with low dyadic coping ($n$ = 33; $d$ = 0.12; 95% CI: -0.57 to 0.80), while large effects ($d$ = 1.00; 95% CI: 0.68 to 1.32; $n$ = 167) were predicted for those with high dyadic coping at baseline. The bootstrap bias-corrected performance of the model was $R^2$ = 27.74% (RMSE = 13.22).
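For readers unfamiliar with the internal-validation step, the sketch below shows bootstrap bias-corrected (optimism-corrected) performance estimation on synthetic data. A plain regression tree stands in for model-based recursive partitioning; the clinical data and the MOB algorithm itself are not reproduced.

```python
# Minimal sketch: Harrell-style optimism correction of apparent R^2.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(5)
n = 200
X = rng.normal(size=(n, 4))                    # e.g. baseline predictors
y = np.where(X[:, 0] > 0, 1.0, 0.1) + rng.normal(0, 0.5, size=n)

def fit(X, y):
    return DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)

# Apparent performance: evaluate on the same data used for fitting.
apparent = r2_score(y, fit(X, y).predict(X))

# Optimism: average over bootstrap resamples of
# (performance on the resample) - (performance on the original data).
B, optimism = 200, 0.0
for _ in range(B):
    idx = rng.integers(0, n, size=n)
    m = fit(X[idx], y[idx])
    optimism += r2_score(y[idx], m.predict(X[idx])) - r2_score(y, m.predict(X))
optimism /= B

print(f"apparent R2: {apparent:.3f}, bias-corrected R2: {apparent - optimism:.3f}")
```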

This work formulates a new approach to reduced modeling of parameterized, time-dependent partial differential equations (PDEs). The method employs Operator Inference, a scientific machine learning framework combining data-driven learning and physics-based modeling. The parametric structure of the governing equations is embedded directly into the reduced-order model, and parameterized reduced-order operators are learned via a data-driven linear regression problem. The result is a reduced-order model that can be solved rapidly to map parameter values to approximate PDE solutions. Such parameterized reduced-order models may be used as physics-based surrogates for uncertainty quantification and inverse problems that require many forward solves of parametric PDEs. Numerical issues such as well-posedness and the need for appropriate regularization in the learning problem are considered, and an algorithm for hyperparameter selection is presented. The method is illustrated for a parametric heat equation and demonstrated for the FitzHugh-Nagumo neuron model.
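The regression step at the heart of Operator Inference is easy to state for a linear, non-parametric problem: project snapshots onto a POD basis and learn the reduced operator by regularized least squares. The sketch below does exactly that; the parametric version in the paper additionally embeds the parameter dependence into the learned operators.

```python
# Minimal sketch: Operator Inference for a linear system dq/dt = A q.
import numpy as np

rng = np.random.default_rng(6)
# Full-order linear system (a stand-in for a discretized PDE).
N, r, steps, dt = 200, 10, 400, 1e-3
A = -np.diag(np.linspace(1, 50, N)) + 0.1 * rng.normal(size=(N, N))

Q = np.zeros((N, steps)); Q[:, 0] = rng.normal(size=N)
for k in range(steps - 1):                      # explicit Euler snapshots
    Q[:, k+1] = Q[:, k] + dt * A @ Q[:, k]
dQ = np.gradient(Q, dt, axis=1)                 # time-derivative data

# POD basis from the snapshot matrix.
V = np.linalg.svd(Q, full_matrices=False)[0][:, :r]
Qh, dQh = V.T @ Q, V.T @ dQ                     # reduced states / derivatives

# Operator inference: min_Ahat ||Ahat Qh - dQh||^2 + lam ||Ahat||^2
lam = 1e-8
Ahat = dQh @ Qh.T @ np.linalg.inv(Qh @ Qh.T + lam * np.eye(r))

# Sanity check: the intrusive projection V' A V should be close to Ahat.
print("error vs intrusive operator:",
      np.linalg.norm(Ahat - V.T @ A @ V) / np.linalg.norm(V.T @ A @ V))
```

The Tikhonov term `lam` plays the role of the regularization whose necessity the paper discusses; here it is fixed by hand rather than selected by the paper's hyperparameter algorithm.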

We develop and analyze a principled approach to kernel ridge regression under covariate shift. The goal is to learn a regression function with small mean squared error over a target distribution, based on unlabeled data from the target distribution and labeled data that may have a different feature distribution. We propose to split the labeled data into two subsets and conduct kernel ridge regression on them separately to obtain a collection of candidate models and an imputation model. We use the latter to fill in the missing labels and then select the best candidate model accordingly. Our non-asymptotic excess risk bounds show that in quite general scenarios, our estimator adapts to the structure of the target distribution as well as to the covariate shift. It achieves the minimax optimal error rate up to a logarithmic factor. The use of pseudo-labels in model selection does not have a major negative impact.
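A runnable sketch of the split-and-select idea follows: fit candidate kernel ridge models on one half of the labeled data, fit an imputation model on the other half, pseudo-label the unlabeled target covariates, and pick the candidate that best matches the pseudo-labels. The hyperparameter grid and the imputation model's settings are assumptions for illustration.

```python
# Minimal sketch: model selection for kernel ridge regression under
# covariate shift using pseudo-labels on the target covariates.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(7)
f = lambda x: np.sin(3 * x[:, 0])

# Source (labeled) and target (unlabeled) covariates differ in distribution.
Xs = rng.normal(0, 1, size=(300, 1))
ys = f(Xs) + 0.3 * rng.normal(size=300)
Xt = rng.normal(1.5, 0.5, size=(300, 1))        # covariate-shifted target

half = len(Xs) // 2
X1, y1, X2, y2 = Xs[:half], ys[:half], Xs[half:], ys[half:]

# Candidate models on split 1, imputation model on split 2.
grid = [(g, a) for g in (0.1, 1.0, 10.0) for a in (1e-3, 1e-1, 1.0)]
candidates = [KernelRidge(kernel='rbf', gamma=g, alpha=a).fit(X1, y1)
              for g, a in grid]
imputer = KernelRidge(kernel='rbf', gamma=1.0, alpha=1e-2).fit(X2, y2)

pseudo = imputer.predict(Xt)                    # fill in the missing labels
scores = [np.mean((m.predict(Xt) - pseudo) ** 2) for m in candidates]
best = candidates[int(np.argmin(scores))]
print("selected (gamma, alpha):", grid[int(np.argmin(scores))])
print("target MSE of selected model:", np.mean((best.predict(Xt) - f(Xt))**2))
```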

Causal inference in spatial settings is met with unique challenges and opportunities. On one hand, a unit's outcome can be affected by the exposure at many locations, leading to interference. On the other hand, unmeasured spatial variables can confound the effect of interest. Our work has two overarching goals. First, using causal diagrams, we illustrate that spatial confounding and interference can manifest as each other, meaning that investigating the presence of one can lead to wrongful conclusions in the presence of the other, and that statistical dependencies in the exposure variable can render standard analyses invalid. This can have crucial implications for analyzing data with spatial or other dependencies, and for understanding the effect of interventions on dependent units. Second, we propose a parametric approach to mitigate bias from local and neighborhood unmeasured spatial confounding and account for interference simultaneously. This approach is based on simultaneous modeling of the exposure and the outcome while accounting for the presence of spatially-structured unmeasured predictors of both variables. We illustrate our approach with a simulation study and with an analysis of the local and interference effects of sulfur dioxide emissions from power plants on cardiovascular mortality.
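The bias mechanism is easy to reproduce in simulation: units on a line share a smooth confounder that drives both exposure and outcome, and each unit's outcome also depends on its neighbors' exposure. The sketch below illustrates this; the simultaneous exposure/outcome model from the paper is not reproduced, and the trend basis used for spatial adjustment is only a crude stand-in.

```python
# Minimal sketch: spatial confounding plus interference on a line of units.
import numpy as np

rng = np.random.default_rng(8)
n = 2000
s = np.linspace(0, 10, n)                      # locations on a line
U = np.sin(s) + 0.5 * np.cos(3 * s)            # smooth unmeasured confounder

Z = 0.8 * U + rng.normal(0, 1, size=n)         # exposure, spatially confounded
Znb = 0.5 * (np.roll(Z, 1) + np.roll(Z, -1))   # neighborhood exposure
Y = 1.0 * Z + 0.5 * Znb + 1.5 * U + rng.normal(0, 1, size=n)

def ols(design, y):
    return np.linalg.lstsq(design, y, rcond=None)[0]

one = np.ones(n)
naive = ols(np.column_stack([one, Z]), Y)[1]               # ignores both
interf = ols(np.column_stack([one, Z, Znb]), Y)[1:3]       # adds interference
# Crude spatial adjustment: a low-order trend basis standing in for the
# spatially-structured term of a joint exposure/outcome model.
basis = np.column_stack([np.sin(k * s) for k in (1, 2, 3)]
                        + [np.cos(k * s) for k in (1, 2, 3)])
full = ols(np.column_stack([one, Z, Znb, basis]), Y)[1:3]

print(f"naive direct effect:    {naive:.2f}  (truth 1.0)")
print(f"with interference only: {interf[0]:.2f}, {interf[1]:.2f}")
print(f"with spatial basis too: {full[0]:.2f}, {full[1]:.2f}  (truth 1.0, 0.5)")
```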

The generalized inverse Gaussian-Poisson (GIGP) distribution proposed by Sichel in the 1970s has proved to be a flexible fitting tool for diverse frequency data, collectively described using the item production model. In this paper, we identify the limit shape (specified as an incomplete gamma function) of the properly scaled diagrammatic representations of random samples from the GIGP distribution (known as Young diagrams). We also show that fluctuations are asymptotically normal and, moreover, the corresponding empirical random process is approximated via a rescaled Brownian motion in inverted time, with the inhomogeneous time scale determined by the limit shape. Here, the limit is taken as the number of production sources grows to infinity, coupled with an intrinsic parameter regime ensuring that the mean number of items per source is large. More precisely, for convergence to the limit shape to be valid, this combined growth should be fast enough. In the opposite regime referred to as "chaotic", the empirical random process is approximated by means of an inhomogeneous Poisson process in inverted time. These results are illustrated using both computer simulations and some classic data sets in informetrics.
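A small simulation conveys the diagrammatic picture: sample counts from a GIGP model (Poisson mixed over a generalized inverse Gaussian rate) and form the Young-diagram boundary x ↦ #{sources with count ≥ x}. The parameter values below are arbitrary illustrations, and only the qualitative scaled shape is printed, not the incomplete-gamma limit itself.

```python
# Minimal sketch: empirical Young-diagram boundary of a GIGP sample.
import numpy as np
from scipy.stats import geninvgauss, poisson

rng = np.random.default_rng(9)
m = 100_000                                  # number of production sources
p_gig, b_gig, scale = -0.5, 0.1, 50.0        # regime with a large mean count

lam = geninvgauss.rvs(p_gig, b_gig, scale=scale, size=m, random_state=rng)
counts = poisson.rvs(lam, random_state=rng)
counts = counts[counts > 0]                  # Sichel fits are usually 0-truncated

# Young-diagram boundary: fraction of sources with at least x items,
# with the x-axis scaled by the mean count per source.
xs = np.arange(1, counts.max() + 1)
boundary = np.array([(counts >= x).sum() for x in xs]) / len(counts)
x_scaled = xs / counts.mean()

for x, y in zip(x_scaled[::len(xs)//8], boundary[::len(xs)//8]):
    print(f"x = {x:5.2f}   scaled boundary = {y:.4f}")
```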

Knowledge graphs represent factual knowledge about the world as relationships between concepts and are critical for intelligent decision making in enterprise applications. New knowledge is inferred from the existing facts in the knowledge graphs by encoding the concepts and relations into low-dimensional feature vector representations. The most effective representations for this task, called Knowledge Graph Embeddings (KGE), are learned through neural network architectures. Due to their impressive predictive performance, they are increasingly used in high-impact domains like healthcare, finance and education. However, are the black-box KGE models adversarially robust for use in domains with high stakes? This thesis argues that state-of-the-art KGE models are vulnerable to data poisoning attacks, that is, their predictive performance can be degraded by systematically crafted perturbations to the training knowledge graph. To support this argument, two novel data poisoning attacks are proposed that craft input deletions or additions at training time to subvert the learned model's performance at inference time. These adversarial attacks target the task of predicting the missing facts in knowledge graphs using KGE models, and the evaluation shows that the simpler attacks are competitive with or outperform the computationally expensive ones. The thesis contributions not only highlight and provide an opportunity to fix the security vulnerabilities of KGE models, but also help to understand the black-box predictive behaviour of KGE models.
