
Background: Estimations of causal effects from observational data are subject to various sources of bias. These biases can be adjusted for by using negative control outcomes, which are not affected by the treatment. The empirical calibration procedure uses negative controls to calibrate p-values, and both negative and positive controls to calibrate the coverage of the 95% confidence interval of the outcome of interest. Although empirical calibration has been used in several large observational studies, there has been no systematic examination of its effect under different bias scenarios. Methods: The effect of empirical calibration of confidence intervals was analyzed using simulated datasets with known treatment effects. The simulations used binary treatments and binary outcomes, with simulated biases resulting from an unmeasured confounder, model misspecification, measurement error, and lack of positivity. The performance of empirical calibration was evaluated by measuring the change in confidence interval coverage and in the bias of the outcome of interest. Results: Empirical calibration increased the 95% confidence interval coverage of the outcome of interest under most settings, but was inconsistent in adjusting the bias of the outcome of interest. Empirical calibration was most effective when adjusting for unmeasured confounding bias. Suitable negative controls had a large impact on the adjustment made by empirical calibration, but small improvements in the coverage of the outcome of interest were also observable when using unsuitable negative controls. Conclusions: This work adds evidence for the efficacy of empirical calibration in calibrating the confidence intervals of treatment effects in observational studies. We recommend empirical calibration of confidence intervals, especially when there is a risk of unmeasured confounding.
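
To make the calibration step concrete, here is a minimal sketch of confidence-interval calibration in the spirit described above: negative-control estimates (true effect null) are used to fit a normal systematic-error model on the log scale, and the interval of interest is shifted and widened accordingly. All numbers and the method-of-moments fit are illustrative assumptions, not the study's actual code.

```python
# Minimal sketch of empirical confidence-interval calibration. The simple
# systematic-error model (normal bias on the log scale, estimated from
# negative controls whose true effect is null) is an illustrative assumption.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated negative-control estimates: true log effect = 0, but a shared
# systematic bias shifts them. Each comes with its own standard error.
true_bias, n_controls = 0.3, 30
se_nc = rng.uniform(0.1, 0.3, n_controls)
log_rr_nc = rng.normal(true_bias, se_nc)

# Fit the systematic error distribution N(mu, tau^2) to the negative controls
# (method of moments for brevity; maximum likelihood is the usual choice).
mu = log_rr_nc.mean()
tau2 = max(log_rr_nc.var(ddof=1) - np.mean(se_nc**2), 0.0)

# Calibrate the 95% CI of the outcome of interest: shift by mu and widen the
# standard error to include the systematic component tau^2.
log_rr, se = 0.8, 0.15                      # hypothetical estimate of interest
se_cal = np.sqrt(se**2 + tau2)
z = stats.norm.ppf(0.975)
naive = (log_rr - z * se, log_rr + z * se)
calibrated = (log_rr - mu - z * se_cal, log_rr - mu + z * se_cal)
print("naive 95% CI:     ", np.round(np.exp(naive), 3))
print("calibrated 95% CI:", np.round(np.exp(calibrated), 3))
```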

Related content

Outcomes of data-driven AI models cannot be assumed to be always correct. To estimate the uncertainty in these outcomes, the uncertainty wrapper framework has been proposed, which considers uncertainties related to model fit, input quality, and scope compliance. Uncertainty wrappers use a decision tree approach to cluster input-quality-related uncertainties, assigning inputs strictly to distinct uncertainty clusters. Hence, a slight variation in only one feature may lead to a cluster assignment with a significantly different uncertainty. Our objective is to replace this with an approach that mitigates the hard decision boundaries of these assignments while preserving interpretability, runtime complexity, and prediction performance. Five approaches were selected as candidates and integrated into the uncertainty wrapper framework. For the evaluation based on the Brier score, datasets for a pedestrian detection use case were generated using the CARLA simulator and YOLOv3. All integrated approaches achieved a softening, i.e., smoothing, of uncertainty estimation. Yet, compared to decision trees, they are harder to interpret and have higher runtime complexity. Moreover, some components of the Brier score deteriorated while others improved. Random forests were the most promising with respect to the Brier score. In conclusion, softening hard decision tree boundaries appears to be a trade-off decision.
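
The hard-versus-soft boundary issue can be illustrated with a small sketch: a shallow decision tree assigns each input to one uncertainty cluster, while a random forest averages many trees and yields smoother error-probability estimates, compared here by Brier score. The synthetic quality features below are stand-ins, not the CARLA/YOLOv3 data.

```python
# Sketch: hard decision-tree uncertainty clusters vs. a random forest's
# smoothed estimates, scored with the Brier score. Data are synthetic
# stand-ins for input quality factors (e.g., visibility).
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(5000, 2))               # two quality features
p_error = 1 / (1 + np.exp(-6 * (X[:, 0] - 0.5)))    # smooth true error rate
y = rng.uniform(size=5000) < p_error                # 1 = model prediction wrong

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
tree = DecisionTreeClassifier(max_leaf_nodes=8, random_state=0).fit(X_tr, y_tr)
forest = RandomForestClassifier(n_estimators=200, max_leaf_nodes=8,
                                random_state=0).fit(X_tr, y_tr)

for name, model in [("tree", tree), ("forest", forest)]:
    p = model.predict_proba(X_te)[:, 1]             # estimated uncertainty
    print(name, "Brier score:", round(brier_score_loss(y_te, p), 4))
```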

Disagreement remains on what the target estimand should be for population-adjusted indirect treatment comparisons. This debate is of central importance for policy-makers and applied practitioners in health technology assessment. Misunderstandings are based on properties inherent to estimators, not estimands, and on generalizing conclusions based on linear regression to non-linear models. Estimators of marginal estimands need not be unadjusted and may be covariate-adjusted. The population-level interpretation of conditional estimates follows from collapsibility and does not necessarily hold for the underlying conditional estimands. For non-collapsible effect measures, neither conditional estimates nor estimands have a population-level interpretation. Estimators of marginal effects tend to be more precise and efficient than estimators of conditional effects where the measure of effect is non-collapsible. In any case, such comparisons are inconsequential for estimators targeting distinct estimands. Statistical efficiency should not drive the choice of the estimand. On the other hand, the estimand, selected on the basis of relevance to decision-making, should drive the choice of the most efficient estimator. Health technology assessment agencies make reimbursement decisions at the population level. Therefore, marginal estimands are required. Current pairwise population adjustment methods such as matching-adjusted indirect comparison are restricted to target marginal estimands that are specific to the comparator study sample. These may not be relevant for decision-making. Multilevel network meta-regression (ML-NMR) can potentially target marginal estimands in any population of interest. Such a population could be characterized by decision-makers using increasingly available "real-world" data sources. Therefore, ML-NMR presents new directions and abundant opportunities for evidence synthesis.
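
Non-collapsibility, which drives much of the marginal-versus-conditional distinction above, can be checked numerically: in the toy example below the covariate is balanced and independent of treatment (so there is no confounding), the conditional odds ratio is identical in both strata, and yet the marginal odds ratio differs. All coefficients are illustrative.

```python
# Toy numeric check of non-collapsibility of the odds ratio: with a balanced
# covariate independent of treatment, stratum-specific odds ratios agree,
# yet the marginal odds ratio is smaller. Numbers are illustrative.
import numpy as np

def expit(x):
    return 1 / (1 + np.exp(-x))

def odds(p):
    return p / (1 - p)

beta_trt, beta_cov = 1.5, 2.0   # conditional log-ORs of treatment, covariate
p = {}
for trt in (0, 1):
    for cov in (0, 1):
        p[trt, cov] = expit(-1 + beta_trt * trt + beta_cov * cov)

# Conditional ORs within each covariate stratum: both equal exp(1.5) ~ 4.48.
for cov in (0, 1):
    print(f"OR | cov={cov}:", round(odds(p[1, cov]) / odds(p[0, cov]), 3))

# Marginal OR, averaging risks over a 50/50 covariate distribution: ~3.41.
m1 = 0.5 * (p[1, 0] + p[1, 1])
m0 = 0.5 * (p[0, 0] + p[0, 1])
print("marginal OR:", round(odds(m1) / odds(m0), 3))
```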

Percentiles, and more generally quantiles, are commonly used in various contexts to summarize data. For most distributions, there is exactly one quantile that can be estimated without bias. For distributions like the Gaussian, where the mean and median coincide, that quantile is the median. There are different ways to estimate quantiles from finite samples, described in the literature and implemented in statistics packages. It is possible to leverage the memoryless property of the exponential distribution and design high-quality estimators that are unbiased and have low variance and mean squared error. Naturally, these estimators outperform the ones in statistical packages when the underlying distribution is exponential. But they also happen to generalize well when that assumption is violated.
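
The exponential case can be sketched directly: for Exp(rate) the p-quantile is -log(1-p)/rate, and because the sample mean is unbiased for 1/rate, the plug-in estimator mean(x) * (-log(1-p)) is unbiased for every quantile. The simulation below compares it against numpy's default order-statistic estimator; it illustrates the core idea rather than the paper's exact estimators.

```python
# Compare an unbiased closed-form exponential quantile estimator against a
# generic order-statistic estimator, by simulation.
import numpy as np

rng = np.random.default_rng(2)
rate, p, n, reps = 2.0, 0.9, 20, 200_000
true_q = -np.log(1 - p) / rate

samples = rng.exponential(1 / rate, size=(reps, n))
closed_form = samples.mean(axis=1) * (-np.log(1 - p))   # uses E[mean] = 1/rate
order_stat = np.quantile(samples, p, axis=1)            # generic estimator

for name, est in [("closed-form", closed_form), ("order-stat", order_stat)]:
    bias = est.mean() - true_q
    mse = np.mean((est - true_q) ** 2)
    print(f"{name}: bias={bias:+.4f}  MSE={mse:.4f}")
```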

Ordinal cumulative probability models (CPMs) -- also known as cumulative link models -- such as the proportional odds regression model are typically used for discrete ordered outcomes, but can accommodate both continuous and mixed discrete/continuous outcomes, since these are also ordered. Recent papers describe ordinal CPMs in this setting using non-parametric maximum likelihood estimation. We formulate a Bayesian CPM for continuous or mixed outcome data. Bayesian CPMs inherit many of the benefits of frequentist CPMs and have advantages with regard to interpretation, flexibility, and exact inference (within simulation error) for parameters and functions of parameters. We explore characteristics of the Bayesian CPM through simulations and a case study using HIV biomarker data. In addition, we provide the package 'bayesCPM', which implements Bayesian CPM models using the R interface to the Stan probabilistic programming language. The Bayesian CPM for continuous outcomes can be implemented with only minor modifications to the prior specification and, despite some limitations, has generally good statistical performance with moderate or large sample sizes.
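
As background for the model class, here is a minimal frequentist cumulative probability model (proportional-odds logit) fit by maximum likelihood; a Bayesian CPM places priors on the same parameters and samples them, e.g. with Stan. This is a didactic sketch, not the 'bayesCPM' package.

```python
# Minimal proportional-odds CPM: P(Y <= j | x) = expit(alpha_j - beta * x),
# fit by maximum likelihood with ordered cutpoints.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(3)
n, beta_true, J = 1000, 1.0, 4
x = rng.normal(size=n)
latent = beta_true * x + rng.logistic(size=n)
cuts_true = np.array([-1.0, 0.0, 1.5])
y = np.searchsorted(cuts_true, latent)        # ordinal outcome in {0,1,2,3}

def negloglik(theta):
    beta = theta[0]
    # First cutpoint free, the rest as positive increments: keeps cuts ordered.
    cuts = np.concatenate([theta[1:2], theta[1] + np.cumsum(np.exp(theta[2:]))])
    cdf = expit(cuts[None, :] - beta * x[:, None])      # P(Y <= j | x)
    cdf = np.hstack([np.zeros((n, 1)), cdf, np.ones((n, 1))])
    probs = cdf[np.arange(n), y + 1] - cdf[np.arange(n), y]
    return -np.sum(np.log(np.clip(probs, 1e-12, None)))

fit = minimize(negloglik, np.zeros(1 + (J - 1)), method="BFGS")
cuts_hat = np.concatenate([fit.x[1:2], fit.x[1] + np.cumsum(np.exp(fit.x[2:]))])
print("beta_hat:", round(fit.x[0], 3), " cutpoints:", np.round(cuts_hat, 3))
```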

Power is an important aspect of experimental design, because it allows researchers to understand the chance of detecting causal effects if they exist. It is common to specify a desired level of power, and then compute the sample size necessary to obtain that level of power; thus, power calculations help determine how experiments are conducted in practice. Power and sample size calculations are readily available for completely randomized experiments; however, there can be many benefits to using other experimental designs. For example, in recent years it has been established that rerandomized designs, where subjects are randomized until a prespecified level of covariate balance is obtained, increase the precision of causal effect estimators. This work establishes the statistical power of rerandomized treatment-control experiments, thereby allowing for sample size calculations. Our theoretical results also clarify how power and sample size are affected by treatment effect heterogeneity, a quantity that is often ignored in power analyses. Via simulation, we confirm our theoretical results and find that rerandomization can lead to substantial sample size reductions; e.g., in many realistic scenarios, rerandomization can lead to a 25% or even 50% reduction in sample size for a fixed level of power, compared to complete randomization. Power and sample size calculators based on our results are in the R package rerandPower on CRAN.
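
A small simulation conveys the mechanism: redraw the treatment assignment until the Mahalanobis distance between group covariate means falls below a threshold, then compare the spread of the difference-in-means estimator with complete randomization. The threshold and data-generating process are illustrative choices, not the paper's calculators.

```python
# Rerandomization vs. complete randomization: accept an assignment only when
# the Mahalanobis balance statistic is below a threshold, then compare the
# standard deviation of the difference-in-means estimator.
import numpy as np

rng = np.random.default_rng(4)
n, d, reps, threshold = 100, 4, 2000, 1.0

def diff_in_means(y, assign):
    return y[assign].mean() - y[~assign].mean()

def mahalanobis_balance(X, assign, cov_inv):
    delta = X[assign].mean(axis=0) - X[~assign].mean(axis=0)
    return delta @ cov_inv @ delta

X = rng.normal(size=(n, d))
y0 = X.sum(axis=1) + rng.normal(size=n)       # control potential outcomes
y1 = y0 + 1.0                                  # constant treatment effect
# Covariance of the difference in covariate means under equal group sizes.
cov_inv = np.linalg.inv(np.cov(X, rowvar=False) * (4 / n))

est_cr, est_rr = [], []
for _ in range(reps):
    assign = rng.permutation(n) < n // 2
    est_cr.append(diff_in_means(np.where(assign, y1, y0), assign))
    while mahalanobis_balance(X, assign, cov_inv) > threshold:
        assign = rng.permutation(n) < n // 2   # reject until balanced
    est_rr.append(diff_in_means(np.where(assign, y1, y0), assign))

print("SD complete randomization:", round(np.std(est_cr), 3))
print("SD rerandomization:       ", round(np.std(est_rr), 3))
```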

An important problem in causal inference is to break down the total effect of a treatment on an outcome into different causal pathways and to quantify the causal effect in each pathway. For instance, in causal fairness, the total effect of being a male employee (i.e., treatment) constitutes its direct effect on annual income (i.e., outcome) and the indirect effect via the employee's occupation (i.e., mediator). Causal mediation analysis (CMA) is a formal statistical framework commonly used to reveal such underlying causal mechanisms. One major challenge of CMA in observational studies is handling confounders, variables that cause spurious causal relationships among treatment, mediator, and outcome. Conventional methods assume sequential ignorability, which implies all confounders can be measured and is often unverifiable in practice. This work aims to circumvent the stringent sequential ignorability assumption and consider hidden confounders. Drawing upon proxy strategies and recent advances in deep learning, we propose to simultaneously uncover the latent variables that characterize hidden confounders and estimate the causal effects. Empirical evaluations using both synthetic and semi-synthetic datasets validate the effectiveness of the proposed method. We further show the potential of our approach for causal fairness analysis.
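
For readers new to CMA, the decomposition being targeted can be shown on a fully observed linear example, where the indirect effect is the product of the treatment-to-mediator and mediator-to-outcome coefficients. The paper's contribution, recovering hidden confounders with deep latent-variable proxies, goes well beyond this sketch.

```python
# Linear mediation decomposition on synthetic data with no hidden confounding:
# total effect = direct effect + indirect effect through the mediator.
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
a = rng.binomial(1, 0.5, n)                 # treatment
m = 0.8 * a + rng.normal(size=n)            # mediator
y = 1.2 * a + 0.5 * m + rng.normal(size=n)  # outcome

alpha = np.polyfit(a, m, 1)[0]                        # a -> m coefficient
X = np.column_stack([np.ones(n), a, m])
beta = np.linalg.lstsq(X, y, rcond=None)[0]           # y ~ 1 + a + m
direct, indirect = beta[1], alpha * beta[2]           # product of coefficients
print("direct:", round(direct, 3), " indirect:", round(indirect, 3),
      " total:", round(direct + indirect, 3))         # ~1.2, ~0.4, ~1.6
```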

The global financial crisis of 2007-2009 highlighted the crucial role systemic risk plays in ensuring the stability of financial markets. Accurate assessment of systemic risk would enable regulators to introduce suitable policies to mitigate the risk, as well as allow individual institutions to monitor their vulnerability to market movements. One popular measure of systemic risk is the conditional value-at-risk (CoVaR), proposed in Adrian and Brunnermeier (2011). We develop a methodology to estimate CoVaR semi-parametrically within the framework of multivariate extreme value theory. According to its definition, CoVaR can be viewed as a high quantile of the conditional distribution of one institution's (or the financial system's) potential loss, where the conditioning event corresponds to having large losses in the financial system (or the given financial institution). We relate this conditional distribution to the tail dependence function between the system and the institution, and then use parametric modelling of the tail dependence function to address data sparsity in the joint tail regions. We prove consistency of the proposed estimator, and illustrate its performance via simulation studies and a real data example.
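
The quantity being estimated can be illustrated with a purely empirical baseline: CoVaR as the q-quantile of system losses conditional on the institution's loss exceeding its own q-level VaR. The paper replaces this empirical conditional tail with a parametric tail dependence model to handle sparse joint-tail data; the simulated losses below are illustrative.

```python
# Empirical (non-parametric) CoVaR baseline on simulated heavy-tailed losses:
# the q-quantile of system losses given the institution exceeds its q-VaR.
import numpy as np

rng = np.random.default_rng(6)
n, q = 50_000, 0.95
# Correlated heavy-tailed losses for (institution, system) via a common factor.
z = rng.standard_t(df=4, size=n)
inst = z + 0.5 * rng.standard_t(df=4, size=n)
system = z + 0.5 * rng.standard_t(df=4, size=n)

var_inst = np.quantile(inst, q)            # institution's VaR at level q
stressed = system[inst > var_inst]         # condition on institution distress
covar = np.quantile(stressed, q)           # CoVaR: conditional q-quantile
print("VaR(system):", round(np.quantile(system, q), 3))
print("CoVaR:      ", round(covar, 3))
```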

Transformers are state-of-the-art in a wide range of NLP tasks and have also been applied to many real-world products. Understanding the reliability and certainty of transformer model predictions is crucial for building trustworthy machine learning applications, e.g., medical diagnosis. Although many recent transformer extensions have been proposed, uncertainty estimation for transformer models remains under-explored. In this work, we propose a novel way to enable transformers to have the capability of uncertainty estimation while retaining the original predictive performance. This is achieved by learning a hierarchical stochastic self-attention that attends to values and to a set of learnable centroids, respectively. New attention heads are then formed as a mixture of sampled centroids using the Gumbel-Softmax trick. We theoretically show that the self-attention approximation by sampling from a Gumbel distribution is upper bounded. We empirically evaluate our model on two text classification tasks with both in-domain (ID) and out-of-domain (OOD) datasets. The experimental results demonstrate that our approach: (1) achieves the best predictive performance and uncertainty trade-off among compared methods; (2) exhibits very competitive (in most cases, improved) predictive performance on ID datasets; (3) is on par with Monte Carlo dropout and ensemble methods in uncertainty estimation on OOD datasets.
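
A simplified sketch of the core mechanism is below: attention logits over a small set of learnable centroids are sampled with PyTorch's Gumbel-Softmax, so repeated stochastic forward passes yield a predictive spread usable as an uncertainty signal. This is a single toy head, not the paper's full hierarchical architecture.

```python
# Toy stochastic attention over learnable centroids via the Gumbel-Softmax
# trick; repeated forward passes give a spread usable as uncertainty.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StochasticCentroidAttention(nn.Module):
    def __init__(self, d_model: int, n_centroids: int = 8, tau: float = 0.5):
        super().__init__()
        self.centroids = nn.Parameter(torch.randn(n_centroids, d_model))
        self.query = nn.Linear(d_model, d_model)
        self.tau = tau

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); logits over centroids for every position.
        logits = self.query(x) @ self.centroids.t() / x.shape[-1] ** 0.5
        # Sampled attention weights instead of a deterministic softmax.
        attn = F.gumbel_softmax(logits, tau=self.tau, hard=False, dim=-1)
        return attn @ self.centroids          # mixture of centroids per token

layer = StochasticCentroidAttention(d_model=16)
x = torch.randn(2, 5, 16)
# Multiple stochastic passes; the per-element std is an uncertainty signal.
samples = torch.stack([layer(x) for _ in range(10)])
print("predictive spread:", samples.std(dim=0).mean().item())
```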

We investigate how the final parameters found by stochastic gradient descent are influenced by over-parameterization. We generate families of models by increasing the number of channels in a base network, and then perform a large hyper-parameter search to study how the test error depends on learning rate, batch size, and network width. We find that the optimal SGD hyper-parameters are determined by a "normalized noise scale," which is a function of the batch size, learning rate, and initialization conditions. In the absence of batch normalization, the optimal normalized noise scale is directly proportional to width. Wider networks, with their higher optimal noise scale, also achieve higher test accuracy. These observations hold for MLPs, ConvNets, and ResNets, and for two different parameterization schemes ("Standard" and "NTK"). We observe a similar trend with batch normalization for ResNets. Surprisingly, since the largest stable learning rate is bounded, the largest batch size consistent with the optimal normalized noise scale decreases as the width increases.
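
As a back-of-envelope illustration, one widely used approximation of the SGD noise scale is g ≈ lr · N / B (learning rate, dataset size, batch size), so holding g fixed ties the optimal learning rate and batch size together linearly; the "normalized" variant referenced above additionally folds in width and parameterization factors, which this toy calculation omits.

```python
# Back-of-envelope SGD noise-scale bookkeeping: g ~ lr * N / B. Doubling the
# batch size while doubling the learning rate leaves g unchanged.
def noise_scale(lr: float, dataset_size: int, batch_size: int) -> float:
    return lr * dataset_size / batch_size

N = 50_000
for lr, B in [(0.1, 128), (0.2, 256), (0.4, 512)]:
    print(f"lr={lr:<4} B={B:<4} g={noise_scale(lr, N, B):.1f}")
```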

Both generative adversarial network (GAN) models and variational autoencoders (VAEs) have been widely used to approximate probability distributions of datasets. Although both use parametrized distributions to approximate the underlying data distribution, whose exact inference is intractable, their behaviors are very different. In this report, we summarize our experimental results comparing these two categories of models in terms of fidelity and mode collapse. We provide a hypothesis to explain their different behaviors and propose a new model based on this hypothesis. We further test the proposed model on the MNIST and CelebA datasets.
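
One simple way to quantify mode collapse on a distribution with known modes is sketched below: draw samples from a (stand-in) generator, assign each to the nearest true mode, and count the modes receiving a non-trivial share. A real evaluation would plug trained GAN/VAE samplers into the hypothetical `generator` below.

```python
# Mode-coverage check on a 5x5 Gaussian grid: assign generated samples to the
# nearest true mode and count modes with a non-trivial sample share.
import numpy as np

rng = np.random.default_rng(7)
modes = np.array([[i, j] for i in range(5) for j in range(5)], dtype=float) * 4

def generator(n):
    # Hypothetical collapsed generator: only ever covers 6 of the 25 modes.
    picks = modes[rng.integers(0, 6, size=n)]
    return picks + 0.1 * rng.normal(size=(n, 2))

samples = generator(10_000)
nearest = np.argmin(((samples[:, None, :] - modes[None]) ** 2).sum(-1), axis=1)
counts = np.bincount(nearest, minlength=len(modes))
covered = (counts > 0.01 * len(samples) / len(modes)).sum()
print(f"modes covered: {covered}/{len(modes)}")
```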
