
Estimating individualized treatment effects (ITEs) from observational data is crucial for decision-making. In order to obtain unbiased ITE estimates, a common assumption is that all confounders are observed. However, in practice, it is unlikely that we observe these confounders directly. Instead, we often observe noisy measurements of true confounders, which can serve as valid proxies. In this paper, we address the problem of estimating ITEs in the longitudinal setting where we observe noisy proxies instead of true confounders. To this end, we develop the Deconfounding Temporal Autoencoder (DTA), a novel method that leverages observed noisy proxies to learn a hidden embedding that reflects the true hidden confounders. In particular, the DTA combines a long short-term memory autoencoder with a causal regularization penalty that renders the potential outcomes and treatment assignment conditionally independent given the learned hidden embedding. Once the hidden embedding is learned via the DTA, state-of-the-art outcome models can be used to control for it and obtain unbiased estimates of ITEs. Using synthetic and real-world medical data, we demonstrate the effectiveness of our DTA: it improves over state-of-the-art benchmarks by a substantial margin.
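
The abstract describes the architecture only at a high level; the minimal PyTorch sketch below shows one way such a model could be wired together. The class name, the auxiliary heads, and the loss weights `alpha`/`beta` are illustrative assumptions, not the authors' implementation of the causal regularization penalty.

```python
import torch
import torch.nn as nn

class CausalLSTMAutoencoder(nn.Module):
    """Sketch of a DTA-style model: LSTM autoencoder plus auxiliary heads."""
    def __init__(self, proxy_dim, embed_dim, treat_dim, outcome_dim):
        super().__init__()
        self.encoder = nn.LSTM(proxy_dim, embed_dim, batch_first=True)
        self.decoder = nn.LSTM(embed_dim, proxy_dim, batch_first=True)
        # auxiliary heads used to form the causal regularization term
        self.treat_head = nn.Linear(embed_dim, treat_dim)
        self.outcome_head = nn.Linear(embed_dim + treat_dim, outcome_dim)

    def forward(self, proxies, treatments):
        z, _ = self.encoder(proxies)            # hidden embedding over time
        recon, _ = self.decoder(z)              # reconstruct the noisy proxies
        treat_logits = self.treat_head(z)       # predict treatment from z
        outcome_pred = self.outcome_head(torch.cat([z, treatments], dim=-1))
        return recon, treat_logits, outcome_pred

def dta_loss(model_out, proxies, treatments, outcomes, alpha=1.0, beta=1.0):
    recon, treat_logits, outcome_pred = model_out
    # Reconstruction keeps z faithful to the proxies; the treatment and outcome
    # terms stand in for the causal regularization that ties z to both
    # treatment assignment and outcomes (illustrative weights, not the paper's).
    recon_loss = nn.functional.mse_loss(recon, proxies)
    treat_loss = nn.functional.binary_cross_entropy_with_logits(treat_logits, treatments)
    outcome_loss = nn.functional.mse_loss(outcome_pred, outcomes)
    return recon_loss + alpha * treat_loss + beta * outcome_loss
```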

Related Content

Our goal is to recover time-delayed latent causal variables and identify their relations from measured temporal data. Estimating causally-related latent variables from observations is particularly challenging, as the latent variables are not uniquely recoverable in the most general case. In this work, we consider both a nonparametric, nonstationary setting and a parametric setting for the latent processes and propose two provable conditions under which temporally causal latent processes can be identified from their nonlinear mixtures. We propose LEAP, a theoretically grounded framework that extends Variational AutoEncoders (VAEs) by enforcing our conditions through proper constraints in the causal process prior. Experimental results on various datasets demonstrate that temporally causal latent processes are reliably identified from observed variables under different dependency structures and that our approach considerably outperforms baselines that do not properly leverage history or nonstationarity information. This demonstrates that using temporal information to learn latent processes from their invertible nonlinear mixtures in an unsupervised manner, for which we believe our work is one of the first, is promising even without sparsity or minimality assumptions.

Understanding the latent causal factors of a dynamical system from visual observations is a crucial step towards agents reasoning in complex environments. In this paper, we propose CITRIS, a variational autoencoder framework that learns causal representations from temporal sequences of images in which underlying causal factors have possibly been intervened upon. In contrast to the recent literature, CITRIS exploits temporality and observed intervention targets to identify scalar and multidimensional causal factors, such as 3D rotation angles. Furthermore, by introducing a normalizing flow, CITRIS can easily be extended to leverage and disentangle representations obtained by already pretrained autoencoders. Extending previous results on scalar causal factors, we prove identifiability in a more general setting, in which only some components of a causal factor are affected by interventions. In experiments on 3D rendered image sequences, CITRIS outperforms previous methods in recovering the underlying causal variables. Moreover, using pretrained autoencoders, CITRIS can even generalize to unseen instantiations of causal factors, opening up future research directions in sim-to-real generalization for causal representation learning.

Missing data is frequently encountered in practice. Propensity score estimation is a popular tool for handling such missingness. The propensity score is often developed using a model for the response probability, which can be subject to model misspecification. In this paper, we consider an alternative approach that estimates the inverse of the propensity score using the density ratio function. The smoothed density ratio function is obtained as the solution to the information projection onto the space satisfying the moment conditions on the balancing scores. By including only the covariates of the outcome regression model in the density ratio model, we can achieve efficient propensity score estimation. Penalized regression is used to identify important covariates. We further extend the proposed approach to the multivariate missing-data case. Limited simulation studies are presented to compare the proposed method with existing methods.
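
The information-projection step is easiest to see as a convex dual problem. The sketch below implements an entropy-balancing-style calibration in which the inverse propensity weights take an exponential-tilting form and are chosen so that the weighted respondent moments of the balancing scores match the full-sample moments. This is one generic instance of the moment-matching idea, offered only as an illustration and not as the authors' estimator; the function name is hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def calibration_weights(X, r):
    """X: (n, p) balancing scores (include an intercept column); r: (n,) response indicator."""
    Xr = X[r == 1]
    target = X.mean(axis=0)                  # full-sample moments to be matched

    def dual(lam):
        # convex dual of the information (KL) projection under moment constraints
        return np.exp(Xr @ lam).mean() - lam @ target

    lam_hat = minimize(dual, np.zeros(X.shape[1]), method="BFGS").x
    w = np.exp(Xr @ lam_hat)                 # exponential-tilting weights for respondents
    return w / w.mean()

# toy usage
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(500), rng.normal(size=500)])
r = rng.binomial(1, 1.0 / (1.0 + np.exp(-X[:, 1])))
w = calibration_weights(X, r)
```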

Many experiments record sequential trajectories in which each trajectory consists of oscillations and fluctuations around zero. Such trajectories can be viewed as zero-mean functional data. When there are structural breaks in higher-order moments across the sequence of trajectories, it is not always easy to spot them by mere visual inspection. Motivated by this challenging problem in brain-signal analysis, we propose a detection and testing procedure to find the change point in the functional covariance. The detection procedure is based on the cumulative sum (CUSUM) statistic. The classical testing procedure for functional data relies on a null distribution that depends on infinitely many unknown parameters, of which only a finite number can be included in practice when testing for the existence of a change point. This paper provides theoretical insights into the influence of the number of included parameters. The asymptotic properties of the estimated change point are also developed. The effectiveness of the proposed method is validated numerically in simulation studies and in an application investigating changes in rat brain signals following an experimentally induced stroke.
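
To make the CUSUM idea concrete, here is a simplified NumPy sketch for locating a covariance change across a sequence of zero-mean trajectories. The unweighted CUSUM contrast and the Frobenius norm used here are generic choices, not the paper's exact statistic or its null calibration.

```python
import numpy as np

def covariance_cusum(X):
    """X: (n, T) array of n zero-mean trajectories observed on a common grid."""
    n, T = X.shape
    # per-trajectory covariance "observations" C_i = x_i x_i^T, flattened
    C = np.einsum("it,is->its", X, X).reshape(n, T * T)
    csum = np.cumsum(C, axis=0)
    total = csum[-1]
    # CUSUM contrast between the first k trajectories and the full sample
    stats = np.array([
        np.linalg.norm(csum[k - 1] - (k / n) * total) / np.sqrt(n)
        for k in range(1, n)
    ])
    khat = int(np.argmax(stats)) + 1      # estimated change point (length of first segment)
    return khat, stats
```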

We consider continuous-time survival or more general event-history settings, where the aim is to infer the causal effect of a time-dependent treatment process. This is formalised as the effect on the outcome event of a (possibly hypothetical) intervention on the intensity of the treatment process, i.e. a stochastic intervention. To establish whether valid inference about the interventional situation can be drawn from typical observational, i.e. non-experimental, data we propose graphical rules indicating whether the observed information is sufficient to identify the desired causal effect by suitable re-weighting. In analogy to the well-known causal directed acyclic graphs, the corresponding dynamic graphs combine causal semantics with local independence models for multivariate counting processes. Importantly, we highlight that causal inference from censored data requires structural assumptions on the censoring process beyond the usual independent censoring assumption, which can be represented and verified graphically. Our results establish general non-parametric identifiability and do not rely on particular survival models. We illustrate our proposal with a data example on HPV-testing for cervical cancer screening, where the desired effect is estimated by re-weighted cumulative incidence curves.

Missing data is a systemic problem in practical scenarios that causes noise and bias when estimating treatment effects. This makes treatment effect estimation from data with missingness a particularly tricky endeavour. A key reason is that standard assumptions on missingness are rendered insufficient by the presence of an additional variable, the treatment, besides the individual and the outcome. Having a treatment variable introduces additional complexity, with respect to why some variables are missing, that has not been fully explored in previous work. In our work we identify a new missingness mechanism, which we term mixed confounded missingness (MCM), in which some missingness determines treatment selection and other missingness is determined by treatment selection. Given MCM, we show that naively imputing all data leads to poorly performing treatment effect models, as the act of imputation effectively removes information necessary to provide unbiased estimates. However, not imputing at all also leads to biased estimates, as missingness determined by treatment divides the population into distinct subpopulations, across which estimates will be biased. Our solution is selective imputation, where we use insights from MCM to determine precisely which variables should be imputed and which should not. We empirically demonstrate how various learners benefit from selective imputation compared to other solutions for missing data.
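
As a concrete illustration of the selective-imputation recipe, the sketch below imputes only the variables whose missingness drives treatment selection and, for variables whose missingness is caused by treatment, keeps explicit missingness indicators instead. The split into the two column groups is domain knowledge supplied by the analyst; the function name and the mean-imputation choice are assumptions for illustration.

```python
import pandas as pd
from sklearn.impute import SimpleImputer

def selectively_impute(df, impute_cols, keep_missing_cols):
    """impute_cols: variables whose missingness (partly) drives treatment selection.
    keep_missing_cols: variables whose missingness is caused by the treatment."""
    out = df.copy()
    # impute the first group as usual
    out[impute_cols] = SimpleImputer(strategy="mean").fit_transform(out[impute_cols])
    # do not impute the second group: expose the missingness pattern explicitly
    for col in keep_missing_cols:
        out[col + "_missing"] = out[col].isna().astype(int)
        out[col] = out[col].fillna(0.0)   # neutral placeholder, flagged by the indicator
    return out
```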

In this paper, we develop a general framework based on the Transformer architecture to address a variety of challenging treatment effect estimation (TEE) problems. Our methods are applicable both when covariates are tabular and when they consist of sequences (e.g., in text), and can handle discrete, continuous, structured, or dosage-associated treatments. While Transformers have already emerged as dominant methods for diverse domains, including natural language and computer vision, our experiments with Transformers as Treatment Effect Estimators (TransTEE) demonstrate that these inductive biases are also effective on the sorts of estimation problems and datasets that arise in research aimed at estimating causal effects. Moreover, we propose a propensity score network that is trained with TransTEE in an adversarial manner to promote independence between covariates and treatments to further address selection bias. Through extensive experiments, we show that TransTEE significantly outperforms competitive baselines with greater parameter efficiency over a wide range of benchmarks and settings.
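
The adversarial component described above can be compressed into a short PyTorch sketch: a propensity network learns to predict the treatment from the covariate representation, while the estimator is additionally trained to make that prediction hard, pushing the representation towards independence of the treatment. Module sizes, pooling, and the loss weight `lam` are illustrative assumptions, not the TransTEE code.

```python
import torch
import torch.nn as nn

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True),
    num_layers=2,
)
outcome_head = nn.Linear(32 + 1, 1)      # pooled representation + treatment -> outcome
propensity_net = nn.Linear(32, 1)        # pooled representation -> treatment logit

opt_est = torch.optim.Adam(
    list(encoder.parameters()) + list(outcome_head.parameters()), lr=1e-3)
opt_prop = torch.optim.Adam(propensity_net.parameters(), lr=1e-3)
bce, mse = nn.BCEWithLogitsLoss(), nn.MSELoss()

def training_step(x_tokens, t, y, lam=0.1):
    """x_tokens: (batch, n_covariates, 32) covariate tokens; t, y: (batch, 1) floats."""
    z = encoder(x_tokens).mean(dim=1)                  # pooled covariate representation
    # 1) propensity network learns to predict treatment from the representation
    prop_loss = bce(propensity_net(z.detach()), t)
    opt_prop.zero_grad()
    prop_loss.backward()
    opt_prop.step()
    # 2) estimator fits outcomes while making treatment hard to predict from z
    y_hat = outcome_head(torch.cat([z, t], dim=-1))
    est_loss = mse(y_hat, y) - lam * bce(propensity_net(z), t)
    opt_est.zero_grad()
    est_loss.backward()
    opt_est.step()
    return est_loss.item(), prop_loss.item()
```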

We consider the problem of manifold estimation from noisy observations. Many manifold learning procedures locally approximate a manifold by a weighted average over a small neighborhood. However, in the presence of large noise, the assigned weights become so corrupted that the averaged estimate performs very poorly. We suggest a structure-adaptive procedure that simultaneously reconstructs a smooth manifold and estimates projections of the point cloud onto this manifold. The proposed approach iteratively refines the weights at each step, using the structural information obtained at previous steps. After several iterations, we obtain nearly "oracle" weights, so that the final estimates are nearly efficient even in the presence of relatively large noise. In our theoretical study, we establish tight lower and upper bounds proving asymptotic optimality of the method for manifold estimation under the Hausdorff loss, provided that the noise decays to zero fast enough.
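
The iterative reweighting idea can be illustrated with a toy sketch: start from the raw noisy points, then on each pass recompute the neighbourhood weights from the current (smoother) estimates rather than from the raw data. The Gaussian kernel and fixed bandwidth below are generic stand-ins, not the paper's structure-adaptive weights.

```python
import numpy as np

def structure_adaptive_denoise(Y, bandwidth=0.5, n_iter=5):
    """Y: (n, d) noisy point cloud sampled near a smooth manifold."""
    X_hat = Y.copy()
    for _ in range(n_iter):
        # neighbourhood weights computed from the *current* estimates, so they
        # are progressively less corrupted by the observation noise
        d2 = ((X_hat[:, None, :] - X_hat[None, :, :]) ** 2).sum(-1)
        W = np.exp(-d2 / (2.0 * bandwidth ** 2))
        W /= W.sum(axis=1, keepdims=True)
        # average the original observations with the refined weights
        X_hat = W @ Y
    return X_hat
```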

The optimal receiver operating characteristic (ROC) curve, giving the maximum probability of detection as a function of the probability of false alarm, is a key information-theoretic indicator of the difficulty of a binary hypothesis testing problem (BHT). It is well known that the optimal ROC curve for a given BHT, corresponding to the likelihood ratio test, is theoretically determined by the probability distribution of the observed data under each of the two hypotheses. In some cases, these two distributions may be unknown or computationally intractable, but independent samples of the likelihood ratio can be observed. This raises the problem of estimating the optimal ROC for a BHT from such samples. The maximum likelihood estimator of the optimal ROC curve is derived, and it is shown to converge to the true optimal ROC curve in the Lévy metric, as the number of observations tends to infinity. A classical empirical estimator, based on estimating the two types of error probabilities from two separate sets of samples, is also considered. The maximum likelihood estimator is observed in simulation experiments to be considerably more accurate than the empirical estimator, especially when the number of samples obtained under one of the two hypotheses is small. The area under the maximum likelihood estimator is derived; it is a consistent estimator of the true area under the optimal ROC curve.
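
For concreteness, here is a sketch of the classical empirical estimator mentioned above (the maximum likelihood estimator's form is not given in the abstract and is not reproduced here): given likelihood-ratio samples under each hypothesis, sweep a threshold and record the two empirical error probabilities.

```python
import numpy as np

def empirical_roc(lr_h0, lr_h1):
    """lr_h0, lr_h1: likelihood-ratio samples observed under H0 and H1."""
    thresholds = np.sort(np.concatenate([lr_h0, lr_h1]))
    p_fa = np.array([(lr_h0 >= t).mean() for t in thresholds])   # false-alarm probability
    p_d = np.array([(lr_h1 >= t).mean() for t in thresholds])    # detection probability
    return p_fa, p_d

# toy usage: unit-variance Gaussian samples, unit mean shift under H1
rng = np.random.default_rng(1)
x0, x1 = rng.normal(0.0, 1.0, 200), rng.normal(1.0, 1.0, 200)
lr = lambda x: np.exp(x - 0.5)     # likelihood ratio for this pair of hypotheses
p_fa, p_d = empirical_roc(lr(x0), lr(x1))
```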

Implicit probabilistic models are defined naturally in terms of a sampling procedure and often induce a likelihood function that cannot be expressed explicitly. We develop a simple method for estimating parameters in implicit models that does not require knowledge of the form of the likelihood function or any derived quantities, yet can be shown to be equivalent to maximum likelihood estimation under some conditions. Our result holds in the non-asymptotic parametric setting, where both the capacity of the model and the number of data examples are finite. We also demonstrate encouraging experimental results.
