亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Causal decision making (CDM) based on machine learning has become a routine part of business. Businesses algorithmically target offers, incentives, and recommendations to affect consumer behavior. Recently, we have seen an acceleration of research related to CDM and causal effect estimation (CEE) using machine-learned models. This article highlights an important perspective: CDM is not the same as CEE, and counterintuitively, accurate CEE is not necessary for accurate CDM. Our experience is that this is not well understood by practitioners or most researchers. Technically, the estimand of interest is different, and this has important implications both for modeling and for the use of statistical models for CDM. We draw on prior research to highlight three implications. (1) We should consider carefully the objective function of the causal machine learning, and if possible, optimize for accurate treatment assignment rather than for accurate effect-size estimation. (2) Confounding does not have the same effect on CDM as it does on CEE. The upshot is that for supporting CDM it may be just as good or even better to learn with confounded data as with unconfounded data. Finally, (3) causal statistical modeling may not be necessary to support CDM because a proxy target for statistical modeling might do as well or better. This third observation helps to explain at least one broad common CDM practice that seems wrong at first blush: the widespread use of non-causal models for targeting interventions. The last two implications are particularly important in practice, as acquiring (unconfounded) data on all counterfactuals can be costly and often impracticable. These observations open substantial research ground. We hope to facilitate research in this area by pointing to related articles from multiple contributing fields, including two dozen articles published the last three to four years.

相關內容

Clinical studies often encounter with truncation-by-death problems, which may render the outcomes undefined. Statistical analysis based only on observed survivors may lead to biased results because the characters of survivors may differ greatly between treatment groups. Under the principal stratification framework, a meaningful causal parameter, the survivor average causal effect, in the always-survivor group can be defined. This causal parameter may not be identifiable in observational studies where the treatment assignment and the survival or outcome process are confounded by unmeasured features. In this paper, we propose a new method to deal with unmeasured confounding when the outcome is truncated by death. First, a new method is proposed to identify the heterogeneous conditional survivor average causal effect based on a substitutional variable under monotonicity. Second, under additional assumptions, the survivor average causal effect on the overall population is also identified. Furthermore, we consider estimation and inference for the conditional survivor average causal effect based on parametric and nonparametric methods with good asymptotic properties. Good finite-sample properties are demonstrated by simulation and sensitivity analysis. The proposed method is applied to investigate the effect of allogeneic stem cell transplantation types on leukemia relapse.

Proximal causal inference was recently proposed as a framework to identify causal effects from observational data in the presence of hidden confounders for which proxies are available. In this paper, we extend the proximal causal approach to settings where identification of causal effects hinges upon a set of mediators which unfortunately are not directly observed, however proxies of the hidden mediators are measured. Specifically, we establish (i) a new hidden front-door criterion which extends the classical front-door result to allow for hidden mediators for which proxies are available; (ii) We extend causal mediation analysis to identify direct and indirect causal effects under unconfoundedness conditions in a setting where the mediator in view is hidden, but error prone proxies of the latter are available. We view (i) and (ii) as important steps towards the practical application of front-door criteria and mediation analysis as mediators are almost always error prone and thus, the most one can hope for in practice is that our measurements are at best proxies of mediating mechanisms. Finally, we show that identification of certain causal effects remains possible even in settings where challenges in (i) and (ii) might co-exist.

When teaching and discussing statistical assumptions, our focus is oftentimes placed on how to test and address potential violations rather than the effects of violating assumptions on the estimates produced by our statistical models. The latter represents a potential avenue to help us better understand the impact of researcher degrees of freedom on the statistical estimates we produce. The Violating Assumptions Series is an endeavor I have undertaken to demonstrate the effects of violating assumptions on the estimates produced across various statistical models. The series will review assumptions associated with estimating causal associations, as well as more complicated statistical models including, but not limited to, multilevel models, path models, structural equation models, and Bayesian models. In addition to the primary goal, the series of posts is designed to illustrate how simulations can be used to develop a comprehensive understanding of applied statistics.

In clinical research, the effect of a treatment or intervention is widely assessed through clinical importance, instead of statistical significance. In this paper, we propose a principled statistical inference framework to learning the minimal clinically important difference (MCID), a vital concept in assessing clinical importance. We formulate the scientific question into a novel statistical learning problem, develop an efficient algorithm for parameter estimation, and establish the asymptotic theory for the proposed estimator. We conduct comprehensive simulation studies to examine the finite sample performance of the proposed method. We also re-analyze the ChAMP (Chondral Lesions And Meniscus Procedures) trial, where the primary outcome is the patient-reported pain score and the ultimate goal is to determine whether there exists a significant difference in post-operative knee pain between patients undergoing debridement versus observation of chondral lesions during the surgery. Some previous analysis of this trial exhibited that the effect of debriding the chondral lesions does not reach a statistical significance. Our analysis reinforces this conclusion that the effect of debriding the chondral lesions is not only statistically non-significant, but also clinically un-important.

Recently there has been increased interest in using machine learning techniques to improve classical algorithms. In this paper we study when it is possible to construct compact, composable sketches for weighted sampling and statistics estimation according to functions of data frequencies. Such structures are now central components of large-scale data analytics and machine learning pipelines. However, many common functions, such as thresholds and p-th frequency moments with p > 2, are known to require polynomial-size sketches in the worst case. We explore performance beyond the worst case under two different types of assumptions. The first is having access to noisy advice on item frequencies. This continues the line of work of Hsu et al. (ICLR 2019), who assume predictions are provided by a machine learning model. The second is providing guaranteed performance on a restricted class of input frequency distributions that are better aligned with what is observed in practice. This extends the work on heavy hitters under Zipfian distributions in a seminal paper of Charikar et al. (ICALP 2002). Surprisingly, we show analytically and empirically that "in practice" small polylogarithmic-size sketches provide accuracy for "hard" functions.

A fundamental goal of scientific research is to learn about causal relationships. However, despite its critical role in the life and social sciences, causality has not had the same importance in Natural Language Processing (NLP), which has traditionally placed more emphasis on predictive tasks. This distinction is beginning to fade, with an emerging area of interdisciplinary research at the convergence of causal inference and language processing. Still, research on causality in NLP remains scattered across domains without unified definitions, benchmark datasets and clear articulations of the remaining challenges. In this survey, we consolidate research across academic areas and situate it in the broader NLP landscape. We introduce the statistical challenge of estimating causal effects, encompassing settings where text is used as an outcome, treatment, or as a means to address confounding. In addition, we explore potential uses of causal inference to improve the performance, robustness, fairness, and interpretability of NLP models. We thus provide a unified overview of causal inference for the computational linguistics community.

Click-through rate (CTR) estimation plays as a core function module in various personalized online services, including online advertising, recommender systems, and web search etc. From 2015, the success of deep learning started to benefit CTR estimation performance and now deep CTR models have been widely applied in many industrial platforms. In this survey, we provide a comprehensive review of deep learning models for CTR estimation tasks. First, we take a review of the transfer from shallow to deep CTR models and explain why going deep is a necessary trend of development. Second, we concentrate on explicit feature interaction learning modules of deep CTR models. Then, as an important perspective on large platforms with abundant user histories, deep behavior models are discussed. Moreover, the recently emerged automated methods for deep CTR architecture design are presented. Finally, we summarize the survey and discuss the future prospects of this field.

This paper focuses on the expected difference in borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects and hence the estimation error can be magnificent. As such, we propose another approach to construct the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction of estimation error is strikingly substantial if the causal effects are accounted for correctly.

Causal inference is a critical research topic across many domains, such as statistics, computer science, education, public policy and economics, for decades. Nowadays, estimating causal effect from observational data has become an appealing research direction owing to the large amount of available data and low budget requirement, compared with randomized controlled trials. Embraced with the rapidly developed machine learning area, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well known causal inference framework. The methods are divided into two categories depending on whether they require all three assumptions of the potential outcome framework or not. For each category, both the traditional statistical methods and the recent machine learning enhanced methods are discussed and compared. The plausible applications of these methods are also presented, including the applications in advertising, recommendation, medicine and so on. Moreover, the commonly used benchmark datasets as well as the open-source codes are also summarized, which facilitate researchers and practitioners to explore, evaluate and apply the causal inference methods.

Many current applications use recommendations in order to modify the natural user behavior, such as to increase the number of sales or the time spent on a website. This results in a gap between the final recommendation objective and the classical setup where recommendation candidates are evaluated by their coherence with past user behavior, by predicting either the missing entries in the user-item matrix, or the most likely next event. To bridge this gap, we optimize a recommendation policy for the task of increasing the desired outcome versus the organic user behavior. We show this is equivalent to learning to predict recommendation outcomes under a fully random recommendation policy. To this end, we propose a new domain adaptation algorithm that learns from logged data containing outcomes from a biased recommendation policy and predicts recommendation outcomes according to random exposure. We compare our method against state-of-the-art factorization methods, in addition to new approaches of causal recommendation and show significant improvements.

北京阿比特科技有限公司