
Tianchen Qian,Ashley E. Walton,Linda M. Collins,Predrag Klasnja,Stephanie T. Lanza,Inbal Nahum-Shani,Mashifiqui Rabbi,Michael A. Russell,Maureen A. Walton,Hyesun Yoo,Susan A. Murphy

from arxiv, arXiv admin note: substantial text overlap with arXiv:2005.05880, arXiv:2004.10241

Just-in-time adaptive interventions (JITAIs) are time-varying adaptive interventions that use frequent opportunities for the intervention to be adapted--weekly, daily, or even many times a day. The micro-randomized trial (MRT) has emerged for use in informing the construction of JITAIs. MRTs can be used to address research questions about whether and under what circumstances JITAI components are effective, with the ultimate objective of developing effective and efficient JITAIs. The purpose of this article is to clarify why, when, and how to use MRTs; to highlight elements that must be considered when designing and implementing an MRT; and to review primary and secondary analysis methods for MRTs. We briefly review key elements of JITAIs and discuss a variety of considerations that go into planning and designing an MRT. We provide a definition of causal excursion effects suitable for use in primary and secondary analyses of MRT data to inform JITAI development. We review the weighted and centered least-squares (WCLS) estimator, which provides consistent causal excursion effect estimators from MRT data. We describe how the WCLS estimator, along with associated test statistics, can be obtained using standard statistical software such as R (R Core Team, 2019). Throughout, we illustrate the MRT design and analyses using the HeartSteps MRT for developing a JITAI to increase physical activity among sedentary individuals. We supplement the HeartSteps MRT with two other MRTs, SARA and BariFit, each of which highlights different research questions that can be addressed using the MRT and experimental design considerations that might arise.

Related content

Estimation / Estimator


Extensibility · Statistic · Reducible · INFORMS · Performer

January 28, 2022

Statistical anonymity: Quantifying reidentification risks without reidentifying users

Gecia Bravo-Hermsdorff,Robert Busa-Fekete,Lee M. Gunderson,Andrés Muñoz Medina,Umar Syed

Data anonymization is an approach to privacy-preserving data release aimed at preventing participant reidentification, and it is an important alternative to differential privacy in applications that cannot tolerate noisy data. Existing algorithms for enforcing $k$-anonymity in the released data assume that the curator performing the anonymization has complete access to the original data. Reasons for limiting this access range from undesirability to complete infeasibility. This paper explores ideas -- objectives, metrics, protocols, and extensions -- for reducing the trust that must be placed in the curator, while still maintaining a statistical notion of $k$-anonymity. We suggest trust (amount of information provided to the curator) and privacy (anonymity of the participants) as the primary objectives of such a framework. We describe a class of protocols aimed at achieving these goals, proposing new metrics of privacy in the process, and proving related bounds. We conclude by discussing a natural extension of this work that completely removes the need for a central curator.
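As a point of reference for the abstract above, the classical (deterministic) notion of $k$-anonymity can be checked in a few lines. This is a minimal sketch, not the paper's statistical variant or its protocols; the function name `is_k_anonymous`, the field names, and the toy records are all hypothetical.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    present in the release is shared by at least k records."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Hypothetical toy release: ages and zip codes are already generalized.
released = [
    {"age": "20-29", "zip": "537**", "diagnosis": "flu"},
    {"age": "20-29", "zip": "537**", "diagnosis": "asthma"},
    {"age": "30-39", "zip": "537**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "537**", "diagnosis": "diabetes"},
]

print(is_k_anonymous(released, ["age", "zip"], k=2))
print(is_k_anonymous(released, ["age", "zip"], k=3))
```

Note that this check needs the full released table in one place, which is exactly the curator access the abstract sets out to reduce.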

Scaling · Processing (programming language) · Biased · MASS · Design

January 28, 2022

Inclusion, equality and bias in designing online mass deliberative platforms

Ruth Shortall,Anatol Itten,Michiel van der Meer,Pradeep K. Murukannaiah,Catholijn M. Jonker

from arxiv, Updated paper following feedback

Designers of online deliberative platforms aim to counter the degrading quality of online debates and eliminate online discrimination based on class, race or gender. Support technologies such as machine learning and natural language processing open avenues for widening the circle of people involved in deliberation, moving from small groups to "crowd" scale. Some design features of large-scale online discussion systems allow larger numbers of people to discuss shared problems, enhance critical thinking, and formulate solutions. However, scaling up deliberation is challenging. We review the transdisciplinary literature on the design of digital mass-deliberation platforms and examine the commonly featured design aspects (e.g., argumentation support, automated facilitation, and gamification). We find that the literature is heavily focused on developing technical fixes for scaling up deliberation, with a heavy Western influence on design and with test users who skew young and highly educated. By contrast, there is a distinct lack of discussion on the nature of the design process, the inclusion of stakeholders and issues relating to inclusion, which may unwittingly perpetuate bias. Another tendency of deliberation platforms is to nudge participants toward desired forms of argumentation and to simplify definitions of good and bad arguments to fit algorithmic purposes. Few studies bridge disciplines between deliberative theory, design and engineering. As a result, scaling up deliberation will likely advance in separate systemic silos. We make design and process recommendations to correct this course and suggest avenues for future research.

Gaussian mixture (model) · Robustness · Outliers · Expectation-maximization algorithm · Data imputation

January 28, 2022

A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data

Florian Mouret,Alexandre Hippert-Ferrer,Frédéric Pascal,Jean-Yves Tourneret

This paper tackles the problem of missing data imputation for noisy and non-Gaussian data. A classical imputation method, the Expectation Maximization (EM) algorithm for Gaussian mixture models, has shown interesting properties when compared to other popular approaches such as those based on k-nearest neighbors or on multiple imputations by chained equations. However, Gaussian mixture models are known not to be robust to heterogeneous data, which can lead to poor estimation performance when the data is contaminated by outliers or comes from a non-Gaussian distribution. To overcome this issue, a new expectation maximization algorithm is investigated for mixtures of elliptical distributions with the nice property of handling potential missing data. The complete-data likelihood associated with mixtures of elliptical distributions is well adapted to the EM framework thanks to its conditional distribution, which is shown to be a Student distribution. Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data. Furthermore, experiments conducted on real-world datasets show that this algorithm is very competitive when compared to other classical imputation methods.

Estimation/Estimator · Machine Learning · MoDELS · binary · Learned

January 27, 2022

Estimating Heterogeneous Treatment Effects for General Responses

Zijun Gao,Trevor Hastie

Heterogeneous treatment effect models allow us to compare treatments at subgroup and individual levels, and are of increasing popularity in applications like personalized medicine, advertising, and education. In this talk, we first survey different causal estimands used in practice, which focus on estimating the difference in conditional means. We then propose DINA, the difference in natural parameters, to quantify heterogeneous treatment effect in exponential families and the Cox model. For binary outcomes and survival times, DINA is both convenient and more practical for modeling the influence of covariates on the treatment effect. Second, we introduce a meta-algorithm for DINA, which allows practitioners to use powerful off-the-shelf machine learning tools for the estimation of nuisance functions, and which is also statistically robust to errors in inaccurate nuisance function estimation. We demonstrate the efficacy of our method combined with various machine learning base-learners on simulated and real datasets.
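To make the contrast in the abstract above concrete: for a binary outcome, the natural parameter of the Bernoulli family is the log-odds, so a difference in natural parameters is a log odds ratio, whereas a difference in conditional means is a risk difference. The sketch below only illustrates that arithmetic; it is not the authors' estimator or meta-algorithm, and the function names and subgroup probabilities are hypothetical.

```python
import math

def logit(p):
    """Natural parameter of a Bernoulli outcome with success probability p."""
    return math.log(p / (1.0 - p))

def difference_in_means(p_treated, p_control):
    """Risk difference: the conditional-mean contrast."""
    return p_treated - p_control

def difference_in_natural_parameters(p_treated, p_control):
    """Log odds ratio: the natural-parameter contrast for binary outcomes."""
    return logit(p_treated) - logit(p_control)

# Hypothetical subgroups as (P(Y=1 | treated), P(Y=1 | control)).
subgroups = {
    "low baseline": (0.10, 0.05),
    "high baseline": (0.55, 0.50),
}

for name, (p1, p0) in subgroups.items():
    print(
        name,
        round(difference_in_means(p1, p0), 3),
        round(difference_in_natural_parameters(p1, p0), 3),
    )
```

Both hypothetical subgroups share the same risk difference, yet their log odds ratios differ, which shows why the choice of estimand changes how covariates appear to modify the treatment effect.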

Statistic · Performer · Inference · Confidence · Variance

January 27, 2022

Interpretation and inference for altmetric indicators arising from sparse data statistics

Lawrence Smolinsky,Bernhard Klingenberg,Brian D. Marx

from arxiv, To appear in the Journal of Informetrics

In 2018 Bornmann and Haunschild (2018a) introduced a new indicator called the Mantel-Haenszel quotient (MHq) to measure alternative metrics (or altmetrics) of scientometric data. In this article we review the Mantel-Haenszel statistics, point out two errors in the literature, and introduce a new indicator. First, we correct the interpretation of MHq and mention that it is still a meaningful indicator. Second, we correct the variance formula for MHq, which leads to narrower confidence intervals. A simulation study shows the superior performance of our variance estimator and confidence intervals. Since MHq does not match its original description in the literature, we propose a new indicator, the Mantel-Haenszel row risk ratio (MHRR), to meet that need. Interpretation and statistical inference for MHRR are discussed. For both MHRR and MHq, a value greater (less) than one means performance is better (worse) than in the reference set called the world.

Statistic · MoDELS · CASE · Correlation coefficient · Processing (programming language)

January 26, 2022

No evidence for an association between gender equality and pathogen prevalence -- a comment on Varnum and Grossmann 2017

Alexander Koplenig,Sascha Wolfer

In a previous study published in Nature Human Behaviour, Varnum and Grossmann claim that reductions in gender inequality are linked to reductions in pathogen prevalence in the United States between 1951 and 2013. Since the statistical methods used by Varnum and Grossmann are known to induce (seemingly) significant correlations between unrelated time series, so-called spurious or non-sense correlations, we test here whether the statistical association between gender inequality and pathogens prevalence in its current form also is the result of mis-specified models that do not correctly account for the temporal structure of the data. Our analysis clearly suggests that this is the case. We then discuss and apply several standard approaches of modelling time-series processes in the data and show that there is, at least as of now, no support for a statistical association between gender inequality and pathogen prevalence.
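The spurious-correlation phenomenon mentioned in the abstract above is easy to reproduce in simulation. The sketch below generates pairs of independent random walks and counts how often their levels, and then their first differences, show a large sample correlation. It is an illustration under assumed settings (walk length, number of pairs, threshold, and seed are arbitrary), not a reproduction of the authors' analysis, and differencing is only one of several standard remedies.

```python
import math
import random

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def random_walk(n, rng):
    """Cumulative sum of n independent standard normal steps."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

def first_differences(series):
    return [b - a for a, b in zip(series, series[1:])]

def share_of_large_correlations(n_pairs=200, length=100, threshold=0.5, seed=0):
    """Fraction of independent pairs with |r| > threshold, for levels and for differences."""
    rng = random.Random(seed)
    large_levels = large_diffs = 0
    for _ in range(n_pairs):
        x, y = random_walk(length, rng), random_walk(length, rng)
        if abs(pearson(x, y)) > threshold:
            large_levels += 1
        if abs(pearson(first_differences(x), first_differences(y))) > threshold:
            large_diffs += 1
    return large_levels / n_pairs, large_diffs / n_pairs

levels_share, diffs_share = share_of_large_correlations()
print("levels:", levels_share, "differences:", diffs_share)
```

Because the two series in each pair are generated independently, any large correlation in the levels is an artifact of the shared trending structure rather than evidence of a relationship.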

Estimation/Estimator · Inference · Identifiable · Latent · Dataset

January 26, 2022

Combining Experimental and Observational Data for Identification of Long-Term Causal Effects

AmirEmad Ghassami,Ilya Shpitser,Eric Tchetgen Tchetgen

We consider the task of estimating the causal effect of a treatment variable on a long-term outcome variable using data from an observational domain and an experimental domain. The observational data is assumed to be confounded and hence without further assumptions, this dataset alone cannot be used for causal inference. Also, only a short-term version of the primary outcome variable of interest is observed in the experimental data, and hence, this dataset alone cannot be used for causal inference either. In a recent work, Athey et al. (2020) proposed a method for systematically combining such data for identifying the downstream causal effect in view. Their approach is based on the assumptions of internal and external validity of the experimental data, and an extra novel assumption called latent unconfoundedness. In this paper, we first review their proposed approach and discuss the latent unconfoundedness assumption. Then we propose two alternative approaches for data fusion for the purpose of estimating average treatment effect as well as the effect of treatment on the treated. Our first proposed approach is based on assuming equi-confounding bias for the short-term and long-term outcomes. Our second proposed approach is based on the proximal causal inference framework, in which we assume the existence of an extra variable in the system which is a proxy of the latent confounder of the treatment-outcome relation.

CASES · Trace · COVID-19 · INFORMS · Networking

January 25, 2022

Network-Side Digital Contact Tracing on a Large University Campus

Matthew L. Malloy,Lance Hartung,Steve Wangen,Suman Banerjee

from arxiv, To appear, Mobile Computing and Networking (MobiCom 2022)

We describe a study conducted at a large public university campus in the United States which shows the efficacy of network log information for digital contact tracing and prediction of COVID-19 cases. Over the period of January 18, 2021 to May 7, 2021, more than 216 million client-access-point associations were logged across more than 11,000 wireless access points (APs). The association information was used to find potential contacts for approximately 30,000 individuals. Contacts are determined using an AP colocation algorithm, which supposes contact when two individuals connect to the same WiFi AP at approximately the same time. The approach was validated with a truth set of 350 positive COVID-19 cases inferred from the network log data by observing associations with APs in isolation residence halls reserved for individuals with a confirmed (clinical) positive COVID-19 test result. The network log data and AP-colocation have a predictive value of greater than 10%; more precisely, the contacts of an individual with a confirmed positive COVID-19 test have greater than a 10% chance of testing positive in the following 7 days (compared with a 0.79% chance when chosen at random, a relative risk ratio of 12.6). Moreover, a cumulative exposure score is computed to account for exposure to multiple individuals that test positive. Over the duration of the study, the cumulative exposure score predicts positive cases with a true positive rate of 16.5% and missed detection rate of 79% at a specified operating point.
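The colocation idea in the abstract above (two people count as contacts when they associate with the same AP at about the same time) can be sketched as follows. This is an assumption-laden illustration rather than the authors' algorithm: the abstract does not state the time window, so `window_seconds` is a hypothetical parameter, as are the function names and the toy log. The pairwise comparison is quadratic per AP and would need a sliding window at the scale the study reports.

```python
from collections import defaultdict
from itertools import combinations

def colocation_contacts(associations, window_seconds=900):
    """Return unordered pairs of people seen on the same AP within the window.

    associations: iterable of (person_id, ap_id, timestamp_seconds).
    window_seconds: hypothetical tolerance for "approximately the same time".
    """
    by_ap = defaultdict(list)
    for person, ap, timestamp in associations:
        by_ap[ap].append((timestamp, person))

    contacts = set()
    for events in by_ap.values():
        for (t1, p1), (t2, p2) in combinations(events, 2):
            if p1 != p2 and abs(t2 - t1) <= window_seconds:
                contacts.add(frozenset((p1, p2)))
    return contacts

def relative_risk(p_exposed, p_baseline):
    """Ratio of the positive-test probability among contacts to the baseline."""
    return p_exposed / p_baseline

# Hypothetical association log.
log = [
    ("a", "ap1", 0),
    ("b", "ap1", 600),
    ("c", "ap1", 5000),
    ("d", "ap2", 100),
]
print(colocation_contacts(log))
print(round(relative_risk(0.10, 0.0079), 2))
```

With the rounded figures from the abstract (10% against 0.79%), the ratio comes out near 12.7, in line with the reported relative risk ratio of 12.6.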

June 8, 2021

We Know What You Want: An Advertising Strategy Recommender System for Online Advertising

Liyi Guo,Junqi Jin,Haoqi Zhang,Zhenzhe Zheng,Zhiye Yang,Zhizhuang Xing,Fei Pan,Lvyin Niu,Fan Wu,Haiyang Xu,Chuan Yu,Yuning Jiang,Xiaoqiang Zhu

from arxiv, Accepted by KDD 2021

Advertising expenditures have become the major source of revenue for e-commerce platforms. Providing good advertising experiences for advertisers by reducing their costs of trial and error in discovering the optimal advertising strategies is crucial for the long-term prosperity of online advertising. To achieve this goal, the advertising platform needs to identify the advertisers' marketing objectives, and then recommend the corresponding strategies to fulfill these objectives. In this work, we first deploy a prototype strategy recommender system on the Taobao display advertising platform, recommending bid prices and targeted users to advertisers. We further augment this prototype system by directly revealing the advertising performance, and then infer the advertisers' marketing objectives through their adoption of different recommended advertising performance. We use techniques from contextual bandits to jointly learn the advertisers' marketing objectives and the recommended strategies. Online evaluations show that the designed advertising strategy recommender system can optimize the advertisers' advertising performance and increase the platform's revenue. Simulation experiments based on Taobao online bidding data show that the designed contextual bandit algorithm can effectively optimize the strategy adoption rate of advertisers.

Sentiment analysis · INFORMS · Processing (programming language) · MINE · Computational Linguistics

January 23, 2018

SentiPers: A Sentiment Analysis Corpus for Persian

Pedram Hosseini,Ali Ahmadian Ramaki,Hassan Maleki,Mansoureh Anvari,Seyed Abolghasem Mirroshandel

Sentiment Analysis (SA) is a major field of study in natural language processing, computational linguistics and information retrieval. Interest in SA has been constantly growing in both academia and industry over recent years. Moreover, there is an increasing need for generating appropriate resources and datasets, in particular for low-resource languages including Persian. These datasets play an important role in designing and developing appropriate opinion mining platforms using supervised, semi-supervised or unsupervised methods. In this paper, we outline the entire process of developing a manually annotated sentiment corpus, SentiPers, which covers formal and informal written contemporary Persian. To the best of our knowledge, SentiPers is a unique sentiment corpus for Persian with rich annotation at three different levels: document level, sentence level, and entity/aspect level. The corpus contains more than 26000 sentences of users' opinions from the digital product domain and benefits from special characteristics such as quantifying the positivity or negativity of an opinion by assigning a number within a specific range to any given sentence. Furthermore, we present statistics on various components of our corpus and study the inter-annotator agreement among the annotators. Finally, we discuss some of the challenges that we faced during the annotation process.