A multivariate mixed-effects model appears to be the most appropriate choice for gene expression data collected in a crossover trial. It is, however, difficult to obtain reliable results with standard statistical inference when some responses are missing. Missingness is a particularly serious concern in crossover studies because such trials involve only a small number of participants. A Monte Carlo EM (MCEM)-based technique was adopted to deal with this situation. In addition to estimation, MCEM likelihood ratio tests (LRTs) are developed to test fixed effects in crossover models with missing data. Intensive simulation studies were conducted prior to analyzing the gene expression data.

Related content

Missing data

In statistical surveys, records often contain missing data because respondents overlook or refuse to answer questions, or because of oversights by the interviewer or in the questionnaire itself. However, almost all standard statistical methods assume that every case has information on all the variables needed for analysis, so missing data is a problem that anyone conducting statistical research or questionnaire surveys must address.

May 26, 2023

Calibration of Transformer-based Models for Identifying Stress and Depression in Social Media

Loukas Ilias, Spiros Mouzakitis, Dimitris Askounis

In today's fast-paced world, rates of stress and depression are surging. Social media can assist in the early detection of mental health conditions. Existing methods mainly rely on feature extraction and train shallow machine learning classifiers; other studies use deep neural networks or transformers. Although transformer-based models achieve noticeable improvements, they often fail to capture rich factual knowledge. A number of studies have sought to enhance pretrained transformer-based models with extra information or additional modalities, but no prior work has exploited these modifications for detecting stress and depression through social media. In addition, although the reliability of a machine learning model's confidence in its predictions is critical for high-risk applications, no prior work has taken model calibration into consideration. To resolve these issues, we present the first study on depression and stress detection in social media that injects extra linguistic information into transformer-based models, namely BERT and MentalBERT. Specifically, the proposed approach employs a Multimodal Adaptation Gate to create combined embeddings, which are given as input to a BERT (or MentalBERT) model. To account for model calibration, we apply label smoothing. We test the proposed approaches on three publicly available datasets and demonstrate that integrating linguistic features into transformer-based models markedly improves performance. The use of label smoothing also improves both the model's performance and its calibration. Finally, we perform a linguistic analysis of the posts and show differences in language between stressful and non-stressful texts, as well as between depressive and non-depressive posts.
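The abstract names label smoothing as its calibration technique without giving details. As a rough illustration only, the sketch below shows the standard form of label smoothing, in which a one-hot target is mixed with a uniform distribution over classes. The function names and the value `epsilon = 0.1` are assumptions made for this example and are not taken from the paper.

```python
import math
from typing import List


def smooth_labels(true_class: int, num_classes: int, epsilon: float = 0.1) -> List[float]:
    """Mix a one-hot target with the uniform distribution over classes.

    epsilon is a hypothetical smoothing strength chosen for illustration.
    """
    uniform_mass = epsilon / num_classes
    return [
        (1.0 - epsilon) + uniform_mass if k == true_class else uniform_mass
        for k in range(num_classes)
    ]


def cross_entropy(target: List[float], probs: List[float]) -> float:
    """Cross-entropy between a (possibly smoothed) target and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0.0)


# Binary example (e.g. stressful vs. non-stressful post), true class = 1.
target = smooth_labels(true_class=1, num_classes=2, epsilon=0.1)

# With a smoothed target, an extremely confident prediction is penalised
# more than a prediction that matches the smoothed target.
loss_overconfident = cross_entropy(target, [0.001, 0.999])
loss_moderate = cross_entropy(target, [0.05, 0.95])
```

Because the smoothed target never puts all of its mass on one class, the loss is minimised by a prediction that is confident but not saturated, which is the usual intuition for why label smoothing can help calibration.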

May 26, 2023

Mitigating Exploitation Bias in Learning to Rank with an Uncertainty-aware Empirical Bayes Approach

Tao Yang, Cuize Han, Chen Luo, Parth Gupta, Jeff M. Phillips, Qingyao Ai

Ranking is at the core of many artificial intelligence (AI) applications, including search engines and recommender systems. Modern ranking systems are often constructed with learning-to-rank (LTR) models built from user behavior signals. While previous studies have demonstrated the effectiveness of using user behavior signals (e.g., clicks) as both features and labels of LTR algorithms, we argue that existing LTR algorithms that indiscriminately treat behavior and non-behavior signals in input features could lead to suboptimal performance in practice. In particular, because user behavior signals often have strong correlations with the ranking objective and can only be collected on items that have already been shown to users, directly using behavior signals in LTR could create an exploitation bias that hurts system performance in the long run. To address the exploitation bias, we propose EBRank, an empirical Bayes-based uncertainty-aware ranking algorithm. Specifically, to overcome the exploitation bias introduced by behavior features in ranking models, EBRank uses a prior model based solely on non-behavior features to obtain a prior estimate of relevance. In the dynamic training and serving of ranking systems, EBRank uses the observed user behaviors to update the posterior relevance estimate instead of concatenating behaviors as features in ranking models. In addition, EBRank applies an uncertainty-aware exploration strategy to explore actively, collect user behaviors for empirical Bayesian modeling, and improve ranking performance. Experiments on three public datasets show that EBRank is effective, practical, and significantly outperforms state-of-the-art ranking algorithms.
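The abstract describes a prior relevance estimate that is updated into a posterior as user behavior is observed, but it does not specify the model. The sketch below is therefore not EBRank itself. It is a generic Beta-Bernoulli conjugate update, shown only to illustrate the idea of combining a feature-based prior with observed clicks. The parameter `prior_strength` (how many pseudo-impressions the prior is worth) is a hypothetical knob introduced for this example.

```python
from typing import Tuple


def posterior_relevance(
    prior_mean: float,
    prior_strength: float,
    clicks: int,
    impressions: int,
) -> Tuple[float, float]:
    """Return (posterior mean, posterior variance) of an item's click probability.

    prior_mean:      relevance predicted from non-behavior features (assumed given).
    prior_strength:  hypothetical pseudo-count weight placed on the prior.
    clicks, impressions: observed user behavior for the item.
    """
    alpha = prior_mean * prior_strength + clicks
    beta = (1.0 - prior_mean) * prior_strength + (impressions - clicks)
    total = alpha + beta
    mean = alpha / total
    variance = (alpha * beta) / (total * total * (total + 1.0))
    return mean, variance


# A new item: no behavior yet, so the estimate equals the prior.
cold_mean, cold_var = posterior_relevance(0.2, 10.0, clicks=0, impressions=0)

# The same item after 100 impressions and 50 clicks: the estimate moves
# toward the observed click rate and the uncertainty shrinks.
warm_mean, warm_var = posterior_relevance(0.2, 10.0, clicks=50, impressions=100)
```

The posterior variance is the quantity an uncertainty-aware exploration strategy could use, for example by giving extra exposure to items whose estimates are still uncertain.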

May 25, 2023

Metaheuristic planner for cooperative multi-agent wall construction with UAVs

Basel Elkhapery, Robert Pěnička, Michal Němec, Mohsin Siddiqui

This paper introduces a wall construction planner for Unmanned Aerial Vehicles (UAVs), which uses a Greedy Randomized Adaptive Search Procedure (GRASP) metaheuristic to generate near-time-optimal building plans for even large walls within seconds. This approach addresses one of the most time-consuming and labor-intensive tasks, while also minimizing workers' safety risks. To achieve this, the wall-building problem is modeled as a variant of the Team Orienteering Problem and is formulated as Mixed-Integer Linear Programming (MILP), with added precedence and concurrence constraints that ensure bricks are built in the correct order and without collision between cooperating agents. The GRASP planner is validated in a realistic simulation and demonstrated to find solutions with similar quality as the optimal MILP, but much faster. Moreover, it outperforms all other state-of-the-art planning approaches in the majority of test cases. This paper presents a significant advancement in the field of automated wall construction, demonstrating the potential of UAVs and optimization algorithms in improving the efficiency and safety of construction projects.

May 25, 2023

Monitoring Algorithmic Fairness

Thomas A. Henzinger, Mahyar Karimi, Konstantin Kueffner, Kaushik Mallik

from arXiv, CAV 2023

Machine-learned systems are in widespread use for making decisions about humans, and it is important that they are fair, i.e., not biased against individuals based on sensitive attributes. We present runtime verification of algorithmic fairness for systems whose models are unknown, but are assumed to have a Markov chain structure. We introduce a specification language that can model many common algorithmic fairness properties, such as demographic parity, equal opportunity, and social burden. We build monitors that observe a long sequence of events as generated by a given system, and output, after each observation, a quantitative estimate of how fair or biased the system was on that run until that point in time. The estimate is proven to be correct modulo a variable error bound and a given confidence level, where the error bound gets tighter as the observed sequence gets longer. Our monitors are of two types, and use, respectively, frequentist and Bayesian statistical inference techniques. While the frequentist monitors compute estimates that are objectively correct with respect to the ground truth, the Bayesian monitors compute estimates that are correct subject to a given prior belief about the system's model. Using a prototype implementation, we show how we can monitor if a bank is fair in giving loans to applicants from different social backgrounds, and if a college is fair in admitting students while maintaining a reasonable financial burden on the society. Although they exhibit different theoretical complexities in certain cases, in our experiments, both frequentist and Bayesian monitors took less than a millisecond to update their verdicts after each observation.
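To make the idea of a monitor that updates a fairness estimate and an error bound after each observation more concrete, here is a deliberately simplified sketch. It is not the authors' construction: it assumes decisions within each group are independent and identically distributed rather than generated by a Markov chain, it tracks only the demographic parity gap, and its Hoeffding-based error radius holds for a fixed number of observations rather than uniformly over time. The class and method names are hypothetical.

```python
import math
from typing import Dict, Optional, Tuple


class DemographicParityMonitor:
    """Running estimate of P(accept | group a) - P(accept | group b)."""

    def __init__(self, groups: Tuple[str, str] = ("a", "b"), delta: float = 0.05):
        self.group_a, self.group_b = groups
        self.delta = delta  # allowed failure probability (1 - confidence level)
        self.seen: Dict[str, int] = {g: 0 for g in groups}
        self.accepted: Dict[str, int] = {g: 0 for g in groups}

    def observe(self, group: str, accepted: bool) -> Optional[Tuple[float, float]]:
        """Record one decision and return the updated (gap, error radius)."""
        self.seen[group] += 1
        self.accepted[group] += int(accepted)
        return self.estimate()

    def estimate(self) -> Optional[Tuple[float, float]]:
        if min(self.seen.values()) == 0:
            return None  # no estimate until both groups have been observed
        rate = {g: self.accepted[g] / self.seen[g] for g in self.seen}
        gap = rate[self.group_a] - rate[self.group_b]
        # Hoeffding radius per group at level delta / 2, combined by a union bound.
        radius = sum(
            math.sqrt(math.log(4.0 / self.delta) / (2.0 * n))
            for n in self.seen.values()
        )
        return gap, radius


monitor = DemographicParityMonitor(delta=0.05)
for i in range(500):
    monitor.observe("a", accepted=(i % 2 == 0))  # group a accepted half the time
    monitor.observe("b", accepted=(i % 4 == 0))  # group b accepted a quarter of the time
final_gap, final_radius = monitor.estimate()
```

Each update only increments two counters and recomputes a closed-form bound, which is consistent with the abstract's point that verdicts can be refreshed very cheaply after every observation.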

May 25, 2023

Learning Robust Statistics for Simulation-based Inference under Model Misspecification

Daolang Huang, Ayush Bharti, Amauri Souza, Luigi Acerbi, Samuel Kaski

Simulation-based inference (SBI) methods such as approximate Bayesian computation (ABC), synthetic likelihood, and neural posterior estimation (NPE) rely on simulating statistics to infer parameters of intractable likelihood models. However, such methods are known to yield untrustworthy and misleading inference outcomes under model misspecification, thus hindering their widespread applicability. In this work, we propose the first general approach to handle model misspecification that works across different classes of SBI methods. Leveraging the fact that the choice of statistics determines the degree of misspecification in SBI, we introduce a regularized loss function that penalises those statistics that increase the mismatch between the data and the model. Taking NPE and ABC as use cases, we demonstrate the superior performance of our method on high-dimensional time-series models that are artificially misspecified. We also apply our method to real data from the field of radio propagation where the model is known to be misspecified. We show empirically that the method yields robust inference in misspecified scenarios, whilst still being accurate when the model is well-specified.

May 24, 2023

Differentially-Private Decision Trees with Probabilistic Robustness to Data Poisoning

Daniël Vos, Jelle Vos, Tianyu Li, Zekeriya Erkin, Sicco Verwer

Decision trees are interpretable models that are well-suited to non-linear learning problems. Much work has been done on extending decision tree learning algorithms with differential privacy, a framework that guarantees the privacy of samples within the training data. However, current state-of-the-art algorithms for this purpose sacrifice much utility for a small privacy benefit. These solutions create random decision nodes that reduce decision tree accuracy or spend an excessive share of the privacy budget on labeling leaves. Moreover, when the data is continuous, many works either do not support it or leak information about feature values. We propose a new method called PrivaTree, based on private histograms, that chooses good splits while consuming a small privacy budget. The resulting trees provide a significantly better privacy-utility trade-off and accept mixed numerical and categorical data without leaking additional information. Finally, while it is notoriously hard to give robustness guarantees against data poisoning attacks, we prove bounds for the expected success rates of backdoor attacks against differentially-private learners. Our experimental results show that PrivaTree consistently outperforms previous works on predictive accuracy and significantly improves robustness against backdoor attacks compared to regular decision trees.

May 24, 2023

Black-Box Variational Inference Converges

Kyurae Kim, Kaiwen Wu, Jisu Oh, Yian Ma, Jacob R. Gardner

from arXiv, under review

We provide the first convergence guarantee for full black-box variational inference (BBVI), also known as Monte Carlo variational inference. While preliminary investigations worked on simplified versions of BBVI (e.g., bounded domain, bounded support, only optimizing for the scale, and such), our setup does not need any such algorithmic modifications. Our results hold for log-smooth posterior densities with and without strong log-concavity and the location-scale variational family. Also, our analysis reveals that certain algorithm design choices commonly employed in practice, particularly, nonlinear parameterizations of the scale of the variational approximation, can result in suboptimal convergence rates. Fortunately, running BBVI with proximal stochastic gradient descent fixes these limitations, and thus achieves the strongest known convergence rate guarantees. We evaluate this theoretical insight by comparing proximal SGD against other standard implementations of BBVI on large-scale Bayesian inference problems.

May 24, 2023

On estimators of the mean of infinite dimensional data in finite populations

Anurag Dey, Probal Chaudhuri

The Horvitz-Thompson (HT), the Rao-Hartley-Cochran (RHC) and the generalized regression (GREG) estimators of the finite population mean are considered when the observations are from an infinite dimensional space. We compare these estimators based on their asymptotic distributions under some commonly used sampling designs and some superpopulations satisfying linear regression models. We show that the GREG estimator is asymptotically at least as efficient as either of the other two estimators under the different sampling designs considered in this paper. Further, we show that the use of some well-known sampling designs utilizing auxiliary information may have an adverse effect on the performance of the GREG estimator when the degree of heteroscedasticity present in linear regression models is not very large. On the other hand, the use of those sampling designs improves the performance of this estimator when the degree of heteroscedasticity present in linear regression models is large. We develop methods for determining the degree of heteroscedasticity, which in turn determines the choice of appropriate sampling design to be used with the GREG estimator. We also investigate the consistency of the covariance operators of the above estimators. We carry out some numerical studies using real and synthetic data, and our theoretical results are supported by the results obtained from those numerical studies.
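For readers unfamiliar with the first of these estimators, the following sketch shows the standard Horvitz-Thompson estimator of a finite population mean, which weights each sampled observation by the inverse of its inclusion probability. As a stand-in for infinite dimensional data, each observation is assumed here to be a curve evaluated on a common finite grid. That discretisation and the function name are assumptions made for illustration, not part of the paper.

```python
from typing import List, Sequence


def horvitz_thompson_mean(
    curves: Sequence[Sequence[float]],
    inclusion_probs: Sequence[float],
    population_size: int,
) -> List[float]:
    """Pointwise Horvitz-Thompson estimate of the population mean curve.

    curves:           sampled observations, each evaluated on the same grid.
    inclusion_probs:  probability that each sampled unit enters the sample.
    population_size:  number of units N in the finite population.
    """
    grid_length = len(curves[0])
    return [
        sum(curve[t] / pi for curve, pi in zip(curves, inclusion_probs)) / population_size
        for t in range(grid_length)
    ]


# Simple random sampling without replacement of n = 2 units from N = 4:
# every unit has inclusion probability n / N = 0.5, so the estimator
# reduces to the ordinary sample mean at each grid point.
sampled_curves = [[1.0, 2.0], [3.0, 4.0]]
estimate = horvitz_thompson_mean(sampled_curves, [0.5, 0.5], population_size=4)
```

Under unequal-probability designs the inclusion probabilities differ across units, which is where the choice of sampling design discussed in the abstract starts to affect the estimator's efficiency.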

May 24, 2023

Optimizing Long-term Value for Auction-Based Recommender Systems via On-Policy Reinforcement Learning

Ruiyang Xu, Jalaj Bhandari, Dmytro Korenkevych, Fan Liu, Yuchen He, Alex Nikulkov, Zheqing Zhu

Auction-based recommender systems are prevalent in online advertising platforms, but they are typically optimized to allocate recommendation slots based on immediate expected return metrics, neglecting the downstream effects of recommendations on user behavior. In this study, we employ reinforcement learning to optimize for long-term return metrics in an auction-based recommender system. Utilizing temporal difference learning, a fundamental reinforcement learning algorithm, we implement a one-step policy improvement approach that biases the system towards recommendations with higher long-term user engagement metrics. This optimizes value over long horizons while maintaining compatibility with the auction framework. Our approach is grounded in dynamic programming ideas, which show that our method provably improves upon the existing auction-based base policy. Through an online A/B test conducted on an auction-based recommender system that handles billions of impressions and users daily, we empirically establish that our proposed method outperforms the current production system in terms of long-term user engagement metrics.

November 7, 2019

A Comprehensive Survey on Transfer Learning

Fuzhen Zhuang, Zhiyuan Qi, Keyu Duan, Dongbo Xi, Yongchun Zhu, Hengshu Zhu, Hui Xiong, Qing He

from arXiv, 27 pages, 6 figures

Transfer learning aims to improve the performance of target learners on target domains by transferring the knowledge contained in different but related source domains. In this way, the dependence on large amounts of target domain data for constructing target learners can be reduced. Owing to its wide application prospects, transfer learning has become a popular and promising area in machine learning. Although there are already some valuable and impressive surveys on transfer learning, these surveys introduce approaches in a relatively isolated way and lack the recent advances in transfer learning. With the rapid expansion of the transfer learning area, it is both necessary and challenging to comprehensively review the relevant studies. This survey attempts to connect and systematize the existing transfer learning research, as well as to summarize and interpret the mechanisms and strategies in a comprehensive way, which may help readers better understand the current research status and ideas. Unlike previous surveys, this survey reviews over forty representative transfer learning approaches from the perspectives of data and model. The applications of transfer learning are also briefly introduced. To show the performance of different transfer learning models, twenty representative transfer learning models are used for experiments. The models are evaluated on three different datasets, i.e., Amazon Reviews, Reuters-21578, and Office-31. The experimental results demonstrate the importance of selecting appropriate transfer learning models for different applications in practice.