Item response theory (IRT) is the statistical paradigm underlying a dominant family of generative probabilistic models for test responses, used to quantify traits in individuals relative to target populations. The graded response model (GRM) is a particular IRT model that is used for ordered polytomous test responses. Both the development and the application of the GRM and other IRT models require statistical decisions. For formulating these models (calibration), one needs to decide on methodologies for item selection, inference, and regularization. For applying these models (test scoring), one needs to make similar decisions, often prioritizing computational tractability and/or interpretability. In many applications, such as in the Work Disability Functional Assessment Battery (WD-FAB), tractability implies approximating an individual's score distribution using estimates of mean and variance, and obtaining that score conditional on only point estimates of the calibrated model. In this manuscript, we evaluate the calibration and scoring of models under this common use-case using Bayesian cross-validation. Applied to the WD-FAB responses collected for the National Institutes of Health, we assess the predictive power of implementations of the GRM based on their ability to yield, on validation sets of respondents, ability estimates that are most predictive of patterns of item responses. Our main finding indicates that regularized Bayesian calibration of the GRM outperforms the regularization-free empirical Bayesian procedure of marginal maximum likelihood. We also motivate the use of compactly supported priors in test scoring.
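As a rough illustration of the model named above, the following sketch computes category probabilities for one item under a logistic graded response model. The parameter names (`discrimination`, `thresholds`) and the example values are assumptions made for illustration; the abstract does not specify the parameterization or any implementation used by the authors.

```python
import math


def grm_category_probs(theta, discrimination, thresholds):
    """Category probabilities for one item under a logistic GRM (illustrative).

    theta: latent trait value of the respondent.
    discrimination: item slope a > 0 (hypothetical name).
    thresholds: increasing cut points b_1 < ... < b_{K-1} (hypothetical name).
    Returns K probabilities, one per ordered response category.
    """
    # Cumulative probabilities P(Y >= k) for k = 1..K-1.
    cumulative = [
        1.0 / (1.0 + math.exp(-discrimination * (theta - b))) for b in thresholds
    ]
    # Pad with P(Y >= 0) = 1 and P(Y >= K) = 0, then take successive differences.
    bounds = [1.0] + cumulative + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]


# Example with made-up item parameters: four ordered categories.
probs = grm_category_probs(theta=0.5, discrimination=1.2, thresholds=[-1.0, 0.0, 1.5])
```

Because the thresholds are increasing, the cumulative probabilities are decreasing, so every category probability is positive and the probabilities sum to one.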

Related content

Maximum likelihood


Score · Rank · Optimizer · ONCE · Performer

February 4, 2022

How should we score athletes and candidates: geometric scoring rules

Aleksei Y. Kondratev, Egor Ianovski, Alexander S. Nesterov

from arxiv, 40 pages, 3 figures, 5 tables

Scoring rules are widely used to rank athletes in sports and candidates in elections. Each position in each individual ranking is worth a certain number of points; the total sum of points determines the aggregate ranking. The question is how to choose a scoring rule for a specific application. First, we derive a one-parameter family with geometric scores which satisfies two principles of independence: once an extremely strong or weak candidate is removed, the aggregate ranking ought to remain intact. This family includes Borda count, generalised plurality (medal count), and generalised antiplurality (threshold rule) as edge cases, and we find which additional axioms characterise these rules. Second, we introduce a one-parameter family with optimal scores: the athletes should be ranked according to their expected overall quality. Finally, using historical data from biathlon, golf, and running we demonstrate how the geometric and optimal scores can simplify the selection of suitable scoring rules, show that these scores closely resemble the actual scores used by the organisers, and provide an explanation for empirical phenomena observed in golf tournaments. We see that geometric scores approximate the optimal scores well in events where the distribution of athletes' performances is roughly uniform.
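The aggregation step described above (points per position, summed across rankings) can be sketched as follows. The helper `geometric_scores` is an assumption for illustration: it builds scores whose consecutive gaps form a geometric progression with ratio `p`, which gives the Borda count at `p = 1`. The abstract does not state the paper's exact parameterization, so this should not be read as the authors' definition.

```python
def geometric_scores(num_positions, p):
    """Scores whose consecutive gaps form a geometric progression (assumed form).

    Last place scores 0; gaps grow by a factor p towards the top.
    With p = 1 this is the Borda count: m-1, m-2, ..., 0.
    """
    scores = [0.0]
    gap = 1.0
    for _ in range(num_positions - 1):
        scores.append(scores[-1] + gap)
        gap *= p
    return scores[::-1]  # index 0 is first place


def aggregate(rankings, scores):
    """Sum positional scores over all rankings and sort candidates by total."""
    totals = {}
    for ranking in rankings:
        for position, candidate in enumerate(ranking):
            totals[candidate] = totals.get(candidate, 0.0) + scores[position]
    # Highest total first; ties broken alphabetically for determinism.
    return sorted(totals, key=lambda c: (-totals[c], c))


# Hypothetical rankings from three races or voters.
rankings = [["A", "B", "C"], ["B", "A", "C"], ["B", "C", "A"]]
borda_order = aggregate(rankings, geometric_scores(3, p=1.0))
```

Values of `p` above 1 put more weight on top positions, and values below 1 put more weight on avoiding the bottom positions.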

Linear · Laplace distribution · Linear regression · Performer · Linear model

February 4, 2022

A Comparison Between Quantile Regression and Linear Regression on Empirical Quantiles for Phenological Analysis in Migratory Response to Climate Change

Måns Karlsson, Ola Hössjer

It is well established that migratory birds in general have advanced their arrival times in spring, and in this paper we investigate potential ways of enhancing the level of detail in future phenological analyses. We perform single as well as multiple species analyses, using linear models on empirical quantiles, non-parametric quantile regression and likelihood-based parametric quantile regression with asymmetric Laplace distributed error terms. We conclude that non-parametric quantile regression appears most suited for single as well as multiple species analyses.
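For readers unfamiliar with quantile regression, a minimal sketch of its loss function may help. The check (pinball) loss below is the standard quantile regression objective, and minimizing it is equivalent to maximizing a likelihood with asymmetric Laplace errors at fixed scale. The arrival-day values are invented for illustration and are not data from the paper.

```python
def pinball_loss(residual, tau):
    """Check loss for quantile level tau in (0, 1)."""
    # Positive residuals are weighted by tau, negative ones by (1 - tau).
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(values, q, tau):
    """Total check loss of a constant prediction q over all observations."""
    return sum(pinball_loss(v - q, tau) for v in values)


# Hypothetical spring arrival days (day of year) for one species.
arrival_days = [95, 98, 101, 103, 104, 107, 110, 112, 118]

# For tau = 0.5 the minimizer over the observed values is the sample median.
tau = 0.5
best = min(arrival_days, key=lambda q: total_loss(arrival_days, q, tau))
```

Replacing the constant `q` with a linear function of year turns this into a quantile regression for the trend in that arrival quantile.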

Bayesian inference · MoDELS · Inference · Estimation/Estimator · COVID-19

February 4, 2022

Population Calibration using Likelihood-Free Bayesian Inference

Christopher Drovandi, Brodie Lawson, Adrianne L Jenner, Alexander P Browning

from arxiv, 20 pages, 6 figures

In this paper we develop a likelihood-free approach for population calibration, which involves finding distributions of model parameters that, when fed through the model, produce a set of outputs matching the available population data. Unlike most other approaches to population calibration, our method produces uncertainty quantification on the estimated distribution. Furthermore, the method can be applied to any population calibration problem, regardless of whether the model of interest is deterministic or stochastic, or whether the population data is observed with or without measurement error. We demonstrate the method on several examples, including one with real data. We also discuss the computational limitations of the approach. Immediate applications for the methodology developed here exist in many areas of medical research including cancer, COVID-19, drug development and cardiology.

Facebook AI Research · MoDELS · Model performance · Threshold · Performer

February 3, 2022

Net benefit, calibration, threshold selection, and training objectives for algorithmic fairness in healthcare

Stephen R. Pfohl, Yizhe Xu, Agata Foryciarz, Nikolaos Ignatiadis, Julian Genkins, Nigam H. Shah

A growing body of work uses the paradigm of algorithmic fairness to frame the development of techniques to anticipate and proactively mitigate the introduction or exacerbation of health inequities that may follow from the use of model-guided decision-making. We evaluate the interplay between measures of model performance, fairness, and the expected utility of decision-making to offer practical recommendations for the operationalization of algorithmic fairness principles for the development and evaluation of predictive models in healthcare. We conduct an empirical case-study via development of models to estimate the ten-year risk of atherosclerotic cardiovascular disease to inform statin initiation in accordance with clinical practice guidelines. We demonstrate that approaches that incorporate fairness considerations into the model training objective typically do not improve model performance or confer greater net benefit for any of the studied patient populations compared to the use of standard learning paradigms followed by threshold selection concordant with patient preferences, evidence of intervention effectiveness, and model calibration. These results hold when the measured outcomes are not subject to differential measurement error across patient populations and threshold selection is unconstrained, regardless of whether differences in model performance metrics, such as in true and false positive error rates, are present. In closing, we argue for focusing model development efforts on developing calibrated models that predict outcomes well for all patient populations while emphasizing that such efforts are complementary to transparent reporting, participatory design, and reasoning about the impact of model-informed interventions in context.
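To make the notion of net benefit at a decision threshold concrete, here is a minimal sketch using the usual decision-curve definition: true positives per patient minus false positives per patient, weighted by the odds of the threshold. Whether the paper uses exactly this form is an assumption, since the abstract gives no formula. The labels and risk scores are invented.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold.

    Uses the common decision-curve form (assumed, not taken from the paper):
    TP/n - FP/n * threshold / (1 - threshold).
    """
    n = len(y_true)
    treated = [(y, r) for y, r in zip(y_true, risk) if r >= threshold]
    true_pos = sum(1 for y, _ in treated if y == 1)
    false_pos = sum(1 for y, _ in treated if y == 0)
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)


# Hypothetical outcomes (1 = event) and predicted ten-year risks.
y_true = [1, 0, 1, 0, 0]
risk = [0.9, 0.2, 0.6, 0.7, 0.1]
nb = net_benefit(y_true, risk, threshold=0.5)
```

Evaluating this quantity separately for each patient population, at a threshold chosen to reflect patient preferences, is one way to compare models on the kind of utility the abstract refers to.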

Optimizer · Black box · CASE · Objective function · Functional

February 3, 2022

A unified surrogate-based scheme for black-box and preference-based optimization

Davide Previtali, Mirko Mazzoleni, Antonio Ferramosca, Fabio Previdi

from arxiv, 17 pages, 2 figures and 1 table. arXiv admin note: text overlap with arXiv:2202.01125

Black-box and preference-based optimization algorithms are global optimization procedures that aim to find the global solutions of an optimization problem using, respectively, as few function evaluations or sample comparisons as possible. In the black-box case, the analytical expression of the objective function is unknown and it can only be evaluated through a (costly) computer simulation or an experiment. In the preference-based case, the objective function is still unknown but it corresponds to the subjective criterion of an individual, so it is not possible to quantify this criterion in a reliable and consistent way. Therefore, preference-based optimization algorithms seek global solutions using only comparisons between pairs of different samples, for which a human decision-maker indicates which of the two is preferred. Quite often, the black-box and preference-based frameworks are covered separately and are handled using different techniques. In this paper, we show that black-box and preference-based optimization problems are closely related and can be solved using the same family of approaches, namely surrogate-based methods. Moreover, we propose the generalized Metric Response Surface (gMRS) algorithm, an optimization scheme that is a generalization of the popular MSRS framework. Finally, we provide a convergence proof for the proposed optimization method.

Estimation/Estimator · Markov chain · Better · Learned · Statistic

February 2, 2022

Gradient estimators for normalising flows

Piotr Bialas, Piotr Korcyl, Tomasz Stebel

from arxiv, 19 pages, 5 figures

Recently, a machine learning approach to Monte-Carlo simulations called Neural Markov Chain Monte-Carlo (NMCMC) has been gaining traction. In its most popular form it uses neural networks to construct normalizing flows, which are then trained to approximate the desired target distribution. As this distribution is usually defined via a Hamiltonian or action, the standard learning algorithm requires estimation of the action gradient with respect to the fields. In this contribution we present another gradient estimator (and the corresponding PyTorch implementation) that avoids this calculation, thus potentially speeding up training for models with more complicated actions. We also study the statistical properties of several gradient estimators and show that our formulation leads to better training results.

Correlation coefficient · Estimation/Estimator · Normalized · Robustness · Statistic

February 2, 2022

Robust approach for comparing two dependent normal populations through Wald-type tests based on Rényi's pseudodistance estimators

Elena Castilla, María Jaenada, Nirian Martín, Leandro Pardo

Since the two seminal papers by Fisher (1915, 1921) were published, testing a fixed-value null hypothesis for the correlation coefficient of the bivariate normal distribution has constituted an important statistical problem. In the framework of asymptotic robust statistics, it remains a topic of great interest. For this and other tests focused on paired correlated normal random samples, Rényi's pseudodistance estimators are proposed, their asymptotic distribution is established, and an iterative algorithm is provided for their computation. From them, Wald-type test statistics are constructed for different problems of interest and their influence function is studied theoretically. For testing null correlation in different contexts, an extensive simulation study and two real-data examples support the robust properties of our proposal.

COVID-19 · Controller · MoDELS · Reducible · Perplexity

February 2, 2022

Designing Social Distancing Policies for the COVID-19 Pandemic: A probabilistic model predictive control approach

Antonis Armaou, Bryce Katch, Lucia Russo, Constantinos Siettos

The effective control of the COVID-19 pandemic is one of the most challenging issues of our time. The design of optimal control policies is complicated by a variety of social, political, economic and epidemiological factors. Here, based on epidemiological data reported in recent studies for the Italian region of Lombardy, which experienced one of the largest and most devastating outbreaks in Europe during the first wave of the pandemic, we present a probabilistic model predictive control (PMPC) approach for the modelling and systematic study of what-if social distancing scenarios in a retrospective analysis of the first wave in Lombardy. The performance of the proposed PMPC scheme was assessed based on simulations of a compartmental model that was developed to quantify the uncertainty in the level of asymptomatic cases in the population, and the synergistic effect of social distancing in various activities and public awareness campaigns prompting people to adopt cautious behaviors to reduce the risk of disease transmission. The PMPC scheme takes into account the social mixing effect, i.e. the effect of the various activities on the potential transmission of the disease. The proposed approach demonstrates the utility of PMPC in addressing COVID-19 transmission and implementing public relaxation policies.

Performer · Model performance · MoDELS · Extensibility · Reducible

February 1, 2022

A comparison of approaches to improve worst-case predictive model performance over patient subpopulations

Stephen R. Pfohl, Haoran Zhang, Yizhe Xu, Agata Foryciarz, Marzyeh Ghassemi, Nigam H. Shah

Predictive models for clinical outcomes that are accurate on average in a patient population may underperform drastically for some subpopulations, potentially introducing or reinforcing inequities in care access and quality. Model training approaches that aim to maximize worst-case model performance across subpopulations, such as distributionally robust optimization (DRO), attempt to address this problem without introducing additional harms. We conduct a large-scale empirical study of DRO and several variations of standard learning procedures to identify approaches for model development and selection that consistently improve disaggregated and worst-case performance over subpopulations compared to standard approaches for learning predictive models from electronic health records data. In the course of our evaluation, we introduce an extension to DRO approaches that allows for specification of the metric used to assess worst-case performance. We conduct the analysis for models that predict in-hospital mortality, prolonged length of stay, and 30-day readmission for inpatient admissions, and predict in-hospital mortality using intensive care data. We find that, with relatively few exceptions, no approach performs better, for each patient subpopulation examined, than standard learning procedures using the entire training dataset. These results imply that when it is of interest to improve model performance for patient subpopulations beyond what can be achieved with standard practices, it may be necessary to do so via data collection techniques that increase the effective sample size or reduce the level of noise in the prediction problem.
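The gap between average and worst-case performance discussed above can be illustrated with a small disaggregated evaluation. The sketch below uses accuracy as the metric and made-up labels and group names. It is an illustration of the evaluation idea only, not the DRO training procedure or the metrics studied in the paper.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Accuracy computed separately for each subpopulation label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# Hypothetical labels, predictions, and subpopulation membership.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1]
groups = ["a", "a", "a", "b", "b", "b"]

per_group = accuracy_by_group(y_true, y_pred, groups)
overall = sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)
worst_group = min(per_group, key=per_group.get)
```

Here the overall accuracy sits between the two group accuracies, so reporting only the average would hide the weaker performance on the worst group.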

Likelihood · Estimation/Estimator · Maximum likelihood estimation · Maximum likelihood · MoDELS

September 24, 2018

Implicit Maximum Likelihood Estimation

Ke Li, Jitendra Malik

from arxiv, 21 pages, 4 figures. In the interest of promoting discussion, we make the reviews available at //people.eecs.berkeley.edu/~ke.li/papers/imle_reviews.pdf

Implicit probabilistic models are models defined naturally in terms of a sampling procedure and often induce a likelihood function that cannot be expressed explicitly. We develop a simple method for estimating parameters in implicit models that does not require knowledge of the form of the likelihood function or any derived quantities, but can be shown to be equivalent to maximizing likelihood under some conditions. Our result holds in the non-asymptotic parametric setting, where both the capacity of the model and the number of data examples are finite. We also demonstrate encouraging experimental results.