青柠在线观看免费高清1,动漫AV观看网站不卡无码

Pretrial risk assessment tools are used in jurisdictions across the country to assess the likelihood of "pretrial failure," the event where defendants either fail to appear for court or reoffend. Judicial officers, in turn, use these assessments to determine whether to release or detain defendants during trial. While algorithmic risk assessment tools were designed to predict pretrial failure with greater accuracy relative to judges, there is still concern that both risk assessment recommendations and pretrial decisions are biased against minority groups. In this paper, we develop methods to investigate the association between risk factors and pretrial failure, while simultaneously estimating misclassification rates of pretrial risk assessments and of judicial decisions as a function of defendant race. This approach adds to a growing literature that makes use of outcome misclassification methods to answer questions about fairness in pretrial decision-making. We give a detailed simulation study for our proposed methodology and apply these methods to data from the Virginia Department of Criminal Justice Services. We estimate that the VPRAI algorithm has near-perfect specificity, but its sensitivity differs by defendant race. Judicial decisions also display evidence of bias; we estimate wrongful detention rates of 39.7% and 51.4% among white and Black defendants, respectively.

相關內容

估計/估計量

關注 3

xgboost · Machine Learning · 模型評估 · LightGBM · 閾值 ·

2024 年 7 月 5 日

Predicting the duration of traffic incidents for Sydney greater metropolitan area using machine learning methods

Artur Grigorev,Sajjad Shafiei,Hanna Grzybowska,Adriana-Simona Mihaita

This research presents a comprehensive approach to predicting the duration of traffic incidents and classifying them as short-term or long-term across the Sydney Metropolitan Area. Leveraging a dataset that encompasses detailed records of traffic incidents, road network characteristics, and socio-economic indicators, we train and evaluate a variety of advanced machine learning models including Gradient Boosted Decision Trees (GBDT), Random Forest, LightGBM, and XGBoost. The models are assessed using Root Mean Square Error (RMSE) for regression tasks and F1 score for classification tasks. Our experimental results demonstrate that XGBoost and LightGBM outperform conventional models with XGBoost achieving the lowest RMSE of 33.7 for predicting incident duration and highest classification F1 score of 0.62 for a 30-minute duration threshold. For classification, the 30-minute threshold balances performance with 70.84% short-term duration classification accuracy and 62.72% long-term duration classification accuracy. Feature importance analysis, employing both tree split counts and SHAP values, identifies the number of affected lanes, traffic volume, and types of primary and secondary vehicles as the most influential features. The proposed methodology not only achieves high predictive accuracy but also provides stakeholders with vital insights into factors contributing to incident durations. These insights enable more informed decision-making for traffic management and response strategies. The code is available by the link: //github.com/Future-Mobility-Lab/SydneyIncidents

MoDELS · 有向 · 回合 · Processing（編程語言） · 條件獨立的 ·

2024 年 7 月 3 日

Spatial modeling of extremes and an angular component

Gaspard Tamagny,Mathieu Ribatet

from arxiv, 14 pages, 8 figures

Many environmental processes such as rainfall, wind or snowfall are inherently spatial and the modelling of extremes has to take into account that feature. In addition, environmental processes are often attached with an angle, e.g., wind speed and direction or extreme snowfall and time of occurrence in year. This article proposes a Bayesian hierarchical model with a conditional independence assumption that aims at modelling simultaneously spatial extremes and an angular component. The proposed model relies on the extreme value theory as well as recent developments for handling directional statistics over a continuous domain. Working within a Bayesian setting, a Gibbs sampler is introduced whose performances are analysed through a simulation study. The paper ends with an application on extreme wind speed in France. Results show that extreme wind events in France are mainly coming from West apart from the Mediterranean part of France and the Alps.

變換 · Graph Transformer · 圖 · Analysis · MoDELS ·

2024 年 7 月 3 日

A higher-order transformation approach to the formalization and analysis of BPMN using graph transformation systems

Tim Kr?uter,Adrian Rutle,Harald K?nig,Yngve Lamo

The Business Process Modeling Notation (BPMN) is a widely used standard notation for defining intra- and inter-organizational workflows. However, the informal description of the BPMN execution semantics leads to different interpretations of BPMN elements and difficulties in checking behavioral properties. In this article, we propose a formalization of the execution semantics of BPMN that, compared to existing approaches, covers more BPMN elements while also facilitating property checking. Our approach is based on a higher-order transformation from BPMN models to graph transformation systems. To show the capabilities of our approach, we implemented it as an open-source web-based tool.

閾值 · Automator · 推斷 · 估計/估計量 · MoDELS ·

2024 年 7 月 3 日

Automated threshold selection and associated inference uncertainty for univariate extremes

Conor Murphy,Jonathan A. Tawn,Zak Varty

Threshold selection is a fundamental problem in any threshold-based extreme value analysis. While models are asymptotically motivated, selecting an appropriate threshold for finite samples is difficult and highly subjective through standard methods. Inference for high quantiles can also be highly sensitive to the choice of threshold. Too low a threshold choice leads to bias in the fit of the extreme value model, while too high a choice leads to unnecessary additional uncertainty in the estimation of model parameters. We develop a novel methodology for automated threshold selection that directly tackles this bias-variance trade-off. We also develop a method to account for the uncertainty in the threshold estimation and propagate this uncertainty through to high quantile inference. Through a simulation study, we demonstrate the effectiveness of our method for threshold selection and subsequent extreme quantile estimation, relative to the leading existing methods, and show how the method's effectiveness is not sensitive to the tuning parameters. We apply our method to the well-known, troublesome example of the River Nidd dataset.

MoDELS · 圖 · 估計/估計量 · 可理解性 · 歐氏空間 ·

2024 年 7 月 2 日

A flexible and interpretable spatial covariance model for data on graphs

Michael F. Christensen,Peter D. Hoff

from arxiv, 36 pages, 7 figures

Spatial models for areal data are often constructed such that all pairs of adjacent regions are assumed to have near-identical spatial autocorrelation. In practice, data can exhibit dependence structures more complicated than can be represented under this assumption. In this article we develop a new model for spatially correlated data observed on graphs, which can flexibly represented many types of spatial dependence patterns while retaining aspects of the original graph geometry. Our method implies an embedding of the graph into Euclidean space wherein covariance can be modeled using traditional covariance functions, such as those from the Mat\'{e}rn family. We parameterize our model using a class of graph metrics compatible with such covariance functions, and which characterize distance in terms of network flow, a property useful for understanding proximity in many ecological settings. By estimating the parameters underlying these metrics, we recover the "intrinsic distances" between graph nodes, which assist in the interpretation of the estimated covariance and allow us to better understand the relationship between the observed process and spatial domain. We compare our model to existing methods for spatially dependent graph data, primarily conditional autoregressive models and their variants, and illustrate advantages of our method over traditional approaches. We fit our model to bird abundance data for several species in North Carolina, and show how it provides insight into the interactions between species-specific spatial distributions and geography.

廣義線性模型 · 線性的 · 線性模型 · 代價 · MoDELS ·

2024 年 7 月 2 日

Scalable expectation propagation for generalized linear models

Niccolò Anceschi,Augusto Fasano,Beatrice Franzolini,Giovanni Rebaudo

Generalized linear models (GLMs) arguably represent the standard approach for statistical regression beyond the Gaussian likelihood scenario. When Bayesian formulations are employed, the general absence of a tractable posterior distribution has motivated the development of deterministic approximations, which are generally more scalable than sampling techniques. Among them, expectation propagation (EP) showed extreme accuracy, usually higher than many variational Bayes solutions. However, the higher computational cost of EP posed concerns about its practical feasibility, especially in high-dimensional settings. We address these concerns by deriving a novel efficient formulation of EP for GLMs, whose cost scales linearly in the number of covariates p. This reduces the state-of-the-art O(p^2 n) per-iteration computational cost of the EP routine for GLMs to O(p n min{p,n}), with n being the sample size. We also show that, for binary models and log-linear GLMs approximate predictive means can be obtained at no additional cost. To preserve efficient moment matching for count data, we propose employing a combination of log-normal Laplace transform approximations, avoiding numerical integration. These novel results open the possibility of employing EP in settings that were believed to be practically impossible. Improvements over state-of-the-art approaches are illustrated both for simulated and real data. The efficient EP implementation is available at //github.com/niccoloanceschi/EPglm.

有限差分 · 模型評估 · Analysis · 數值分析 ·

2024 年 7 月 2 日

Consistency and stability of boundary conditions for a two-velocities lattice Boltzmann scheme

Thomas Bellotti

We explore theoretical aspects of boundary conditions for lattice Boltzmann methods, focusing on a toy two-velocities scheme. By mapping lattice Boltzmann schemes to Finite Difference schemes, we facilitate rigorous consistency and stability analyses. We develop kinetic boundary conditions for inflows and outflows, highlighting the trade-off between accuracy and stability, which we successfully overcome. Stability is assessed using GKS (Gustafsson, Kreiss, and Sundstr{\"o}m) analysis and -- when this approach fails on coarse meshes -- spectral and pseudo-spectral analyses of the scheme's matrix that explain effects germane to low resolutions.

contrastive · Processing（編程語言） · 估計/估計量 · 極小點 · Performer ·

2024 年 7 月 2 日

On minimum contrast method for multivariate spatial point processes

Lin Zhu,Junho Yang,Mikyoung Jun,Scott Cook

Compared to widely used likelihood-based approaches, the minimum contrast (MC) method offers a computationally efficient method for estimation and inference of spatial point processes. These relative gains in computing time become more pronounced when analyzing complicated multivariate point process models. Despite this, there has been little exploration of the MC method for multivariate spatial point processes. Therefore, this article introduces a new MC method for parametric multivariate spatial point processes. A contrast function is computed based on the trace of the power of the difference between the conjectured $K$-function matrix and its nonparametric unbiased edge-corrected estimator. Under standard assumptions, we derive the asymptotic normality of our MC estimator. The performance of the proposed method is demonstrated through simulation studies of bivariate log-Gaussian Cox processes and five-variate product-shot-noise Cox processes.

語言模型化 · 大語言模型 · MoDELS · Agent · 相互獨立的 ·

2024 年 6 月 29 日

Answering real-world clinical questions using large language model based systems

Yen Sia Low,Michael L. Jackson,Rebecca J. Hyde,Robert E. Brown,Neil M. Sanghavi,Julian D. Baldwin,C. William Pike,Jananee Muralidharan,Gavin Hui,Natasha Alexander,Hadeel Hassan,Rahul V. Nene,Morgan Pike,Courtney J. Pokrzywa,Shivam Vedak,Adam Paul Yan,Dong-han Yao,Amy R. Zipursky,Christina Dinh,Philip Ballentine,Dan C. Derieg,Vladimir Polony,Rehan N. Chawdry,Jordan Davies,Brigham B. Hyde,Nigam H. Shah,Saurabh Gombar

from arxiv, 28 pages (2 figures, 3 tables) inclusive of 8 pages of supplemental materials (4 supplemental figures and 4 supplemental tables)

Evidence to guide healthcare decisions is often limited by a lack of relevant and trustworthy literature as well as difficulty in contextualizing existing research for a specific patient. Large language models (LLMs) could potentially address both challenges by either summarizing published literature or generating new studies based on real-world data (RWD). We evaluated the ability of five LLM-based systems in answering 50 clinical questions and had nine independent physicians review the responses for relevance, reliability, and actionability. As it stands, general-purpose LLMs (ChatGPT-4, Claude 3 Opus, Gemini Pro 1.5) rarely produced answers that were deemed relevant and evidence-based (2% - 10%). In contrast, retrieval augmented generation (RAG)-based and agentic LLM systems produced relevant and evidence-based answers for 24% (OpenEvidence) to 58% (ChatRWD) of questions. Only the agentic ChatRWD was able to answer novel questions compared to other LLMs (65% vs. 0-9%). These results suggest that while general-purpose LLMs should not be used as-is, a purpose-built system for evidence summarization based on RAG and one for generating novel evidence working synergistically would improve availability of pertinent evidence for patient care.

Engineering · 可理解性 · motivation · 可辨認的 · 可約的 ·

2024 年 6 月 29 日

Please do not go: understanding turnover of software engineers from different perspectives

Michelle Larissa Luciano Carvalho,Paulo da Silva Cruz,Eduardo Santana de Almeida,Paulo Anselmo da Mota Silveira Neto,Rafael Prikladnicki

Turnover consists of moving into and out of professional employees in the company in a given period. Such a phenomenon significantly impacts the software industry since it generates knowledge loss, delays in the schedule, and increased costs in the final project. Despite the efforts made by researchers and professionals to minimize the turnover, more studies are needed to understand the motivation that drives Software Engineers to leave their jobs and the main strategies CEOs adopt to retain these professionals in software development companies. In this paper, we contribute a mixed methods study involving semi-structured interviews with Software Engineers and CEOs to obtain a wider opinion of these professionals about turnover and a subsequent validation survey with additional software engineers to check and review the insights from interviews. In studying such aspects, we identified 19 different reasons for software engineers' turnover and 18 more efficient strategies used in the software development industry to reduce it. Our findings provide several implications for industry and academia, which can drive future research.