Background: Cognitive biases in clinical decision-making contribute significantly to diagnostic errors and suboptimal patient outcomes. Addressing these biases is a formidable challenge in medicine. Objective: This study explores the role of large language models (LLMs) in mitigating these biases through a multi-agent framework. We simulate clinical decision-making through multi-agent conversation and evaluate its efficacy in improving diagnostic accuracy. Methods: A total of 16 published and unpublished case reports in which cognitive biases resulted in misdiagnoses were identified from the literature. In the multi-agent framework, we used GPT-4 to facilitate interactions among four simulated agents that replicate clinical team dynamics. Each agent has a distinct role: 1) making the final diagnosis after considering the discussion, 2) acting as devil's advocate to correct confirmation and anchoring bias, 3) tutoring and facilitating the discussion to reduce premature closure bias, and 4) recording and summarizing the findings. A total of 80 simulations were evaluated for the accuracy of the initial diagnosis, the top differential diagnosis, and the final two differential diagnoses. Results: Across the 80 responses, the initial diagnosis had an accuracy of 0% (0/80). Following multi-agent discussion, accuracy rose to 71.3% (57/80) for the top differential diagnosis and to 80.0% (64/80) for the final two differential diagnoses. Conclusions: The framework demonstrated an ability to re-evaluate and correct misconceptions, even in scenarios with misleading initial investigations. The LLM-driven multi-agent conversation framework shows promise for enhancing diagnostic accuracy in diagnostically challenging medical scenarios.
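The percentages in the Results follow directly from the stated counts. The short check below is only an illustration, not part of the study; the helper name `accuracy_percent` is hypothetical. It rounds half up with the `decimal` module, because 57/80 is exactly 71.25% and Python's built-in `round` would round that tie to 71.2 rather than the reported 71.3.

```python
from decimal import Decimal, ROUND_HALF_UP


def accuracy_percent(correct: int, total: int) -> Decimal:
    """Return correct/total as a percentage rounded half-up to one decimal place."""
    pct = Decimal(100 * correct) / Decimal(total)
    return pct.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Counts taken from the abstract: (correct, total) for each evaluated outcome.
reported = {
    "initial diagnosis": (0, 80),
    "top differential diagnosis": (57, 80),
    "final two differential diagnoses": (64, 80),
}

for name, (correct, total) in reported.items():
    print(f"{name}: {accuracy_percent(correct, total)}% ({correct}/{total})")
```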

Related content

Model evaluation

Evaluation criteria for machine learning system design

Controller · Automator · Integration · INTERACT · TOOLS

June 25, 2024

From Text to Test: AI-Generated Control Software for Materials Science Instruments

Davi M Fébba,Kingsley Egbo,William A. Callahan,Andriy Zakutayev

Large language models (LLMs) are transforming the landscape of chemistry and materials science. Recent examples of LLM-accelerated experimental research include virtual assistants for parsing synthesis recipes from the literature, or using the extracted knowledge to guide synthesis and characterization. Despite these advancements, their application is constrained to labs with automated instruments and control software, leaving much of materials science reliant on manual processes. Here, we demonstrate the rapid deployment of a Python-based control module for a Keithley 2400 electrical source measure unit using ChatGPT-4. Through iterative refinement, we achieved effective instrument management with minimal human intervention. Additionally, a user-friendly graphical user interface (GUI) was created, effectively linking all instrument controls to interactive screen elements. Finally, we integrated this AI-crafted instrument control software with a high-performance stochastic optimization algorithm to facilitate rapid and automated extraction of electronic device parameters related to semiconductor charge transport mechanisms from current-voltage (IV) measurement data. This integration resulted in a comprehensive open-source toolkit for semiconductor device characterization and analysis using IV curve measurements. We demonstrate the application of these tools by acquiring, analyzing, and parameterizing IV data from a Pt/Cr$_2$O$_3$:Mg/$\beta$-Ga$_2$O$_3$ heterojunction diode, a novel stack for high-power and high-temperature electronic devices. This approach underscores the powerful synergy between LLMs and the development of instruments for scientific inquiry, showcasing a path for further acceleration in materials science.

Image segmentation · MoDELS · Ensemble · 3D · Automator

June 24, 2024

QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge

Hongwei Bran Li,Fernando Navarro,Ivan Ezhov,Amirhossein Bayat,Dhritiman Das,Florian Kofler,Suprosanna Shit,Diana Waldmannstetter,Johannes C. Paetzold,Xiaobin Hu,Benedikt Wiestler,Lucas Zimmer,Tamaz Amiranashvili,Chinmay Prabhakar,Christoph Berger,Jonas Weidner,Michelle Alonso-Basant,Arif Rashid,Ujjwal Baid,Wesam Adel,Deniz Ali,Bhakti Baheti,Yingbin Bai,Ishaan Bhatt,Sabri Can Cetindag,Wenting Chen,Li Cheng,Prasad Dutand,Lara Dular,Mustafa A. Elattar,Ming Feng,Shengbo Gao,Henkjan Huisman,Weifeng Hu,Shubham Innani,Wei Jiat,Davood Karimi,Hugo J. Kuijf,Jin Tae Kwak,Hoang Long Le,Xiang Lia,Huiyan Lin,Tongliang Liu,Jun Ma,Kai Ma,Ting Ma,Ilkay Oksuz,Robbie Holland,Arlindo L. Oliveira,Jimut Bahan Pal,Xuan Pei,Maoying Qiao,Anindo Saha,Raghavendra Selvan,Linlin Shen,Joao Lourenco Silva,Ziga Spiclin,Sanjay Talbar,Dadong Wang,Wei Wang,Xiong Wang,Yin Wang,Ruiling Xia,Kele Xu,Yanwu Yan,Mert Yergin,Shuang Yu,Lingxi Zeng,YingLin Zhang,Jiachen Zhao,Yefeng Zheng,Martin Zukovec,Richard Do,Anton Becker,Amber Simpson,Ender Konukoglu,Andras Jakab,Spyridon Bakas,Leo Joskowicz,Bjoern Menze

from arxiv, initial technical report

Uncertainty in medical image segmentation, especially inter-rater variability arising from differences in how experts interpret and annotate images, presents a significant challenge to consistent and reliable segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the development and evaluation of automated segmentation algorithms. Accurately modeling and quantifying this variability is essential for enhancing the robustness and clinical applicability of these algorithms. We report the set-up and summarize the benchmark results of the Quantification of Uncertainties in Biomedical Image Quantification Challenge (QUBIQ), which was organized in conjunction with the International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) in 2020 and 2021. The challenge focuses on uncertainty quantification for medical image segmentation, taking into account the omnipresence of inter-rater variability in imaging datasets. The large collection of images with multi-rater annotations covers various modalities, such as MRI and CT; various organs, such as the brain, prostate, kidney, and pancreas; and both 2D and 3D images. A total of 24 teams submitted solutions to the problem, combining various baseline models, Bayesian neural networks, and ensemble techniques. The results indicate the importance of ensemble models, as well as the need for further research into efficient methods for uncertainty quantification in 3D segmentation tasks.
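To make the multi-rater setting concrete, the sketch below shows one plausible way to compare a predicted probability map against several expert annotations: average the rater masks into a soft consensus, binarize both maps at a series of thresholds, and average the Dice scores. The abstract does not state the challenge's scoring rule, so the function names, the flattened-list mask format, and the threshold grid are all assumptions made for illustration.

```python
def soft_consensus(rater_masks):
    """Per-pixel fraction of raters that marked the pixel as foreground."""
    n_raters = len(rater_masks)
    return [sum(pixel_votes) / n_raters for pixel_votes in zip(*rater_masks)]


def dice(a, b):
    """Dice overlap of two binary masks given as flat sequences of booleans."""
    intersection = sum(x and y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def thresholded_dice(pred_prob, rater_masks,
                     thresholds=(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)):
    """Average Dice between prediction and rater consensus over several thresholds."""
    target = soft_consensus(rater_masks)
    scores = []
    for t in thresholds:
        pred_bin = [p >= t for p in pred_prob]
        target_bin = [q >= t for q in target]
        scores.append(dice(pred_bin, target_bin))
    return sum(scores) / len(scores)


# Toy example: three raters annotate a flattened 6-pixel image (made-up data).
raters = [
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 0],
]
prediction = [0.0, 0.5, 0.9, 1.0, 0.2, 0.0]
print(thresholded_dice(prediction, raters))
```

A prediction that reproduces the rater disagreement (intermediate probabilities where experts differ) scores higher under this kind of measure than an overconfident binary mask.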

Integration · Facebook AI Research · Automator · Processing (programming language) · Boosting (a method for speeding up model training)

June 21, 2024

Full-Scale Indexing and Semantic Annotation of CT Imaging: Boosting FAIRness

Hannes Ulrich,Robin Hendel,Santiago Pazmino,Björn Bergh,Björn Schreiweis

Background: The integration of artificial intelligence into medicine has led to significant advances, particularly in diagnostics and treatment planning. However, the reliability of AI models is highly dependent on the quality of the training data, especially in medical imaging, where varying patient data and evolving medical knowledge pose a challenge to the accuracy and generalizability of given datasets. Results: The proposed approach focuses on the integration and enhancement of clinical computed tomography (CT) image series for better findability, accessibility, interoperability, and reusability. Through an automated indexing process, CT image series are semantically enhanced using the TotalSegmentator framework for segmentation and resulting SNOMED CT annotations. The metadata is standardized with HL7 FHIR resources to enable efficient data recognition and data exchange between research projects. Conclusions: The study successfully integrates a robust process within the UKSH MeDIC, leading to the semantic enrichment of over 230,000 CT image series and over 8 million SNOMED CT annotations. The standardized representation using HL7 FHIR resources improves discoverability and facilitates interoperability, providing a foundation for the FAIRness of medical imaging data. However, developing automated annotation methods that can keep pace with growing clinical datasets remains a challenge to ensure continued progress in large-scale integration and indexing of medical imaging for advanced healthcare AI applications.

Visual question answering · Automated question answering · Analysis · motivation · HTTPS

June 21, 2024

Tri-VQA: Triangular Reasoning Medical Visual Question Answering for Multi-Attribute Analysis

Lin Fan,Xun Gong,Cenyang Zheng,Yafei Ou

The intersection of medical Visual Question Answering (Med-VQA) is a challenging research topic with advantages including patient engagement and clinical expert involvement for second opinions. However, existing Med-VQA methods based on joint embedding fail to explain whether their provided results are based on correct reasoning or coincidental answers, which undermines the credibility of VQA answers. In this paper, we investigate the construction of a more cohesive and stable Med-VQA structure. Motivated by causal effect, we propose a novel Triangular Reasoning VQA (Tri-VQA) framework, which constructs reverse causal questions from the perspective of "Why this answer?" to elucidate the source of the answer and stimulate more reasonable forward reasoning processes. We evaluate our method on the Endoscopic Ultrasound (EUS) multi-attribute annotated dataset from five centers, and test it on medical VQA datasets. Experimental results demonstrate the superiority of our approach over existing methods. Our codes and pre-trained models are available at //anonymous.4open.science/r/Tri_VQA.

Magnetorheological materials · Analysis · Latent · INFORMS · Performer

June 21, 2024

A Unified Framework for Synthesizing Multisequence Brain MRI via Hybrid Fusion

Jihoon Cho,Jonghye Woo,Jinah Park

from arxiv, 11 pages, 7 figures

Multisequence Magnetic Resonance Imaging (MRI) provides a reliable diagnosis in clinical applications through complementary information within sequences. However, in practice, the absence of certain MR sequences is a common problem that can lead to inconsistent analysis results. In this work, we propose a novel unified framework for synthesizing multisequence MR images, called Hybrid Fusion GAN (HF-GAN). We introduce a hybrid fusion encoder designed to ensure the disentangled extraction of complementary and modality-specific information, along with a channel attention-based feature fusion module that integrates the features into a common latent space handling the complexity from combinations of accessible MR sequences. Common feature representations are transformed into a target latent space via the modality infuser to synthesize missing MR sequences. We have performed experiments on multisequence brain MRI datasets from healthy individuals and patients diagnosed with brain tumors. Experimental results show that our method outperforms state-of-the-art methods in both quantitative and qualitative comparisons. In addition, a detailed analysis of our framework demonstrates the superiority of our designed modules and their effectiveness for use in data imputation tasks.

Processing (programming language) · Performer · MoDELS · Cognition · Large language models

June 20, 2024

ArgMed-Agents: Explainable Clinical Decision Reasoning with LLM Discussion via Argumentation Schemes

Shengxin Hong,Liang Xiao,Xin Zhang,Jianxia Chen

There are two main barriers to using large language models (LLMs) in clinical reasoning. First, while LLMs show significant promise in Natural Language Processing (NLP) tasks, their performance in complex reasoning and planning falls short of expectations. Second, LLMs make clinical decisions through uninterpretable methods that are fundamentally different from the clinician's cognitive processes, which leads to user distrust. In this paper, we present a multi-agent framework called ArgMed-Agents, which aims to enable LLM-based agents to perform explainable clinical decision reasoning through interaction. ArgMed-Agents performs self-argumentation iterations via the Argumentation Scheme for Clinical Discussion (a reasoning mechanism for modeling cognitive processes in clinical reasoning), and then constructs the argumentation process as a directed graph representing conflicting relationships. Finally, a symbolic solver identifies a set of rational and coherent arguments that support the decision. We construct a formal model of ArgMed-Agents and present conjectures for theoretical guarantees. ArgMed-Agents enables LLMs to mimic the process of clinical argumentative reasoning by generating explanations of their reasoning in a self-directed manner. Our experiments show that ArgMed-Agents not only improves accuracy on complex clinical decision reasoning problems compared to other prompting methods but, more importantly, provides users with decision explanations that increase their confidence.
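The step from a directed graph of conflicting arguments to a coherent set of accepted arguments can be illustrated with a standard construction from abstract argumentation: the grounded extension of a Dung-style attack graph. The abstract does not say which semantics or solver the authors use, so this is a minimal sketch of one common choice, with a made-up clinical example and hypothetical argument names.

```python
def grounded_extension(arguments, attacks):
    """Return the grounded extension of an attack graph.

    arguments: iterable of hashable argument identifiers.
    attacks: iterable of (attacker, target) pairs.
    The result is the least fixed point of the "defended by" operator.
    """
    arguments = set(arguments)
    attackers = {a: set() for a in arguments}
    for attacker, target in attacks:
        attackers[target].add(attacker)

    accepted = set()
    while True:
        # An argument is defended if every one of its attackers is itself
        # attacked by some already accepted argument.
        defended = {
            a for a in arguments
            if all(attackers[b] & accepted for b in attackers[a])
        }
        if defended == accepted:
            return accepted
        accepted = defended


# Hypothetical example: "prescribe" is attacked by a suspected contraindication,
# which is in turn attacked by a lab result that rules the contraindication out.
args = {"prescribe", "contraindication", "lab_rules_out"}
atts = {("contraindication", "prescribe"), ("lab_rules_out", "contraindication")}
print(sorted(grounded_extension(args, atts)))
```

The grounded extension is deliberately skeptical: when two arguments attack each other and nothing else breaks the tie, neither is accepted.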

Biased · Cognition · Analysis · AI · Chatbot

June 19, 2024

The Efficacy of Conversational Artificial Intelligence in Rectifying the Theory of Mind and Autonomy Biases: Comparative Analysis

Marcin Rządeczka,Anna Sterna,Julia Stolińska,Paulina Kaczyńska,Marcin Moskalewicz

from arxiv, 28 pages, 5 tables, 6 figures

The study evaluates the efficacy of Conversational Artificial Intelligence (CAI) in rectifying cognitive biases and recognizing affect in human-AI interactions, which is crucial for digital mental health interventions. Cognitive biases (systematic deviations from normative thinking) affect mental health, intensifying conditions like depression and anxiety. Therapeutic chatbots can make cognitive-behavioral therapy (CBT) more accessible and affordable, offering scalable and immediate support. The research employs a structured methodology with clinical-based virtual case scenarios simulating typical user-bot interactions. Performance and affect recognition were assessed across two categories of cognitive biases: theory of mind biases (anthropomorphization of AI, overtrust in AI, attribution to AI) and autonomy biases (illusion of control, fundamental attribution error, just-world hypothesis). A qualitative feedback mechanism was used with an ordinal scale to quantify responses based on accuracy, therapeutic quality, and adherence to CBT principles. Therapeutic bots (Wysa, Youper) and general-use LLMs (GPT-3.5, GPT-4, Gemini Pro) were evaluated through scripted interactions, double-reviewed by cognitive scientists and a clinical psychologist. Statistical analysis showed that therapeutic bots were consistently outperformed by non-therapeutic bots in bias rectification, and in 4 out of 6 biases in affect recognition. The data suggest that non-therapeutic chatbots are more effective in addressing some cognitive biases.

Performer · Analysis · Medical Image Analysis · MoDELS · Learning

June 17, 2024

Slicing Through Bias: Explaining Performance Gaps in Medical Image Analysis using Slice Discovery Methods

Vincent Olesen,Nina Weng,Aasa Feragen,Eike Petersen

Machine learning models have achieved high overall accuracy in medical image analysis. However, performance disparities on specific patient groups pose challenges to their clinical utility, safety, and fairness. This can affect known patient groups - such as those based on sex, age, or disease subtype - as well as previously unknown and unlabeled groups. Furthermore, the root cause of such observed performance disparities is often challenging to uncover, hindering mitigation efforts. In this paper, to address these issues, we leverage Slice Discovery Methods (SDMs) to identify interpretable underperforming subsets of data and formulate hypotheses regarding the cause of observed performance disparities. We introduce a novel SDM and apply it in a case study on the classification of pneumothorax and atelectasis from chest x-rays. Our study demonstrates the effectiveness of SDMs in hypothesis formulation and yields an explanation of previously observed but unexplained performance disparities between male and female patients in widely used chest X-ray datasets and models. Our findings indicate shortcut learning in both classification tasks, through the presence of chest drains and ECG wires, respectively. Sex-based differences in the prevalence of these shortcut features appear to cause the observed classification performance gap, representing a previously underappreciated interaction between shortcut learning and model fairness analyses.
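The idea that a sex-based performance gap can be explained by a shortcut feature with different prevalence in each group can be shown with a small stratified accuracy table. This is not the authors' slice discovery method, only a hand-rolled illustration: the function `accuracy_by_slice`, the record fields, and all numbers are invented toy data, and the direction of the prevalence difference is arbitrary.

```python
from collections import defaultdict


def accuracy_by_slice(records, keys):
    """Accuracy of `pred` against `label`, grouped by the values of `keys`."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for record in records:
        slice_id = tuple(record[k] for k in keys)
        counts[slice_id] += 1
        hits[slice_id] += int(record["pred"] == record["label"])
    return {s: hits[s] / counts[s] for s in counts}


def make(sex, drain, n_correct, n_wrong):
    """Build toy positive cases with a given number of right and wrong predictions."""
    right = [{"sex": sex, "drain": drain, "label": 1, "pred": 1}] * n_correct
    wrong = [{"sex": sex, "drain": drain, "label": 1, "pred": 0}] * n_wrong
    return right + wrong


# Invented data: the model is perfect when a chest drain is visible and at
# chance level otherwise; drains are more common in one group than the other.
toy = (
    make("M", True, 8, 0) + make("M", False, 1, 1)
    + make("F", True, 2, 0) + make("F", False, 4, 4)
)

print(accuracy_by_slice(toy, ["sex"]))           # gap between the two groups
print(accuracy_by_slice(toy, ["sex", "drain"]))  # gap vanishes within each drain stratum
```

In this toy setting the groups differ overall, yet accuracy is identical once the shortcut feature is held fixed, which is the pattern that would point to shortcut learning rather than sex itself as the cause.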

Trial · Approximation · Reducible · SimPLe · FAST

June 17, 2024

Predictive Probabilities Made Simple: A Fast and Accurate Method for Clinical Trial Decision Making

Joe Marion,Liz Lorenzi,Cora Allen-Savietta,Scott Berry,Kert Viele

from arxiv, under review

Bayesian predictive probabilities are commonly used for interim monitoring of clinical trials through efficacy and futility stopping rules. Despite their usefulness, calculation of predictive probabilities, particularly in pre-experiment trial simulation, can be a significant challenge. We introduce an approximation for computing predictive probabilities using either a p-value or a posterior probability that significantly reduces this burden. We show the approximation has a high degree of concordance with standard Monte Carlo imputation methods for computing predictive probabilities, and present five simulation studies comparing the approximation to the full predictive probability for a range of primary analysis strategies: dichotomous, time-to-event, and ordinal endpoints, as well as historical borrowing and longitudinal modeling. We find that this faster method of predictive probability approximation works well in all five applications, thus significantly reducing the computational burden of trial simulation, allowing more virtual trials to be simulated to achieve greater precision in estimating trial operating characteristics.
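For readers unfamiliar with predictive probabilities, the sketch below shows the standard Monte Carlo imputation baseline for the simplest case, a single-arm dichotomous endpoint. It does not implement the paper's approximation, which the abstract does not spell out. The Beta(1, 1) prior, the success rule (a minimum number of responders at the final analysis), and all function names are assumptions chosen for illustration; an exact beta-binomial calculation is included as a cross-check.

```python
import math
import random


def beta_binomial_pmf(k, m, a, b):
    """P(K = k) for K ~ BetaBinomial(m, a, b)."""
    def log_beta(u, v):
        return math.lgamma(u) + math.lgamma(v) - math.lgamma(u + v)
    return math.comb(m, k) * math.exp(log_beta(k + a, m - k + b) - log_beta(a, b))


def predictive_probability_exact(x, n, n_final, min_successes, a=1.0, b=1.0):
    """Exact probability that the final responder count reaches min_successes."""
    m = n_final - n                      # patients still to be enrolled
    needed = max(min_successes - x, 0)   # future responders required
    post_a, post_b = a + x, b + n - x    # Beta posterior after the interim data
    return sum(beta_binomial_pmf(k, m, post_a, post_b) for k in range(needed, m + 1))


def predictive_probability_mc(x, n, n_final, min_successes,
                              a=1.0, b=1.0, draws=20000, seed=0):
    """Monte Carlo imputation: simulate the remaining patients many times."""
    rng = random.Random(seed)
    m = n_final - n
    wins = 0
    for _ in range(draws):
        p = rng.betavariate(a + x, b + n - x)             # draw a response rate
        future = sum(rng.random() < p for _ in range(m))  # impute future outcomes
        if x + future >= min_successes:
            wins += 1
    return wins / draws


# Interim look: 12 responders among 20 patients, 40 planned, 24 needed to succeed.
print(predictive_probability_exact(12, 20, 40, 24))
print(predictive_probability_mc(12, 20, 40, 24))
```

Even in this tiny example the simulation needs tens of thousands of draws per interim look, and a trial simulation repeats that for every look in every virtual trial, which is the computational burden a fast approximation is meant to remove.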

Cognition · Performer · Agent · Knowledge · MoDELS

July 14, 2023

Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration

Zhenhailong Wang,Shaoguang Mao,Wenshan Wu,Tao Ge,Furu Wei,Heng Ji

from arxiv, work in progress

Human intelligence thrives on the concept of cognitive synergy, where collaboration and information integration among different cognitive processes yield superior outcomes compared to individual cognitive processes in isolation. Although Large Language Models (LLMs) have demonstrated promising performance as general task-solving agents, they still struggle with tasks that require intensive domain knowledge and complex reasoning. In this work, we propose Solo Performance Prompting (SPP), which transforms a single LLM into a cognitive synergist by engaging in multi-turn self-collaboration with multiple personas. A cognitive synergist refers to an intelligent agent that collaborates with multiple minds, combining their individual strengths and knowledge, to enhance problem-solving and overall performance in complex tasks. By dynamically identifying and simulating different personas based on task inputs, SPP unleashes the potential of cognitive synergy in LLMs. We have discovered that assigning multiple, fine-grained personas in LLMs elicits better problem-solving abilities compared to using a single or fixed number of personas. We evaluate SPP on three challenging tasks: Trivia Creative Writing, Codenames Collaborative, and Logic Grid Puzzle, encompassing both knowledge-intensive and reasoning-intensive types. Unlike previous works, such as Chain-of-Thought, that solely enhance the reasoning abilities in LLMs, SPP effectively elicits internal knowledge acquisition abilities, reduces hallucination, and maintains strong reasoning capabilities. Code, data, and prompts can be found at: //github.com/MikeWangWZHL/Solo-Performance-Prompting.git.