Large language models like GPT-3.5-turbo and GPT-4 hold promise for healthcare professionals, but they may inadvertently inherit biases during their training, potentially affecting their utility in medical applications. Despite a few past attempts to study them, the precise impact and extent of these biases remain uncertain. Through both qualitative and quantitative analyses, we find that these models tend to project higher costs and longer hospitalizations for White populations and exhibit optimistic views in challenging medical scenarios, predicting much higher survival rates. These biases, which mirror real-world healthcare disparities, are evident in the generation of patient backgrounds, the association of specific diseases with certain races, and disparities in treatment recommendations. Our findings underscore the critical need for future research to address and mitigate biases in language models, especially in critical healthcare applications, to ensure fair and accurate outcomes for all patients.

Related Content

Biased

Estimation/Estimator · Performer · Performance · Better · Variance

March 6, 2024

Using Causal Trees to Estimate Personalized Task Difficulty in Post-Stroke Individuals

Nathaniel Dennler, Stefanos Nikolaidis, Maja Matarić

from arxiv, Accepted to the 2023 IROS Workshop on Assistive Robots for Citizens

Adaptive training programs are crucial for recovery post stroke. However, developing programs that automatically adapt depends on quantifying how difficult a task is for a specific individual at a particular stage of their recovery. In this work, we propose a method that automatically generates regions of different task difficulty levels based on an individual's performance. We show that this technique explains the variance in user performance for a reaching task better than previous approaches to estimating task difficulty.

MICRO · Performer · Vectorization · Feasible · Knowledge

March 6, 2024

Exploring Jamming and Hijacking Attacks for Micro Aerial Drones

Yassine Mekdad,Abbas Acar,Ahmet Aris,Abdeslam El Fergougui,Mauro Conti,Riccardo Lazzeretti,Selcuk Uluagac

from arxiv, Accepted at IEEE International Conference on Communications (ICC) 2024

Recent advancements in drone technology have shown that commercial off-the-shelf Micro Aerial Drones are more effective than large-sized drones for performing flight missions in narrow environments, such as swarming, indoor navigation, and inspection of hazardous locations. Due to their deployments in many civilian and military applications, safe and reliable communication of these drones throughout the mission is critical. The Crazyflie ecosystem is one of the most popular Micro Aerial Drones and has the potential to be deployed worldwide. In this paper, we empirically investigate two interference attacks against the Crazy Real Time Protocol (CRTP) implemented within the Crazyflie drones. In particular, we explore the feasibility of experimenting with two attack vectors that can disrupt an ongoing flight mission: the jamming attack and the hijacking attack. Our experimental results demonstrate the effectiveness of such attacks in both autonomous and non-autonomous flight modes on a Crazyflie 2.1 drone. Finally, we suggest potential shielding strategies that guarantee a safe and secure flight mission. To the best of our knowledge, this is the first work investigating jamming and hijacking attacks against Micro Aerial Drones, both in autonomous and non-autonomous modes.

MoDELS · Language Modeling · Large Language Models · BASIC · Question Answering

March 6, 2024

Evaluating the Elementary Multilingual Capabilities of Large Language Models with MultiQ

Carolin Holtermann, Paul Röttger, Timm Dill, Anne Lauscher

Large language models (LLMs) need to serve everyone, including a global majority of non-English speakers. However, most LLMs today, and open LLMs in particular, are often intended for use in just English (e.g. Llama2, Mistral) or a small handful of high-resource languages (e.g. Mixtral, Qwen). Recent research shows that, despite limits in their intended use, people prompt LLMs in many different languages. Therefore, in this paper, we investigate the basic multilingual capabilities of state-of-the-art open LLMs beyond their intended use. For this purpose, we introduce MultiQ, a new silver standard benchmark for basic open-ended question answering with 27.4k test questions across a typologically diverse set of 137 languages. With MultiQ, we evaluate language fidelity, i.e., whether models respond in the prompted language, and question answering accuracy. All LLMs we test respond faithfully and/or accurately for at least some languages beyond their intended use. Most models are more accurate when they respond faithfully. However, differences across models are large, and there is a long tail of languages where models are neither accurate nor faithful. We explore differences in tokenization as a potential explanation for our findings, identifying possible correlations that warrant further investigation.
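The two quantities named in the abstract above, language fidelity and answer accuracy, can be illustrated with a minimal sketch. This is not the MultiQ evaluation code: the `Response` record, its field names, and the idea that a language detector and a correctness judge have already filled in the fields are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Response:
    """Hypothetical record for one model answer (not the MultiQ schema)."""
    prompt_lang: str    # language the question was asked in
    response_lang: str  # language detected in the answer (detector assumed)
    correct: bool       # whether the answer was judged correct (judge assumed)


def fidelity(responses: List[Response]) -> float:
    """Fraction of answers written in the prompted language."""
    if not responses:
        return 0.0
    faithful = sum(r.prompt_lang == r.response_lang for r in responses)
    return faithful / len(responses)


def accuracy(responses: List[Response]) -> float:
    """Fraction of answers judged correct."""
    if not responses:
        return 0.0
    return sum(r.correct for r in responses) / len(responses)


def accuracy_by_fidelity(responses: List[Response]) -> Tuple[float, float]:
    """Accuracy on faithful answers and on unfaithful answers, in that order."""
    faithful = [r for r in responses if r.prompt_lang == r.response_lang]
    unfaithful = [r for r in responses if r.prompt_lang != r.response_lang]
    return accuracy(faithful), accuracy(unfaithful)
```

Splitting accuracy by fidelity, as in `accuracy_by_fidelity`, is one simple way to check a claim of the form "models are more accurate when they respond faithfully" on one's own data.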

Language Modeling · Large Language Models · MoDELS · Processing (programming language) · INTERACT

March 6, 2024

Towards Efficient and Effective Unlearning of Large Language Models for Recommendation

Hangyu Wang,Jianghao Lin,Bo Chen,Yang Yang,Ruiming Tang,Weinan Zhang,Yong Yu

from arxiv, 12 pages

The significant advancements in large language models (LLMs) give rise to a promising research direction, i.e., leveraging LLMs as recommenders (LLMRec). The efficacy of LLMRec arises from the open-world knowledge and reasoning capabilities inherent in LLMs. LLMRec acquires the recommendation capabilities through instruction tuning based on user interaction data. However, in order to protect user privacy and optimize utility, it is also crucial for LLMRec to intentionally forget specific user data, which is generally referred to as recommendation unlearning. In the era of LLMs, recommendation unlearning poses new challenges for LLMRec in terms of inefficiency and ineffectiveness. Existing unlearning methods require updating billions of parameters in LLMRec, which is costly and time-consuming. Besides, they always impact the model utility during the unlearning process. To this end, we propose E2URec, the first Efficient and Effective Unlearning method for LLMRec. Our proposed E2URec enhances the unlearning efficiency by updating only a few additional LoRA parameters, and improves the unlearning effectiveness by employing a teacher-student framework, where we maintain multiple teacher networks to guide the unlearning process. Extensive experiments show that E2URec outperforms state-of-the-art baselines on two real-world datasets. Specifically, E2URec can efficiently forget specific data without affecting recommendation performance. The source code is at https://github.com/justarter/E2URec.
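To make the efficiency argument above concrete, the following sketch compares the number of trainable parameters in a full update of one weight matrix with a rank-r LoRA update of the same matrix. It only illustrates the general LoRA parameter count; the 4096 x 4096 shape and rank 8 are hypothetical values chosen for the example, not settings reported for E2URec.

```python
def full_update_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_update_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for a LoRA update B @ A,
    with B of shape (d_out, rank) and A of shape (rank, d_in)."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    # Hypothetical layer shape and rank, for illustration only.
    d_out, d_in, rank = 4096, 4096, 8
    full = full_update_params(d_out, d_in)
    lora = lora_update_params(d_out, d_in, rank)
    print(f"full update: {full} parameters")
    print(f"LoRA update: {lora} parameters ({lora / full:.4%} of full)")
```

Under these assumed sizes the low-rank update trains well under one percent of the parameters of the full matrix, which is the kind of saving the abstract refers to when it contrasts LoRA with updating billions of parameters.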

Continuity · Learning · INTERACT · Round · Performer

March 6, 2024

Interactive Continual Learning Architecture for Long-Term Personalization of Home Service Robots

Ali Ayub,Chrystopher Nehaniv,Kerstin Dautenhahn

from arxiv, Accepted at the IEEE International Conference on Robotics and Automation (ICRA), 2024

For robots to perform assistive tasks in unstructured home environments, they must learn and reason on the semantic knowledge of the environments. Despite a resurgence in the development of semantic reasoning architectures, these methods assume that all the training data is available a priori. However, each user's environment is unique and can continue to change over time, which makes these methods unsuitable for personalized home service robots. Although research in continual learning develops methods that can learn and adapt over time, most of these methods are tested in the narrow context of object classification on static image datasets. In this paper, we combine ideas from continual learning, semantic reasoning, and interactive machine learning literature and develop a novel interactive continual learning architecture for continual learning of semantic knowledge in a home environment through human-robot interaction. The architecture builds on core cognitive principles of learning and memory for efficient and real-time learning of new knowledge from humans. We integrate our architecture with a physical mobile manipulator robot and perform extensive system evaluations in a laboratory environment over two months. Our results demonstrate the effectiveness of our architecture to allow a physical robot to continually adapt to the changes in the environment from limited data provided by the users (experimenters), and use the learned knowledge to perform object fetching tasks.

Task-Oriented Dialogue Systems · Comprehensibility · ChatGPT · Analysis · Topic

March 5, 2024

Uncovering the Potential of ChatGPT for Discourse Analysis in Dialogue: An Empirical Study

Yaxin Fan,Feng Jiang,Peifeng Li,Haizhou Li

from arxiv, Accepted by LREC-COLING'2024

Large language models, like ChatGPT, have shown remarkable capability in many downstream tasks, yet their ability to understand discourse structures of dialogues, which requires higher-level capabilities of understanding and reasoning, remains less explored. In this paper, we aim to systematically inspect ChatGPT's performance in two discourse analysis tasks: topic segmentation and discourse parsing, focusing on its deep semantic understanding of linear and hierarchical discourse structures underlying dialogue. To instruct ChatGPT to complete these tasks, we initially craft a prompt template consisting of the task description, output format, and structured input. Then, we conduct experiments on four popular topic segmentation datasets and two discourse parsing datasets. The experimental results showcase that ChatGPT demonstrates proficiency in identifying topic structures in general-domain conversations yet struggles considerably in specific-domain conversations. We also found that ChatGPT hardly understands rhetorical structures that are more complex than topic structures. Our deeper investigation indicates that ChatGPT can give more reasonable topic structures than human annotations but only linearly parses the hierarchical rhetorical structures. In addition, we delve into the impact of in-context learning (e.g., chain-of-thought) on ChatGPT and conduct the ablation study on various prompt components, which can provide a research foundation for future work. The code is available at https://github.com/yxfanSuda/GPTforDDA.

VR · Virtual Reality (VR) · AIM · motivation · Integration

March 4, 2024

Using Virtual Reality for Detection and Intervention of Depression -- A Systematic Literature Review

Mohammad Waqas,Y Pawankumar Gururaj,V D Shanmukha Mitra,Sai Anirudh Karri,Raghu Reddy,Syed Azeemuddin

from arxiv, 8 pages, 2 figures, 3 tables, Conference full paper

The use of emerging technologies like Virtual Reality (VR) in therapeutic settings has increased in the past few years. By incorporating VR, a mental health condition like depression can be assessed effectively, while also providing personalized motivation and meaningful engagement for treatment purposes. The integration of external sensors further enhances the engagement of the subjects with the VR scenes. This paper presents a comprehensive review of existing literature on the detection and treatment of depression using VR. It explores various types of VR scenes, external hardware, innovative metrics, and targeted user studies conducted by researchers and professionals in the field. The paper also discusses potential requirements for designing VR scenes specifically tailored for depression assessment and treatment, with the aim of guiding future practitioners in this area.

Multimodal · Language Modeling · MoDELS · CLIP · Identifiable

March 1, 2024

Mass-Producing Failures of Multimodal Systems with Language Models

Shengbang Tong,Erik Jones,Jacob Steinhardt

from arxiv, Under Review

Deployed multimodal systems can fail in ways that evaluators did not anticipate. In order to find these failures before deployment, we introduce MultiMon, a system that automatically identifies systematic failures -- generalizable, natural-language descriptions of patterns of model failures. To uncover systematic failures, MultiMon scrapes a corpus for examples of erroneous agreement: inputs that produce the same output, but should not. It then prompts a language model (e.g., GPT-4) to find systematic patterns of failure and describe them in natural language. We use MultiMon to find 14 systematic failures (e.g., "ignores quantifiers") of the CLIP text-encoder, each comprising hundreds of distinct inputs (e.g., "a shelf with a few/many books"). Because CLIP is the backbone for most state-of-the-art multimodal systems, these inputs produce failures in Midjourney 5.1, DALL-E, VideoFusion, and others. MultiMon can also steer towards failures relevant to specific use cases, such as self-driving cars. We see MultiMon as a step towards evaluation that autonomously explores the long tail of potential system failures. Code for MultiMon is available at https://github.com/tsb0601/MultiMon.
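The notion of erroneous agreement described above can be sketched as a pairwise scan: flag two inputs when the encoder under test embeds them almost identically while a reference signal says they differ. This is a minimal illustration and not the MultiMon implementation; using a second set of reference embeddings as the "should not agree" signal, the thresholds `high` and `low`, and the toy vectors in the usage example are all assumptions.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def erroneous_agreements(
    texts: List[str],
    tested_emb: List[Sequence[float]],     # embeddings from the encoder under test
    reference_emb: List[Sequence[float]],  # embeddings from an assumed reference model
    high: float = 0.95,                    # hypothetical "same output" threshold
    low: float = 0.8,                      # hypothetical "should differ" threshold
) -> List[Tuple[str, str]]:
    """Return text pairs the tested encoder treats as near-identical
    while the reference model treats them as different."""
    flagged = []
    for i, j in combinations(range(len(texts)), 2):
        same_for_tested = cosine(tested_emb[i], tested_emb[j]) >= high
        different_for_reference = cosine(reference_emb[i], reference_emb[j]) <= low
        if same_for_tested and different_for_reference:
            flagged.append((texts[i], texts[j]))
    return flagged


if __name__ == "__main__":
    # Toy vectors, invented for illustration; real embeddings would come from models.
    texts = ["a shelf with a few books", "a shelf with many books", "a dog on a beach"]
    tested = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]
    reference = [[1.0, 0.0], [0.5, 0.85], [0.0, 1.0]]
    for pair in erroneous_agreements(texts, tested, reference):
        print(pair)
```

In a pipeline of the kind the abstract outlines, the flagged pairs would then be passed to a language model that is asked to describe the common pattern in natural language; that step is not shown here.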

Continuity · Learning · Vision · Computer Vision · Batch Learning

September 23, 2021

Recent Advances of Continual Learning in Computer Vision: An Overview

Haoxuan Qu,Hossein Rahmani,Li Xu,Bryan Williams,Jun Liu

from arxiv, 21 pages, 5 figures

In contrast to batch learning where all training data is available at once, continual learning represents a family of methods that accumulate knowledge and learn continuously with data available in sequential order. Similar to the human learning process with the ability of learning, fusing, and accumulating new knowledge coming at different time steps, continual learning is considered to have high practical significance. Hence, continual learning has been studied in various artificial intelligence tasks. In this paper, we present a comprehensive review of the recent progress of continual learning in computer vision. In particular, the works are grouped by their representative techniques, including regularization, knowledge distillation, memory, generative replay, parameter isolation, and a combination of the above techniques. For each category of these techniques, both its characteristics and applications in computer vision are presented. At the end of this overview, several subareas, where continuous knowledge accumulation is potentially helpful while continual learning has not been well studied, are discussed.

Automator · AutoML · Machine Learning · Learning · Reducible

January 17, 2019

Taking Human out of Learning Applications: A Survey on Automated Machine Learning

Quanming Yao,Mengshuo Wang,Yuqiang Chen,Wenyuan Dai,Hu Yi-Qi,Li Yu-Feng,Tu Wei-Wei,Yang Qiang,Yu Yang

from arxiv, This is a preliminary version and will be kept updated

Machine learning techniques have become deeply rooted in our everyday life. However, since it is knowledge- and labor-intensive to pursue good learning performance, human experts are heavily involved in every aspect of machine learning. In order to make machine learning techniques easier to apply and reduce the demand for experienced human experts, automated machine learning (AutoML) has emerged as a hot topic with both industrial and academic interest. In this paper, we provide an up-to-date survey on AutoML. First, we introduce and define the AutoML problem, with inspiration from both realms of automation and machine learning. Then, we propose a general AutoML framework that not only covers most existing approaches to date but also can guide the design for new methods. Subsequently, we categorize and review the existing works from two aspects, i.e., the problem setup and the employed techniques. Finally, we provide a detailed analysis of AutoML approaches and explain the reasons underlying their successful applications. We hope this survey can serve as not only an insightful guideline for AutoML beginners but also an inspiration for future research.