露脸视频一区二区三区在线播放_女女啪啪激烈高潮喷出网站免费_蘑菇视频免费版在线观看版下载_一区二区三区高清在线播放_456欧美视频在线_久久99久久99精品免观看女同_艹B在线观看毛片一区二区三区

Debadutta Dash,Rahul Thapa,Juan M. Banda,Akshay Swaminathan,Morgan Cheatham,Mehr Kashyap,Nikesh Kotecha,Jonathan H. Chen,Saurabh Gombar,Lance Downing,Rachel Pedreira,Ethan Goh,Angel Arnaout,Garret Kenn Morris,Honor Magon,Matthew P Lungren,Eric Horvitz,Nigam H. Shah

from arxiv, 27 pages including supplemental information

Despite growing interest in using large language models (LLMs) in healthcare, current explorations do not assess the real-world utility and safety of LLMs in clinical settings. Our objective was to determine whether two LLMs can serve information needs submitted by physicians as questions to an informatics consultation service in a safe and concordant manner. Sixty six questions from an informatics consult service were submitted to GPT-3.5 and GPT-4 via simple prompts. 12 physicians assessed the LLM responses' possibility of patient harm and concordance with existing reports from an informatics consultation service. Physician assessments were summarized based on majority vote. For no questions did a majority of physicians deem either LLM response as harmful. For GPT-3.5, responses to 8 questions were concordant with the informatics consult report, 20 discordant, and 9 were unable to be assessed. There were 29 responses with no majority on "Agree", "Disagree", and "Unable to assess". For GPT-4, responses to 13 questions were concordant, 15 discordant, and 3 were unable to be assessed. There were 35 responses with no majority. Responses from both LLMs were largely devoid of overt harm, but less than 20% of the responses agreed with an answer from an informatics consultation service, responses contained hallucinated references, and physicians were divided on what constitutes harm. These results suggest that while general purpose LLMs are able to provide safe and credible responses, they often do not meet the specific information need of a given question. A definitive evaluation of the usefulness of LLMs in healthcare settings will likely require additional research on prompt engineering, calibration, and custom-tailoring of general purpose models.

相關內容

INFORMS

關注 10

《計算機信息》雜志發表高質量的論文，擴大了運籌學和計算的范圍，尋求有關理論、方法、實驗、系統和應用方面的原創研究論文、新穎的調查和教程論文，以及描述新的和有用的軟件工具的論文。官網鏈接： · 環 · 模型評估 · 可約的 · Extensibility ·

2023 年 6 月 8 日

Scalable and Adaptive Log-based Anomaly Detection with Expert in the Loop

Jinyang Liu,Junjie Huang,Yintong Huo,Zhihan Jiang,Jiazhen Gu,Zhuangbin Chen,Cong Feng,Minzhi Yan,Michael R. Lyu

System logs play a critical role in maintaining the reliability of software systems. Fruitful studies have explored automatic log-based anomaly detection and achieved notable accuracy on benchmark datasets. However, when applied to large-scale cloud systems, these solutions face limitations due to high resource consumption and lack of adaptability to evolving logs. In this paper, we present an accurate, lightweight, and adaptive log-based anomaly detection framework, referred to as SeaLog. Our method introduces a Trie-based Detection Agent (TDA) that employs a lightweight, dynamically-growing trie structure for real-time anomaly detection. To enhance TDA's accuracy in response to evolving log data, we enable it to receive feedback from experts. Interestingly, our findings suggest that contemporary large language models, such as ChatGPT, can provide feedback with a level of consistency comparable to human experts, which can potentially reduce manual verification efforts. We extensively evaluate SeaLog on two public datasets and an industrial dataset. The results show that SeaLog outperforms all baseline methods in terms of effectiveness, runs 2X to 10X faster and only consumes 5% to 41% of the memory resource.

HTTPS · 分解的 · 機器人 · 回合 · 可行 ·

2023 年 6 月 8 日

Zero-Shot Transfer of Haptics-Based Object Insertion Policies

Samarth Brahmbhatt,Ankur Deka,Andrew Spielberg,Matthias Müller

from arxiv, Accepted for publication at 2023 IEEE International Conference on Robotics and Automation (ICRA)

Humans naturally exploit haptic feedback during contact-rich tasks like loading a dishwasher or stocking a bookshelf. Current robotic systems focus on avoiding unexpected contact, often relying on strategically placed environment sensors. Recently, contact-exploiting manipulation policies have been trained in simulation and deployed on real robots. However, they require some form of real-world adaptation to bridge the sim-to-real gap, which might not be feasible in all scenarios. In this paper we train a contact-exploiting manipulation policy in simulation for the contact-rich household task of loading plates into a slotted holder, which transfers without any fine-tuning to the real robot. We investigate various factors necessary for this zero-shot transfer, like time delay modeling, memory representation, and domain randomization. Our policy transfers with minimal sim-to-real gap and significantly outperforms heuristic and learnt baselines. It also generalizes to plates of different sizes and weights. Demonstration videos and code are available at //sites.google.com/view/compliant-object-insertion.

INTERACT · AI · 道德化 · INFORMS · 有向 ·

2023 年 6 月 7 日

Artificial Intelligence can facilitate selfish decisions by altering the appearance of interaction partners

Nils K?bis,Philipp Lorenz-Spreen,Tamer Ajaj,Jean-Francois Bonnefon,Ralph Hertwig,Iyad Rahwan

The increasing prevalence of image-altering filters on social media and video conferencing technologies has raised concerns about the ethical and psychological implications of using Artificial Intelligence (AI) to manipulate our perception of others. In this study, we specifically investigate the potential impact of blur filters, a type of appearance-altering technology, on individuals' behavior towards others. Our findings consistently demonstrate a significant increase in selfish behavior directed towards individuals whose appearance is blurred, suggesting that blur filters can facilitate moral disengagement through depersonalization. These results emphasize the need for broader ethical discussions surrounding AI technologies that modify our perception of others, including issues of transparency, consent, and the awareness of being subject to appearance manipulation by others. We also emphasize the importance of anticipatory experiments in informing the development of responsible guidelines and policies prior to the widespread adoption of such technologies.

Chatbot · GPT-3 · 分離的 · Excel · 縮放 ·

2023 年 6 月 7 日

Personality testing of GPT-3: Limited temporal reliability, but highlighted social desirability of GPT-3's personality instruments results

Bojana Bodroza,Bojana M. Dinic,Ljubisa Bojic

from arxiv, 18 pages, 1 table

To assess the potential applications and limitations of chatbot GPT-3 Davinci-003, this study explored the temporal reliability of personality questionnaires applied to the chatbot and its personality profile. Psychological questionnaires were administered to the chatbot on two separate occasions, followed by a comparison of the responses to human normative data. The findings revealed varying levels of agreement in the chatbot's responses over time, with some scales displaying excellent while others demonstrated poor agreement. Overall, Davinci-003 displayed a socially desirable and pro-social personality profile, particularly in the domain of communion. However, the underlying basis of the chatbot's responses, whether driven by conscious self-reflection or predetermined algorithms, remains uncertain.

Processing（編程語言） · Learning · 推斷 · 在線 · 最優化 ·

2023 年 6 月 7 日

Timing Process Interventions with Causal Inference and Reinforcement Learning

Hans Weytjens,Wouter Verbeke,Jochen De Weerdt

The shift from the understanding and prediction of processes to their optimization offers great benefits to businesses and other organizations. Precisely timed process interventions are the cornerstones of effective optimization. Prescriptive process monitoring (PresPM) is the sub-field of process mining that concentrates on process optimization. The emerging PresPM literature identifies state-of-the-art methods, causal inference (CI) and reinforcement learning (RL), without presenting a quantitative comparison. Most experiments are carried out using historical data, causing problems with the accuracy of the methods' evaluations and preempting online RL. Our contribution consists of experiments on timed process interventions with synthetic data that renders genuine online RL and the comparison to CI possible, and allows for an accurate evaluation of the results. Our experiments reveal that RL's policies outperform those from CI and are more robust at the same time. Indeed, the RL policies approach perfect policies. Unlike CI, the unaltered online RL approach can be applied to other, more generic PresPM problems such as next best activity recommendations. Nonetheless, CI has its merits in settings where online learning is not an option.

區塊鏈 · Integration · 可辨認的 · 可理解性 · COVID-19 ·

2023 年 6 月 7 日

Blockchain Technology in Higher Education Ecosystem: Unraveling the Good, Bad, and Ugly

Sharaban Tahora,Bilash Saha,Nazmus Sakib,Hossain Shahriar,Hisham Haddad

The higher education management systems first identified and realized the trap of pitting innovation against privacy while first addressing COVID-19 social isolation challenges in 2020. In the age of data sprawl, we observe the situation has been exacerbating since then. Integrating blockchain technology has the potential to address the recent and emerging challenges in the higher education management system. This paper unravels the Good (scopes and benefits), Bad (limitations), and Ugly (challenges and trade-offs) of blockchain technology integration in the higher education management paradigm in the existing landscape. Our study adopts both qualitative and quantitative approaches to explore the experiences of educators, researchers, students, and other stakeholders and fully understand the blockchain's potential and contextual challenges. Our findings will envision an efficient, secure, and transparent higher education management system and help shape the debate (and trade-offs) pertaining to the recent shift in relevant business and management climate and regulatory sentiment.

Med-PaLM 2 · Performer · 語言模型化 · MoDELS · 自動問答 ·

2023 年 5 月 16 日

Towards Expert-Level Medical Question Answering with Large Language Models

Karan Singhal,Tao Tu,Juraj Gottweis,Rory Sayres,Ellery Wulczyn,Le Hou,Kevin Clark,Stephen Pfohl,Heather Cole-Lewis,Darlene Neal,Mike Schaekermann,Amy Wang,Mohamed Amin,Sami Lachgar,Philip Mansfield,Sushant Prakash,Bradley Green,Ewa Dominowska,Blaise Aguera y Arcas,Nenad Tomasev,Yun Liu,Renee Wong,Christopher Semturs,S. Sara Mahdavi,Joelle Barral,Dale Webster,Greg S. Corrado,Yossi Matias,Shekoofeh Azizi,Alan Karthikesalingam,Vivek Natarajan

Recent artificial intelligence (AI) systems have reached milestones in "grand challenges" ranging from Go to protein-folding. The capability to retrieve medical knowledge, reason over it, and answer medical questions comparably to physicians has long been viewed as one such grand challenge. Large language models (LLMs) have catalyzed significant progress in medical question answering; Med-PaLM was the first model to exceed a "passing" score in US Medical Licensing Examination (USMLE) style questions with a score of 67.2% on the MedQA dataset. However, this and other prior work suggested significant room for improvement, especially when models' answers were compared to clinicians' answers. Here we present Med-PaLM 2, which bridges these gaps by leveraging a combination of base LLM improvements (PaLM 2), medical domain finetuning, and prompting strategies including a novel ensemble refinement approach. Med-PaLM 2 scored up to 86.5% on the MedQA dataset, improving upon Med-PaLM by over 19% and setting a new state-of-the-art. We also observed performance approaching or exceeding state-of-the-art across MedMCQA, PubMedQA, and MMLU clinical topics datasets. We performed detailed human evaluations on long-form questions along multiple axes relevant to clinical applications. In pairwise comparative ranking of 1066 consumer medical questions, physicians preferred Med-PaLM 2 answers to those produced by physicians on eight of nine axes pertaining to clinical utility (p < 0.001). We also observed significant improvements compared to Med-PaLM on every evaluation axis (p < 0.001) on newly introduced datasets of 240 long-form "adversarial" questions to probe LLM limitations. While further studies are necessary to validate the efficacy of these models in real-world settings, these results highlight rapid progress towards physician-level performance in medical question answering.

評論員 · Cognition · 多樣性 · Prompt · 講稿 ·

2022 年 11 月 21 日

Intelligent Computing: The Latest Advances, Challenges and Future

Shiqiang Zhu,Ting Yu,Tao Xu,Hongyang Chen,Schahram Dustdar,Sylvain Gigan,Deniz Gunduz,Ekram Hossain,Yaochu Jin,Feng Lin,Bo Liu,Zhiguo Wan,Ji Zhang,Zhifeng Zhao,Wentao Zhu,Zuoning Chen,Tariq Durrani,Huaimin Wang,Jiangxing Wu,Tongyi Zhang,Yunhe Pan

Computing is a critical driving force in the development of human civilization. In recent years, we have witnessed the emergence of intelligent computing, a new computing paradigm that is reshaping traditional computing and promoting digital revolution in the era of big data, artificial intelligence and internet-of-things with new computing theories, architectures, methods, systems, and applications. Intelligent computing has greatly broadened the scope of computing, extending it from traditional computing on data to increasingly diverse computing paradigms such as perceptual intelligence, cognitive intelligence, autonomous intelligence, and human-computer fusion intelligence. Intelligence and computing have undergone paths of different evolution and development for a long time but have become increasingly intertwined in recent years: intelligent computing is not only intelligence-oriented but also intelligence-driven. Such cross-fertilization has prompted the emergence and rapid advancement of intelligent computing. Intelligent computing is still in its infancy and an abundance of innovations in the theories, systems, and applications of intelligent computing are expected to occur soon. We present the first comprehensive survey of literature on intelligent computing, covering its theory fundamentals, the technological fusion of intelligence and computing, important applications, challenges, and future perspectives. We believe that this survey is highly timely and will provide a comprehensive reference and cast valuable insights into intelligent computing for academic and industrial researchers and practitioners.

圖 · 可理解性 · Seven · MoDELS · 鏈路預測 ·

2022 年 10 月 21 日

A Survey on Graph Counterfactual Explanations: Definitions, Methods, Evaluation

Mario Alfonso Prado-Romero,Bardh Prenkaj,Giovanni Stilo,Fosca Giannotti

from arxiv, arXiv admin note: text overlap with arXiv:2107.04086 by other authors

In recent years, Graph Neural Networks have reported outstanding performance in tasks like community detection, molecule classification and link prediction. However, the black-box nature of these models prevents their application in domains like health and finance, where understanding the models' decisions is essential. Counterfactual Explanations (CE) provide these understandings through examples. Moreover, the literature on CE is flourishing with novel explanation methods which are tailored to graph learning. In this survey, we analyse the existing Graph Counterfactual Explanation methods, by providing the reader with an organisation of the literature according to a uniform formal notation for definitions, datasets, and metrics, thus, simplifying potential comparisons w.r.t to the method advantages and disadvantages. We discussed seven methods and sixteen synthetic and real datasets providing details on the possible generation strategies. We highlight the most common evaluation strategies and formalise nine of the metrics used in the literature. We first introduce the evaluation framework GRETEL and how it is possible to extend and use it while providing a further dimension of comparison encompassing reproducibility aspects. Finally, we provide a discussion on how counterfactual explanation interplays with privacy and fairness, before delving into open challenges and future works.

Next · Integration · 有向 · 控制器 · Continuity ·

2022 年 3 月 5 日

AI for Next Generation Computing: Emerging Trends and Future Directions

Sukhpal Singh Gill,Minxian Xu,Carlo Ottaviani,Panos Patros,Rami Bahsoon,Arash Shaghaghi,Muhammed Golec,Vlado Stankovski,Huaming Wu,Ajith Abraham,Manmeet Singh,Harshit Mehta,Soumya K. Ghosh,Thar Baker,Ajith Kumar Parlikad,Hanan Lutfiyya,Salil S. Kanhere,Rizos Sakellariou,Schahram Dustdar,Omer Rana,Ivona Brandic,Steve Uhlig

from arxiv, Accepted for Publication in Elsevier IoT Journal, 2022

Autonomic computing investigates how systems can achieve (user) specified control outcomes on their own, without the intervention of a human operator. Autonomic computing fundamentals have been substantially influenced by those of control theory for closed and open-loop systems. In practice, complex systems may exhibit a number of concurrent and inter-dependent control loops. Despite research into autonomic models for managing computer resources, ranging from individual resources (e.g., web servers) to a resource ensemble (e.g., multiple resources within a data center), research into integrating Artificial Intelligence (AI) and Machine Learning (ML) to improve resource autonomy and performance at scale continues to be a fundamental challenge. The integration of AI/ML to achieve such autonomic and self-management of systems can be achieved at different levels of granularity, from full to human-in-the-loop automation. In this article, leading academics, researchers, practitioners, engineers, and scientists in the fields of cloud computing, AI/ML, and quantum computing join to discuss current research and potential future directions for these fields. Further, we discuss challenges and opportunities for leveraging AI and ML in next generation computing for emerging computing paradigms, including cloud, fog, edge, serverless and quantum computing environments.