
The academic intelligence of large language models (LLMs) has made remarkable progress in recent times, but their social intelligence remains unclear. Inspired by established human social intelligence frameworks, particularly Daniel Goleman's social intelligence theory, we develop a standardized social intelligence test based on real-world social scenarios to comprehensively assess the social intelligence of LLMs, termed the Situational Evaluation of Social Intelligence (SESI). We conducted an extensive evaluation of 13 recent popular and state-of-the-art LLM agents on SESI. The results indicate that the social intelligence of LLMs still has significant room for improvement, with superficial friendliness being a primary cause of errors. Moreover, the social intelligence of LLMs correlates only weakly with their academic intelligence, suggesting that social intelligence is distinct from academic intelligence for LLMs. Additionally, while LLMs cannot "understand" what social intelligence is, their social intelligence, similar to that of humans, is influenced by social factors.

Related Content

Large language models are deep learning models trained on massive amounts of text data. They can not only generate natural language text but also understand the meaning of text in depth, handling a wide range of natural language tasks such as summarization, question answering, and translation. In 2023, large language models and their applications in artificial intelligence became a global research focus, and their growth in scale has been especially striking: parameter counts have leapt from the initial billions to as many as a trillion today. This increase in parameters allows models to capture the subtleties of human language more finely and to understand its complexity more deeply. Over the past year, large language models have improved markedly in absorbing new knowledge, decomposing complex tasks, and aligning text with images. As the technology matures, it will continue to broaden its range of applications, offering people more intelligent and personalized services and further improving how we live and work.

Mounting evidence in explainable artificial intelligence (XAI) research suggests that good explanations should be tailored to individual tasks and should relate to concepts relevant to the task. However, building task-specific explanations is time-consuming and requires domain expertise that can be difficult to integrate into generic XAI methods. A promising approach to designing useful task-specific explanations with domain experts is based on the compositionality of semantic concepts. Here, we present CoProNN, a novel approach that enables domain experts to quickly and intuitively create concept-based explanations for computer vision tasks via natural language. Leveraging recent progress in deep generative methods, we propose to generate visual concept-based prototypes via text-to-image methods. These prototypes are then used to explain the predictions of computer vision models via a simple k-Nearest-Neighbors routine. CoProNN's modular design is simple to implement, straightforward to adapt to novel tasks, and allows the classification and text-to-image models to be replaced as more powerful models are released. The approach can be evaluated offline against the ground truth of predefined prototypes, which can also be easily communicated to domain experts since they are based on visual concepts. We show that our strategy competes very well with other concept-based XAI approaches on coarse-grained image classification tasks and may even outperform those methods on more demanding fine-grained tasks. We demonstrate the effectiveness of our method for human-machine collaboration settings in qualitative and quantitative user studies. All code and experimental data can be found in our GitHub repository: //github.com/TeodorChiaburu/beexplainable.
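The abstract leaves the k-Nearest-Neighbors routine implicit; below is a minimal sketch of the idea, assuming prototype images have already been generated with a text-to-image model and embedded with the same feature extractor as the input image. All names and data here are illustrative stand-ins, not taken from the CoProNN repository.

```python
# Minimal sketch of explaining a prediction by its nearest concept prototypes.
# Embeddings are random stand-ins; in practice they would come from the same
# feature extractor used by the classifier being explained.
import numpy as np

def knn_concept_explanation(query_emb, proto_embs, proto_labels, k=5):
    """Return the k concept prototypes closest to the query embedding.

    query_emb:    (d,) feature vector of the input image
    proto_embs:   (n, d) feature vectors of text-to-image generated prototypes
    proto_labels: list of n concept names, one per prototype
    """
    # Cosine similarity between the query and every prototype embedding
    q = query_emb / np.linalg.norm(query_emb)
    p = proto_embs / np.linalg.norm(proto_embs, axis=1, keepdims=True)
    sims = p @ q
    top = np.argsort(-sims)[:k]
    # The concepts whose prototypes are closest form the explanation
    return [(proto_labels[i], float(sims[i])) for i in top]

# Usage with random stand-in embeddings
rng = np.random.default_rng(0)
protos = rng.normal(size=(12, 128))
labels = [f"concept_{i % 3}" for i in range(12)]
print(knn_concept_explanation(rng.normal(size=128), protos, labels, k=3))
```

Because the explanation is just a nearest-neighbor lookup over named concepts, swapping in a stronger feature extractor or text-to-image generator leaves the routine unchanged, which is the modularity the abstract highlights.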

In recent years, the rapid development of artificial intelligence technology, especially the emergence of large language models (LLMs) such as ChatGPT, has opened significant prospects for application in the field of education. LLMs can interpret knowledge, answer questions, and consider context, and can thus support dialogic teaching for students. Whether LLMs can effectively fulfill instructional roles and facilitate student learning as human educators do within dialogic teaching scenarios is therefore a highly valuable research question. This research recruited 34 undergraduate students as participants, who were randomly divided into two groups. The experimental group engaged in dialogic teaching using ChatGPT, while the control group interacted with human teachers. Both groups learned the histogram equalization unit in the information-related course "Digital Image Processing". The findings show comparable scores between the two groups on the retention test. However, students who engaged in dialogue with ChatGPT performed worse on the transfer test. Electroencephalography data revealed that students who interacted with ChatGPT exhibited higher levels of cognitive activity, suggesting that ChatGPT could help students establish a knowledge foundation and stimulate cognitive activity. However, its strengths in promoting students' knowledge application and creativity were insignificant. Based on these findings, ChatGPT cannot yet fully excel at teaching tasks in dialogic teaching for information-related courses. Combining ChatGPT with traditional human teachers might be a more ideal approach: the synergistic use of both can provide students with more comprehensive learning support, thus contributing to enhancing the quality of teaching.

In recent years, research involving human participants has been critical to advances in artificial intelligence (AI) and machine learning (ML), particularly in the areas of conversational, human-compatible, and cooperative AI. For example, around 12% and 6% of publications at recent AAAI and NeurIPS conferences, respectively, indicate the collection of original human data. Yet AI and ML researchers lack guidelines for ethical, transparent research practices with human participants. Fewer than one out of every four of these AAAI and NeurIPS papers provide details of ethical review, the collection of informed consent, or participant compensation. This paper aims to bridge this gap by exploring normative similarities and differences between AI research and related fields that involve human participants. Though psychology, human-computer interaction, and other adjacent fields offer historic lessons and helpful insights, AI research raises several specific concerns (namely, participatory design, crowdsourced dataset development, and an expansive role of corporations) that necessitate a contextual ethics framework. To address these concerns, this paper outlines a set of guidelines for ethical and transparent practice with human participants in AI and ML research. These guidelines can be found in Section 4 on pp. 4-7.

Advances in artificial intelligence (AI) will transform many aspects of our lives and society, bringing immense opportunities but also posing significant risks and challenges. The next several decades may well be a turning point for humanity, comparable to the industrial revolution. We write to share a set of recommendations for moving forward from the perspective of the founder and leaders of the One Hundred Year Study on AI. Launched a decade ago, the project is committed to a perpetual series of studies by multidisciplinary experts to evaluate the immediate, longer-term, and far-reaching effects of AI on people and society, and to make recommendations about AI research, policy, and practice. As we witness new capabilities emerging from neural models, it is crucial that we engage in efforts to advance our scientific understanding of these models and their behaviors. We must address the impact of AI on people and society through technical, social, and sociotechnical lenses, incorporating insights from a diverse range of experts including voices from engineering, social, behavioral, and economic disciplines. By fostering dialogue, collaboration, and action among various stakeholders, we can strategically guide the development and deployment of AI in ways that maximize its potential for contributing to human flourishing. Despite the growing divide in the field between focusing on short-term versus long-term implications, we think both are of critical importance. As Alan Turing, one of the pioneers of AI, wrote in 1950, "We can only see a short distance ahead, but we can see plenty there that needs to be done." We offer ten recommendations for action that collectively address both the short- and long-term potential impacts of AI technologies.

We consider problems where many, somewhat redundant, hypotheses are tested and we are interested in reporting the most precise rejections, with false discovery rate (FDR) control. This is the case, for example, when researchers are interested both in individual hypotheses as well as group hypotheses corresponding to intersections of sets of the original hypotheses, at several resolution levels. A concrete application is in genome-wide association studies, where, depending on the signal strengths, it might be possible to resolve the influence of individual genetic variants on a phenotype with greater or lower precision. To adapt to the unknown signal strength, analyses are conducted at multiple resolutions and researchers are most interested in the more precise discoveries. Assuring FDR control on the reported findings with these adaptive searches is, however, often impossible. To design a multiple comparison procedure that allows for an adaptive choice of resolution with FDR control, we leverage e-values and linear programming. We adapt this approach to problems where knockoffs and group knockoffs have been successfully applied to test conditional independence hypotheses. We demonstrate its efficacy by analyzing data from the UK Biobank.
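For readers unfamiliar with e-values, the standard building block for converting them into an FDR-controlling rejection set is the e-BH procedure (Wang and Ramdas); the sketch below illustrates that background step only, not the paper's multi-resolution, linear-programming construction, and the example e-values are made up.

```python
# Sketch of the e-BH procedure: sort e-values in decreasing order and reject
# the k largest, where k is the largest index i with e_(i) >= n / (alpha * i).
# This controls the FDR at level alpha for arbitrarily dependent e-values.
import numpy as np

def e_bh(e_values, alpha=0.1):
    """Return indices of rejected hypotheses, FDR controlled at alpha."""
    e = np.asarray(e_values, dtype=float)
    n = len(e)
    order = np.argsort(-e)                        # decreasing e-values
    thresholds = n / (alpha * np.arange(1, n + 1))
    ok = e[order] >= thresholds                   # e_(i) >= n / (alpha * i)
    if not ok.any():
        return np.array([], dtype=int)
    k = np.max(np.nonzero(ok)[0]) + 1             # largest i passing the bound
    return np.sort(order[:k])

# Example: larger e-values indicate stronger evidence against the null
print(e_bh([25.0, 1.1, 40.0, 0.8, 12.0], alpha=0.2))  # rejects hypotheses 0, 2, 4
```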

Large language models (LLMs) have made significant progress in NLP. However, their ability to memorize, represent, and leverage commonsense knowledge has been a well-known pain point. In this paper, we focus specifically on ChatGPT, a widely used and easily accessible LLM, and ask the following questions: (1) Can ChatGPT effectively answer commonsense questions? (2) Is ChatGPT aware of the underlying commonsense knowledge for answering a specific question? (3) Is ChatGPT knowledgeable in commonsense? (4) Can ChatGPT effectively leverage commonsense for answering questions? We conduct a series of experiments on 11 datasets to evaluate ChatGPT's commonsense abilities, including answering commonsense questions, identifying necessary knowledge, generating knowledge descriptions, and using knowledge descriptions to answer questions again. Experimental results show that: (1) ChatGPT can achieve good QA accuracy on commonsense tasks, while still struggling on datasets from certain domains. (2) ChatGPT is knowledgeable and can accurately generate most of the commonsense knowledge using knowledge prompts. (3) Despite its knowledge, ChatGPT is an inexperienced commonsense problem solver that cannot precisely identify the commonsense needed to answer a specific question. These findings highlight the need for improved mechanisms to effectively incorporate commonsense into LLMs like ChatGPT, such as better instruction following and commonsense guidance.
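The question sequence above translates directly into a two-stage probe: ask the model directly, elicit the underlying commonsense, then re-ask with that knowledge in context. The sketch below is an illustrative reconstruction of that protocol, not the paper's actual prompts; `ask` stands in for any chat-model call.

```python
# Two-stage commonsense probe: direct answer vs. knowledge-informed answer.
# `ask` is a placeholder for a real LLM client call; prompts are illustrative.
def commonsense_probe(question: str, choices: list[str], ask) -> dict:
    options = "\n".join(f"({chr(65 + i)}) {c}" for i, c in enumerate(choices))
    # Stage 1: answer directly
    direct = ask(f"Answer with one letter.\n{question}\n{options}")
    # Stage 2: elicit the underlying commonsense, then re-answer with it
    knowledge = ask(f"State the commonsense facts needed to answer: {question}")
    informed = ask(
        f"Background: {knowledge}\nAnswer with one letter.\n{question}\n{options}"
    )
    return {"direct": direct, "knowledge": knowledge, "informed": informed}

# Demo with a stub in place of a real model call
print(commonsense_probe(
    "Where would you most likely find a stapler?",
    ["an office", "the ocean"],
    ask=lambda prompt: "(A)",   # stub; replace with a real LLM client
))
```

Comparing the `direct` and `informed` answers over a dataset is what separates "knowledgeable" from "able to use that knowledge", the distinction the abstract's findings (2) and (3) draw.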

Large language models (LLMs) exhibit superior performance on various natural language tasks, but they are susceptible to issues stemming from outdated data and domain-specific limitations. To address these challenges, researchers have pursued two primary strategies, knowledge editing and retrieval augmentation, to enhance LLMs with external information from different aspects. Nevertheless, a comprehensive survey of this area is still lacking. In this paper, we present a review of the trends in integrating knowledge with large language models, including a taxonomy of methods, benchmarks, and applications. In addition, we conduct an in-depth analysis of the different methods and point out potential directions for future research. We hope this survey offers the community quick access to, and a comprehensive overview of, this research area, with the intention of inspiring future research endeavors.
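As background to the retrieval-augmentation strategy mentioned above, the sketch below shows the minimal pattern: retrieve the most relevant snippet from an external store and prepend it to the prompt, so the model answers from fresh external knowledge rather than its stale parameters. The corpus, query, and prompt format are illustrative assumptions, not from the survey.

```python
# Minimal retrieval-augmentation sketch: TF-IDF retrieval over a tiny corpus,
# with the best-matching passage prepended to the model prompt.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "The 2024 Olympic Games were held in Paris.",
    "Knowledge editing updates specific facts inside model weights.",
]

def retrieve_then_prompt(query: str) -> str:
    vec = TfidfVectorizer().fit(corpus + [query])
    docs, q = vec.transform(corpus), vec.transform([query])
    best = cosine_similarity(q, docs).argmax()
    # The retrieved passage grounds the model in up-to-date external knowledge
    return f"Context: {corpus[best]}\nQuestion: {query}\nAnswer:"

print(retrieve_then_prompt("Where were the 2024 Olympics held?"))
```

Knowledge editing, the survey's other branch, instead changes the weights themselves; the retrieval route shown here leaves the model untouched and moves the freshness problem into the external store.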

The emergence of large language models (LLMs) has marked a significant breakthrough in natural language processing (NLP), leading to remarkable advancements in text understanding and generation. Nevertheless, alongside these strides, LLMs exhibit a critical tendency to produce hallucinations, generating content that is inconsistent with real-world facts or user inputs. This phenomenon poses substantial challenges to their practical deployment and raises concerns over the reliability of LLMs in real-world scenarios, which has attracted increasing attention to detecting and mitigating these hallucinations. In this survey, we aim to provide a thorough and in-depth overview of recent advances in the field of LLM hallucinations. We begin with an innovative taxonomy of LLM hallucinations, then delve into the factors contributing to them. Subsequently, we present a comprehensive overview of hallucination detection methods and benchmarks. Additionally, representative approaches designed to mitigate hallucinations are introduced. Finally, we analyze the challenges that highlight current limitations and formulate open questions, aiming to delineate pathways for future research on hallucinations in LLMs.

Following unprecedented success on natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as the de facto operators. Capitalizing on these advances in computer vision, the medical imaging field has also witnessed growing interest in Transformers, which can capture global context, in contrast to CNNs with local receptive fields. Inspired by this transition, in this survey we attempt to provide a comprehensive review of the applications of Transformers in medical imaging, covering various aspects ranging from recently proposed architectural designs to unsolved issues. Specifically, we survey the use of Transformers in medical image segmentation, detection, classification, reconstruction, synthesis, registration, clinical report generation, and other tasks. For each of these applications, we develop a taxonomy, identify application-specific challenges, provide insights to solve them, and highlight recent trends. Further, we provide a critical discussion of the field's current state as a whole, including the identification of key challenges and open problems, and outline promising future directions. We hope this survey will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of Transformer models in medical imaging. Finally, to cope with the rapid development of this field, we intend to regularly update the relevant latest papers and their open-source implementations at //github.com/fahadshamshad/awesome-transformers-in-medical-imaging.

For languages with no annotated resources, transferring knowledge from resource-rich languages is an effective solution for named entity recognition (NER). While all existing methods directly transfer a model learned on the source language to the target language, in this paper we propose to fine-tune the learned model with a few similar examples given a test case, which can benefit the prediction by leveraging the structural and semantic information conveyed in such similar examples. To this end, we present a meta-learning algorithm that finds a good model parameter initialization able to adapt quickly to a given test case, and we propose to construct multiple pseudo-NER tasks for meta-training by computing sentence similarities. To further improve the model's generalization ability across different languages, we introduce a masking scheme and augment the loss function with an additional maximum term during meta-training. We conduct extensive experiments on cross-lingual named entity recognition with minimal resources on five target languages. The results show that our approach significantly outperforms existing state-of-the-art methods across the board.
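A minimal sketch of the test-time adaptation step described above, assuming the meta-learned initialization is given as `model` and the similar examples have already been retrieved; the masking scheme and maximum-term loss are omitted, and all plumbing here is a placeholder rather than the paper's implementation.

```python
# Test-time adaptation: fine-tune a copy of the meta-learned model on a few
# retrieved similar examples, then predict on the test sentence.
import copy
import torch

def adapt_and_predict(model, loss_fn, test_x, support, lr=1e-4, steps=3):
    """Adapt on similar examples, then predict.

    support: list of (x, y) pairs most similar to the test sentence,
             retrieved beforehand via sentence similarity.
    """
    adapted = copy.deepcopy(model)        # never touch the meta-trained weights
    opt = torch.optim.SGD(adapted.parameters(), lr=lr)
    for _ in range(steps):                # a handful of inner-loop steps
        for x, y in support:
            opt.zero_grad()
            loss_fn(adapted(x), y).backward()
            opt.step()
    with torch.no_grad():
        return adapted(test_x)
```

Copying the model before the inner-loop updates keeps the meta-learned initialization intact across test cases, so each test sentence gets its own adaptation without interference.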
