
Large language models · Language modeling · MoDELS · Comprehensibility · Performer

December 17, 2023

Understanding the Instruction Mixture for Large Language Model

Renxi Wang, Minghao Wu, Yuxia Wang, Xudong Han, Chiyu Zhang, Haonan Li

from arxiv, Instruction Tuning, Large Language Model, Alignment

While instruction fine-tuning of large language models (LLMs) has been proven to enhance performance across various applications, the influence of the instruction dataset mixture on LLMs has not been thoroughly explored. In this study, we classify instructions into three main types: NLP downstream tasks, coding, and general chatting, and investigate their impact on LLMs. Our findings reveal that specific types of instructions are more beneficial for particular uses, while they may harm other aspects, emphasizing the importance of meticulously designing the instruction mixture to maximize model performance. This study sheds light on the instruction mixture and paves the way for future research.

Related content

Large language models

Large language models are deep learning models trained on massive amounts of text data. They can not only generate natural language text but also understand its meaning in depth and handle a variety of natural language tasks, such as text summarization, question answering, and translation. In 2023, large language models and their applications in artificial intelligence became a focus of technology research worldwide. Their growth in scale has been especially striking: parameter counts have jumped from just over a billion at the outset to one trillion today. The larger parameter counts let models capture the subtleties of human language more finely and understand its complexity more deeply. Over the past year, large language models have improved markedly in absorbing new knowledge, decomposing complex tasks, and aligning images with text. As the technology matures, it will keep extending its range of applications, providing more intelligent and personalized services and further improving the way people live and work.

Agent · Large language models · Round · Identifiable · MoDELS

February 6, 2024

Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science

Xiangru Tang, Qiao Jin, Kunlun Zhu, Tongxin Yuan, Yichi Zhang, Wangchunshu Zhou, Meng Qu, Yilun Zhao, Jian Tang, Zhuosheng Zhang, Arman Cohan, Zhiyong Lu, Mark Gerstein

Intelligent agents powered by large language models (LLMs) have demonstrated substantial promise in autonomously conducting experiments and facilitating scientific discoveries across various disciplines. While their capabilities are promising, they also introduce novel vulnerabilities that demand careful consideration for safety. However, there exists a notable gap in the literature, as there has been no comprehensive exploration of these vulnerabilities. This position paper fills this gap by conducting a thorough examination of vulnerabilities in LLM-based agents within scientific domains, shedding light on potential risks associated with their misuse and emphasizing the need for safety measures. We begin by providing a comprehensive overview of the potential risks inherent to scientific LLM agents, taking into account user intent, the specific scientific domain, and their potential impact on the external environment. Then, we delve into the origins of these vulnerabilities and provide a scoping review of the limited existing works. Based on our analysis, we propose a triadic framework involving human regulation, agent alignment, and an understanding of environmental feedback (agent regulation) to mitigate these identified risks. Furthermore, we highlight the limitations and challenges associated with safeguarding scientific agents and advocate for the development of improved models, robust benchmarks, and comprehensive regulations to address these issues effectively.

Language modeling · INTERACT · MoDELS · Comprehensibility · CASES

February 6, 2024

Empowering Language Models with Active Inquiry for Deeper Understanding

Jing-Cheng Pang, Heng-Bo Fan, Pengyuan Wang, Jia-Hao Xiao, Nan Tang, Si-Hang Yang, Chengxing Jia, Sheng-Jun Huang, Yang Yu

The rise of large language models (LLMs) has revolutionized the way that we interact with artificial intelligence systems through natural language. However, LLMs often misinterpret user queries because of their uncertain intention, leading to less helpful responses. In natural human interactions, clarification is sought through targeted questioning to uncover obscure information. Thus, in this paper, we introduce LaMAI (Language Model with Active Inquiry), designed to endow LLMs with this same level of interactive engagement. LaMAI leverages active learning techniques to raise the most informative questions, fostering a dynamic bidirectional dialogue. This approach not only narrows the contextual gap but also refines the output of the LLMs, aligning it more closely with user expectations. Our empirical studies, across a variety of complex datasets where LLMs have limited conversational context, demonstrate the effectiveness of LaMAI. The method improves answer accuracy from 31.9% to 50.9%, outperforming other leading question-answering frameworks. Moreover, in scenarios involving human participants, LaMAI consistently generates responses that are superior or comparable to baseline methods in more than 82% of the cases. The applicability of LaMAI is further evidenced by its successful integration with various LLMs, highlighting its potential for the future of interactive language models.

Language modeling · MoDELS · Lexical analyzer · Performer · Data-driven methods

February 5, 2024

Guiding Language Model Math Reasoning with Planning Tokens

Xinyi Wang, Lucas Caccia, Oleksiy Ostapenko, Xingdi Yuan, William Yang Wang, Alessandro Sordoni

Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks, such as chain-of-thought reasoning. However, most of the existing approaches to enhance this ability rely heavily on data-driven methods, while neglecting the structural aspects of the model's reasoning capacity. We find that while LLMs can manage individual reasoning steps well, they struggle with maintaining consistency across an entire reasoning chain. To solve this, we introduce planning tokens at the start of each reasoning step, serving as a guide for the model, and add their embeddings to the model parameters. Our approach requires a negligible increase in trainable parameters (just 0.001%) and can be applied through either full fine-tuning or a more parameter-efficient scheme. We demonstrate our method's effectiveness by applying it to three different LLMs, showing notable accuracy improvements across three math word problem datasets w.r.t. standard fine-tuning baselines.

Large language models · MoDELS · Language modeling · Prompt · Processing (programming language)

February 5, 2024

Fundamental Limitations of Alignment in Large Language Models

Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua

An important aspect of developing language models that interact with humans is aligning their behavior to be useful and unharmful for their human users. This is usually achieved by tuning the model in a way that enhances desired behaviors and inhibits undesired ones, a process referred to as alignment. In this paper, we propose a theoretical approach called Behavior Expectation Bounds (BEB) which allows us to formally investigate several inherent characteristics and limitations of alignment in large language models. Importantly, we prove that within the limits of this framework, for any behavior that has a finite probability of being exhibited by the model, there exist prompts that can trigger the model into outputting this behavior, with probability that increases with the length of the prompt. This implies that any alignment process that attenuates an undesired behavior but does not remove it altogether is not safe against adversarial prompting attacks. Furthermore, our framework hints at the mechanism by which leading alignment approaches such as reinforcement learning from human feedback make the LLM prone to being prompted into the undesired behaviors. This theoretical result is being experimentally demonstrated at large scale by the so-called contemporary "chatGPT jailbreaks", where adversarial users trick the LLM into breaking its alignment guardrails by triggering it into acting as a malicious persona. Our results expose fundamental limitations in alignment of LLMs and bring to the forefront the need to devise reliable mechanisms for ensuring AI safety.

Vectorization · Language modeling · Large language models · MoDELS · Layer

February 2, 2024

Style Vectors for Steering Generative Large Language Model

Kai Konen, Sophie Jentzsch, Diaoulé Diallo, Peer Schütt, Oliver Bensch, Roxanne El Baff, Dominik Opitz, Tobias Hecking

from arxiv, Will be published as findings paper at EACL2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics

This research explores strategies for steering the output of large language models (LLMs) towards specific styles, such as sentiment, emotion, or writing style, by adding style vectors to the activations of hidden layers during text generation. We show that style vectors can be simply computed from recorded layer activations for input texts in a specific style in contrast to more complex training-based approaches. Through a series of experiments, we demonstrate the effectiveness of activation engineering using such style vectors to influence the style of generated text in a nuanced and parameterisable way, distinguishing it from prompt engineering. The presented research constitutes a significant step towards developing more adaptive and effective AI-empowered interactive systems.
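The idea in this abstract, computing a style vector from recorded activations and adding it to a hidden layer during generation, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `mean_activation`, `style_vector`, and `steer` are hypothetical, the "activations" are hand-written three-dimensional lists rather than values recorded from a real model, and the difference-of-means formula is one plausible way to derive such a vector, not necessarily the authors' exact procedure.

```python
from statistics import fmean


def mean_activation(activations):
    """Average a list of equal-length activation vectors element-wise."""
    return [fmean(column) for column in zip(*activations)]


def style_vector(style_acts, contrast_acts):
    """Hypothetical style vector: mean activation for texts in the target
    style minus mean activation for contrasting texts (an assumption)."""
    styled = mean_activation(style_acts)
    contrast = mean_activation(contrast_acts)
    return [s - c for s, c in zip(styled, contrast)]


def steer(hidden_state, vector, strength=1.0):
    """Add the scaled style vector to one hidden-layer activation."""
    return [h + strength * v for h, v in zip(hidden_state, vector)]


# Toy stand-ins for activations recorded at one layer (not real model data).
positive_acts = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
other_acts = [[0.0, 0.0, 2.0], [0.0, 2.0, 2.0]]

vec = style_vector(positive_acts, other_acts)
steered = steer([0.5, 0.5, 0.5], vec, strength=0.5)
print(vec)      # [2.0, -1.0, 0.0]
print(steered)  # [1.5, 0.0, 0.5]
```

The `strength` parameter stands in for the "parameterisable" aspect the abstract mentions: scaling the vector up or down would make the stylistic push stronger or weaker. In a real model the addition would have to happen inside the forward pass at the chosen layer, which this sketch does not attempt.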

ACM Multimedia · MoDELS · AIM · Integration · Identifiable

February 2, 2024

Detecting Multimedia Generated by Large AI Models: A Survey

Li Lin, Neeraj Gupta, Yue Zhang, Hainan Ren, Chun-Hao Liu, Feng Ding, Xin Wang, Xin Li, Luisa Verdoliva, Shu Hu

The rapid advancement of Large AI Models (LAIMs), particularly diffusion models and large language models, has marked a new era where AI-generated multimedia is increasingly integrated into various aspects of daily life. Although beneficial in numerous fields, this content presents significant risks, including potential misuse, societal disruptions, and ethical concerns. Consequently, detecting multimedia generated by LAIMs has become crucial, with a marked rise in related research. Despite this, there remains a notable gap in systematic surveys that focus specifically on detecting LAIM-generated multimedia. Addressing this, we provide the first survey to comprehensively cover existing research on detecting multimedia (such as text, images, videos, audio, and multimodal content) created by LAIMs. Specifically, we introduce a novel taxonomy for detection methods, categorized by media modality, and aligned with two perspectives: pure detection (aiming to enhance detection performance) and beyond detection (adding attributes like generalizability, robustness, and interpretability to detectors). Additionally, we have presented a brief overview of generation mechanisms, public datasets, and online detection tools to provide a valuable resource for researchers and practitioners in this field. Furthermore, we identify current challenges in detection and propose directions for future research that address unexplored, ongoing, and emerging issues in detecting multimedia generated by LAIMs. Our aim for this survey is to fill an academic gap and contribute to global AI security efforts, helping to ensure the integrity of information in the digital realm. The project link is //github.com/Purdue-M2/Detect-LAIM-generated-Multimedia-Survey.

Language modeling · Taxonomy · MoDELS · motivation · Reviewer

May 31, 2023

Beyond One-Model-Fits-All: A Survey of Domain Specialization for Large Language Models

Chen Ling, Xujiang Zhao, Jiaying Lu, Chengyuan Deng, Can Zheng, Junxiang Wang, Tanmoy Chowdhury, Yun Li, Hejie Cui, Xuchao Zhang, Tianjiao Zhao, Amit Panalkar, Wei Cheng, Haoyu Wang, Yanchi Liu, Zhengzhang Chen, Haifeng Chen, Chris White, Quanquan Gu, Carl Yang, Liang Zhao

Large language models (LLMs) have significantly advanced the field of natural language processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of applications. The great promise of LLMs as general task solvers has motivated people to extend their functionality well beyond that of a "chatbot" and to use them as assistants to, or even replacements for, domain experts and tools in specific domains such as healthcare, finance, and education. However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles, caused by the heterogeneity of domain data, the sophistication of domain knowledge, the uniqueness of domain objectives, and the diversity of the constraints (e.g., various social norms, cultural conformity, religious beliefs, and ethical standards in the domain applications). To fill this gap, a rapidly growing body of research and practice on the domain specialization of LLMs has emerged in very recent years, which calls for a comprehensive and systematic review to better summarize and guide this promising area. In this survey paper, we first propose a systematic taxonomy that categorizes LLM domain-specialization techniques based on the accessibility of the LLMs and summarizes the framework for all the subcategories as well as their relations to and differences from each other. We also present a comprehensive taxonomy of critical application domains that can benefit from specialized LLMs, discussing their practical significance and open challenges. Furthermore, we offer insights into the current research status and future trends in this area.

Language modeling · Performer · Agent · MoDELS · Learning

May 19, 2023

Introspective Tips: Large Language Model for In-Context Decision Making

Liting Chen, Lu Wang, Hang Dong, Yali Du, Jie Yan, Fangkai Yang, Shuang Li, Pu Zhao, Si Qin, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang

from arxiv, 22 pages, 4 figures

The emergence of large language models (LLMs) has substantially influenced natural language processing, demonstrating exceptional results across various tasks. In this study, we employ "Introspective Tips" to help LLMs self-optimize their decision-making. By introspectively examining trajectories, the LLM refines its policy by generating succinct and valuable tips. Our method enhances the agent's performance in both few-shot and zero-shot learning situations by considering three essential scenarios: learning from the agent's past experiences, integrating expert demonstrations, and generalizing across diverse games. Importantly, we accomplish these improvements without fine-tuning the LLM parameters; rather, we adjust the prompt to generalize insights from the three aforementioned situations. Our framework not only supports but also emphasizes the advantage of employing LLMs in in-context decision-making. Experiments involving over 100 games in TextWorld illustrate the superior performance of our approach.

Synset · Knowledge base · Basis · INFORMS · Minimum point

December 4, 2019

Towards Building a Multilingual Sememe Knowledge Base: Predicting Sememes for BabelNet Synsets

Fanchao Qi, Liang Chang, Maosong Sun, Sicong Ouyang, Zhiyuan Liu

from arxiv, Accepted by AAAI Conference on Artificial Intelligence 2020 for oral presentation

A sememe is defined as the minimum semantic unit of human languages. Sememe knowledge bases (KBs), which contain words annotated with sememes, have been successfully applied to many NLP tasks. However, existing sememe KBs are built on only a few languages, which hinders their widespread utilization. To address the issue, we propose to build a unified sememe KB for multiple languages based on BabelNet, a multilingual encyclopedic dictionary. We first build a dataset serving as the seed of the multilingual sememe KB. It manually annotates sememes for over 15 thousand synsets (the entries of BabelNet). Then, we present a novel task of automatic sememe prediction for synsets, aiming to expand the seed dataset into a usable KB. We also propose two simple and effective models, which exploit different information of synsets. Finally, we conduct quantitative and qualitative analyses to explore important factors and difficulties in the task. All the source code and data of this work can be obtained on //github.com/thunlp/BabelNet-Sememe-Prediction.

Named entity recognition · entity · Learned · Deep learning · Identifiable

December 22, 2018

A Survey on Deep Learning for Named Entity Recognition

Jing Li, Aixin Sun, Jianglei Han, Chenliang Li

from arxiv, 20 pages, 15 figures

Named entity recognition (NER) is the task of identifying text spans that mention named entities and classifying them into predefined categories such as person, location, organization, etc. NER serves as the basis for a variety of natural language applications such as question answering, text summarization, and machine translation. Although early NER systems are successful in producing decent recognition accuracy, they often require much human effort in carefully designing rules or features. In recent years, deep learning, empowered by continuous real-valued vector representations and semantic composition through nonlinear processing, has been employed in NER systems, yielding state-of-the-art performance. In this paper, we provide a comprehensive review of existing deep learning techniques for NER. We first introduce NER resources, including tagged NER corpora and off-the-shelf NER tools. Then, we systematically categorize existing works based on a taxonomy along three axes: distributed representations for input, context encoder, and tag decoder. Next, we survey the most representative methods for recently applied deep learning techniques in new NER problem settings and applications. Finally, we present readers with the challenges faced by NER systems and outline future directions in this area.


