
Recent advances in large language models (LLMs) have opened up new paradigms for accessing the knowledge stored in their parameters. One critical challenge that has emerged is the presence of hallucinations in LLM outputs due to false or outdated knowledge. Since retraining LLMs with updated information is resource-intensive, there has been a growing interest in model editing. However, many model editing methods, while effective in various scenarios, tend to overemphasize aspects such as efficacy, generalization, and locality in editing performance, often overlooking potential side effects on the general abilities of LLMs. In this paper, we raise concerns that the improvement of model factuality may come at the cost of a significant degradation of these general abilities, which is not conducive to the sustainable development of LLMs. Systematically, we analyze side effects by evaluating four popular editing methods on two LLMs across eight representative task categories. Extensive empirical research reveals that model editing does improve model factuality but at the expense of substantially impairing general abilities. Therefore, we advocate for more research efforts to minimize the loss of general abilities acquired during LLM pre-training and to ultimately preserve them during model editing.

Related content

Large language models


A large language model is a deep learning model trained on massive amounts of text data. It can not only generate natural language text but also understand the meaning of text in depth and handle a variety of natural language tasks, such as summarization, question answering, and translation. In 2023, large language models and their applications in artificial intelligence became a focus of technology research worldwide. Their growth in scale has been especially striking, with parameter counts rising from just over a billion at the outset to a trillion today. Larger parameter counts allow models to capture the subtleties of human language more finely and to understand its complexity more deeply. Over the past year, large language models have improved markedly in absorbing new knowledge, decomposing complex tasks, and aligning images with text. As the technology matures, it will keep expanding its range of applications, providing people with more intelligent and personalized services and further improving how people live and work.

Inference · INFORMS · Performer · Datasets · Model evaluation

February 22, 2024

Enhancing Systematic Decompositional Natural Language Inference Using Informal Logic

Nathaniel Weir,Kate Sanders,Orion Weller,Shreya Sharma,Dongwei Jiang,Zhengping Zhang,Bhavana Dalvi Mishra,Oyvind Tafjord,Peter Jansen,Peter Clark,Benjamin Van Durme

Contemporary language models enable new opportunities for structured reasoning with text, such as the construction and evaluation of intuitive, proof-like textual entailment trees without relying on brittle formal logic. However, progress in this direction has been hampered by a long-standing lack of a clear protocol for determining what valid compositional entailment is. This absence causes noisy datasets and limited performance gains by modern neuro-symbolic engines. To address these problems, we formulate a consistent and theoretically grounded approach to annotating decompositional entailment datasets, and evaluate its impact on LLM-based textual inference. We find that our resulting dataset, RDTE (Recognizing Decompositional Textual Entailment), has a substantially higher internal consistency (+9%) than prior decompositional entailment datasets, suggesting that RDTE is a significant step forward in the long-standing problem of forming a clear protocol for discerning entailment. We also find that training an RDTE-oriented entailment classifier via knowledge distillation and employing it in a modern neuro-symbolic reasoning engine significantly improves results (both accuracy and proof quality) over other entailment classifier baselines, illustrating the practical benefit of this advance for textual inference.

Performer · INTERACT · Large language models · ChatGPT · Language modeling

February 21, 2024

Rocks Coding, Not Development--A Human-Centric, Experimental Evaluation of LLM-Supported SE Tasks

Wei Wang,Huilong Ning,Gaowei Zhang,Libo Liu,Yi Wang

from arxiv, the paper has been accepted by FSE

Recently, generative AI based on large language models (LLMs) has been gaining momentum for its impressive, high-quality performance in multiple domains, particularly after the release of ChatGPT. Many believe that such models have the potential to perform general-purpose problem-solving in software development and replace human software developers. Nevertheless, there is a lack of serious investigation into the capability of these LLM techniques in fulfilling software development tasks. In a controlled 2 x 2 between-subject experiment with 109 participants, we examined whether and to what degree working with ChatGPT was helpful in a coding task and a typical software development task, and how people work with ChatGPT. We found that while ChatGPT performed well in solving simple coding problems, its performance in supporting typical software development tasks was not that good. We also observed the interactions between participants and ChatGPT and found relations between the interactions and the outcomes. Our study thus provides first-hand insights into using ChatGPT to fulfill software engineering tasks with real-world developers and motivates the need for novel interaction mechanisms that help developers effectively work with large language models to achieve desired outcomes.

Knowledge · MoDELS · Large language models · Graphs · Language modeling

February 21, 2024

Knowledge Graph Enhanced Large Language Model Editing

Mengqi Zhang,Xiaotian Ye,Qiang Liu,Pengjie Ren,Shu Wu,Zhumin Chen

Large language models (LLMs) are pivotal in advancing natural language processing (NLP) tasks, yet their efficacy is hampered by inaccuracies and outdated knowledge. Model editing emerges as a promising solution to address these challenges. However, existing editing methods struggle to track and incorporate changes in knowledge associated with edits, which limits the generalization ability of post-edit LLMs in processing edited knowledge. To tackle these problems, we propose a novel model editing method that leverages knowledge graphs for enhancing LLM editing, namely GLAME. Specifically, we first utilize a knowledge graph augmentation module to uncover associated knowledge that has changed due to editing, obtaining its internal representations within LLMs. This approach allows knowledge alterations within LLMs to be reflected through an external graph structure. Subsequently, we design a graph-based knowledge edit module to integrate structured knowledge into the model editing. This ensures that the updated parameters reflect not only the modifications of the edited knowledge but also the changes in other associated knowledge resulting from the editing process. Comprehensive experiments conducted on GPT-J and GPT-2 XL demonstrate that GLAME significantly improves the generalization capabilities of post-edit LLMs in employing edited knowledge.

Sequence labeling · Labeling · MoDELS · Performer · Knowledge

February 21, 2024

An Effective Incorporating Heterogeneous Knowledge Curriculum Learning for Sequence Labeling

Xuemei Tang,Qi Su

from arxiv, 10 pages, 9 tables, 3 figures

Sequence labeling models often benefit from incorporating external knowledge. However, this practice introduces data heterogeneity and complicates the model with additional modules, leading to increased expenses for training a high-performing model. To address this challenge, we propose a two-stage curriculum learning (TCL) framework specifically designed for sequence labeling tasks. The TCL framework enhances training by gradually introducing data instances from easy to hard, aiming to improve both performance and training speed. Furthermore, we explore different metrics for assessing the difficulty levels of sequence labeling tasks. Through extensive experimentation on six Chinese word segmentation (CWS) and Part-of-speech tagging (POS) datasets, we demonstrate the effectiveness of our model in enhancing the performance of sequence labeling models. Additionally, our analysis indicates that TCL accelerates training and alleviates the slow training problem associated with complex models.
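
The easy-to-hard ordering mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's TCL framework: the function name `curriculum_stages`, the cumulative staging scheme, and the use of token count as a difficulty score are all assumptions made for illustration only.

```python
import math
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def curriculum_stages(
    examples: Sequence[T],
    difficulty: Callable[[T], float],
    n_stages: int,
) -> List[List[T]]:
    """Return cumulative training subsets ordered from easy to hard.

    Stage s contains the easiest ceil(len(examples) * s / n_stages) examples,
    so the final stage always covers the full dataset.
    """
    if n_stages < 1:
        raise ValueError("n_stages must be at least 1")
    # Sort once by the (hypothetical) difficulty score, easiest first.
    ordered = sorted(examples, key=difficulty)
    stages = []
    for s in range(1, n_stages + 1):
        cutoff = math.ceil(len(ordered) * s / n_stages)
        stages.append(ordered[:cutoff])
    return stages


# Illustrative difficulty metric (an assumption): number of whitespace tokens.
def token_count(sentence: str) -> float:
    return float(len(sentence.split()))


sentences = ["a b c d", "a", "a b", "a b c"]
stages = curriculum_stages(sentences, token_count, n_stages=2)
```

A training loop would then iterate over `stages` in order, fitting the model on each subset before moving to the next. Any other difficulty score can be swapped in for `token_count`.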

MoDELS · Transformation · Language modeling · Processing (programming language) · Subspace

February 20, 2024

The Hidden Space of Transformer Language Adapters

Jesujoba O. Alabi,Marius Mosbach,Matan Eyal,Dietrich Klakow,Mor Geva

from arxiv, 18 pages

We analyze the operation of transformer language adapters, which are small modules trained on top of a frozen language model to adapt its predictions to new target languages. We show that adapted predictions mostly evolve in the source language the model was trained on, while the target language becomes pronounced only in the very last layers of the model. Moreover, the adaptation process is gradual and distributed across layers, where it is possible to skip small groups of adapters without decreasing adaptation performance. Last, we show that adapters operate on top of the model's frozen representation space while largely preserving its structure, rather than on an 'isolated' subspace. Our findings provide a deeper view into the adaptation process of language models to new languages, showcasing the constraints imposed on it by the underlying model, and introduce practical implications for enhancing its efficiency.

Comprehensibility · Attention · Learning · Identifiable · Better

February 20, 2024

Identifying Semantic Induction Heads to Understand In-Context Learning

Jie Ren,Qipeng Guo,Hang Yan,Dongrui Liu,Xipeng Qiu,Dahua Lin

Although large language models (LLMs) have demonstrated remarkable performance, the lack of transparency in their inference logic raises concerns about their trustworthiness. To gain a better understanding of LLMs, we conduct a detailed analysis of the operations of attention heads and aim to better understand the in-context learning of LLMs. Specifically, we investigate whether attention heads encode two types of relationships between tokens present in natural languages: the syntactic dependency parsed from sentences and the relation within knowledge graphs. We find that certain attention heads exhibit a pattern where, when attending to head tokens, they recall tail tokens and increase the output logits of those tail tokens. More crucially, the formulation of such semantic induction heads has a close correlation with the emergence of the in-context learning ability of language models. The study of semantic attention heads advances our understanding of the intricate operations of attention heads in transformers, and further provides new insights into the in-context learning of LLMs.

Knowledge · MoDELS · Language modeling · Large language models · INFORMS

February 20, 2024

Stable Knowledge Editing in Large Language Models

Zihao Wei,Liang Pang,Hanxing Ding,Jingcheng Deng,Huawei Shen,Xueqi Cheng

Efficient knowledge editing of large language models is crucial for replacing obsolete information or incorporating specialized knowledge on a large scale. However, previous methods implicitly assume that knowledge is localized and isolated within the model, an assumption that oversimplifies the interconnected nature of model knowledge. The localization premise results in incomplete knowledge editing, whereas the isolation assumption may impair both other knowledge and general abilities, introducing instability into the performance of knowledge editing methods. To transcend these assumptions, we introduce StableKE, a method that adopts a novel perspective based on knowledge augmentation rather than knowledge localization. To overcome the expense of human labeling, StableKE integrates two automated knowledge augmentation strategies: the Semantic Paraphrase Enhancement strategy, which diversifies knowledge descriptions to facilitate the teaching of new information to the model, and the Contextual Description Enrichment strategy, which expands the surrounding knowledge to prevent the forgetting of related information. StableKE surpasses other knowledge editing methods, demonstrating stability on both edited knowledge and multi-hop knowledge, while also preserving unrelated knowledge and general abilities. Moreover, StableKE can edit knowledge on ChatGPT.

SQL · Question answering · Semantic analysis · AIM · Comprehensibility

February 19, 2024

Training Table Question Answering via SQL Query Decomposition

Raphaël Mouravieff,Benjamin Piwowarski,Sylvain Lamprier

Table Question-Answering involves both understanding the natural language query and grounding it in the context of the input table to extract the relevant information. In this context, many methods have highlighted the benefits of intermediate pre-training from SQL queries. However, while most approaches aim at generating final answers from inputs directly, we claim that SQL queries can be put to better use during training. By learning to imitate a restricted portion of SQL-like algebraic operations, we show that their execution flow provides intermediate supervision steps that allow increased generalization and structural reasoning compared with classical approaches of the field. Our study bridges the gap between semantic parsing and direct answering methods and provides useful insights regarding what types of operations should be predicted by a generative architecture or be preferably executed by an external algorithm.

Estimation/Estimator · contrastive · INFORMS · Mutual information · Representation learning

June 25, 2021

Decomposed Mutual Information Estimation for Contrastive Representation Learning

Alessandro Sordoni,Nouha Dziri,Hannes Schulz,Geoff Gordon,Phil Bachman,Remi Tachet

from arxiv, ICML 2021

Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. E.g., we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring modest chunks of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and learns better representations in a vision domain and for dialogue generation.
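
The decomposition described in the abstract above can be written out in a short sketch. The notation is an assumption, since the abstract fixes no symbols: $y$ is one view, the other view is split into subviews $x'$ and $x''$, $f$ is a learned critic, and $y_1$ is the positive sample among $K$ candidates $y_1, \dots, y_K$. The InfoNCE bound is used here only as a familiar example of a contrastive lower bound; the paper's own conditional bound is not reproduced.

```latex
% Chain rule of mutual information over the two subviews:
\begin{equation}
  I(x', x''; y) = I(x'; y) + I(x''; y \mid x')
\end{equation}

% A standard contrastive (InfoNCE) lower bound with K candidates:
\begin{equation}
  I_{\mathrm{NCE}}(x; y)
  = \mathbb{E}\left[ \log
      \frac{e^{f(x, y_1)}}{\frac{1}{K} \sum_{k=1}^{K} e^{f(x, y_k)}}
    \right],
  \qquad
  I_{\mathrm{NCE}}(x; y) \le I(x; y),
  \qquad
  I_{\mathrm{NCE}}(x; y) \le \log K
\end{equation}
```

Because the estimate cannot exceed $\log K$, a single bound underestimates large amounts of MI. Bounding each term of the chain rule separately lets every estimator handle a smaller quantity.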

Discretization · Graphs · Graphics processing unit · Neural Networks · Networking

March 28, 2019

Learning Discrete Structures for Graph Neural Networks

Luca Franceschi,Mathias Niepert,Massimiliano Pontil,Xiao He

from arxiv, 18 pages

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.
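
The idea of a discrete probability distribution on the edges of a graph can be illustrated with a minimal sketch. It assumes independent Bernoulli edge probabilities stored in a symmetric matrix `theta`; the function name `sample_adjacency` and the data layout are hypothetical. The sketch covers only the sampling step, not the bilevel optimization or the GCN training that the abstract describes.

```python
import random
from typing import List


def sample_adjacency(theta: List[List[float]], rng: random.Random) -> List[List[int]]:
    """Sample an undirected graph from per-edge Bernoulli probabilities.

    theta[i][j] is the (assumed) probability that edge {i, j} exists.
    Only the upper triangle is read; the result is symmetric with a zero diagonal.
    """
    n = len(theta)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # rng.random() lies in [0, 1), so probability 1.0 always yields an
            # edge and probability 0.0 never does.
            if rng.random() < theta[i][j]:
                adj[i][j] = 1
                adj[j][i] = 1
    return adj


# Degenerate probabilities make the sample deterministic, which is handy for checks.
theta_demo = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
adj_demo = sample_adjacency(theta_demo, random.Random(0))
```

In a learning setting the entries of `theta` would be the trainable parameters, and several sampled graphs would be averaged to estimate the expected loss.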