欧美精品日韩精品国内精品,日韩国产一区二区三区在线

from arxiv, Accepted to Proceedings of ICLR 2024. 9 pages for main paper, 38 pages including appendix. The code, results, dataset for this paper and more resources on "LLMs Meet Misinformation" have been released on the project website: //llm-misinformation.github.io/

The advent of Large Language Models (LLMs) has made a transformative impact. However, the potential that LLMs such as ChatGPT can be exploited to generate misinformation has posed a serious concern to online safety and public trust. A fundamental research question is: will LLM-generated misinformation cause more harm than human-written misinformation? We propose to tackle this question from the perspective of detection difficulty. We first build a taxonomy of LLM-generated misinformation. Then we categorize and validate the potential real-world methods for generating misinformation with LLMs. Then, through extensive empirical investigation, we discover that LLM-generated misinformation can be harder to detect for humans and detectors compared to human-written misinformation with the same semantics, which suggests it can have more deceptive styles and potentially cause more harm. We also discuss the implications of our discovery on combating misinformation in the age of LLMs and the countermeasures.

知識薈萃

精品入門和進階教程、論文和代碼整理等

查看相關VIP內容、論文、資訊等

GNN · 情景 · 圖 · 相同 · Learning ·

2024 年 5 月 19 日

Are Targeted Messages More Effective?

Martin Grohe,Eran Rosenbluth

Graph neural networks (GNN) are deep learning architectures for graphs. Essentially, a GNN is a distributed message passing algorithm, which is controlled by parameters learned from data. It operates on the vertices of a graph: in each iteration, vertices receive a message on each incoming edge, aggregate these messages, and then update their state based on their current state and the aggregated messages. The expressivity of GNNs can be characterised in terms of certain fragments of first-order logic with counting and the Weisfeiler-Lehman algorithm. The core GNN architecture comes in two different versions. In the first version, a message only depends on the state of the source vertex, whereas in the second version it depends on the states of the source and target vertices. In practice, both of these versions are used, but the theory of GNNs so far mostly focused on the first one. On the logical side, the two versions correspond to two fragments of first-order logic with counting that we call modal and guarded. The question whether the two versions differ in their expressivity has been mostly overlooked in the GNN literature and has only been asked recently (Grohe, LICS'23). We answer this question here. It turns out that the answer is not as straightforward as one might expect. By proving that the modal and guarded fragment of first-order logic with counting have the same expressivity over labelled undirected graphs, we show that in a non-uniform setting the two GNN versions have the same expressivity. However, we also prove that in a uniform setting the second version is strictly more expressive.

服務器 · Learning · Performer · 聯邦學習 · Processing（編程語言） ·

2024 年 5 月 18 日

Accelerating Hybrid Federated Learning Convergence under Partial Participation

Jieming Bian,Lei Wang,Kun Yang,Cong Shen,Jie Xu

from arxiv, Accepted by IEEE Transactions on Signal Processing, Update the convergence analysis and add more experiment results

Over the past few years, Federated Learning (FL) has become a popular distributed machine learning paradigm. FL involves a group of clients with decentralized data who collaborate to learn a common model under the coordination of a centralized server, with the goal of protecting clients' privacy by ensuring that local datasets never leave the clients and that the server only performs model aggregation. However, in realistic scenarios, the server may be able to collect a small amount of data that approximately mimics the population distribution and has stronger computational ability to perform the learning process. To address this, we focus on the hybrid FL framework in this paper. While previous hybrid FL work has shown that the alternative training of clients and server can increase convergence speed, it has focused on the scenario where clients fully participate and ignores the negative effect of partial participation. In this paper, we provide theoretical analysis of hybrid FL under clients' partial participation to validate that partial participation is the key constraint on convergence speed. We then propose a new algorithm called FedCLG, which investigates the two-fold role of the server in hybrid FL. Firstly, the server needs to process the training steps using its small amount of local datasets. Secondly, the server's calculated gradient needs to guide the participated clients' training and the server's aggregation. We validate our theoretical findings through numerical experiments, which show that our proposed method FedCLG outperforms state-of-the-art methods.

可理解性 · 大語言模型 · 語言模型化 · 詞元分析器 · MoDELS ·

2024 年 5 月 17 日

How Can Large Language Models Understand Spatial-Temporal Data?

Lei Liu,Shuo Yu,Runze Wang,Zhenxun Ma,Yanming Shen

While Large Language Models (LLMs) dominate tasks like natural language processing and computer vision, harnessing their power for spatial-temporal forecasting remains challenging. The disparity between sequential text and complex spatial-temporal data hinders this application. To address this issue, this paper introduces STG-LLM, an innovative approach empowering LLMs for spatial-temporal forecasting. We tackle the data mismatch by proposing: 1) STG-Tokenizer: This spatial-temporal graph tokenizer transforms intricate graph data into concise tokens capturing both spatial and temporal relationships; 2) STG-Adapter: This minimalistic adapter, consisting of linear encoding and decoding layers, bridges the gap between tokenized data and LLM comprehension. By fine-tuning only a small set of parameters, it can effectively grasp the semantics of tokens generated by STG-Tokenizer, while preserving the original natural language understanding capabilities of LLMs. Extensive experiments on diverse spatial-temporal benchmark datasets show that STG-LLM successfully unlocks LLM potential for spatial-temporal forecasting. Remarkably, our approach achieves competitive performance on par with dedicated SOTA methods.

語言模型化 · MoDELS · 代碼 · 大語言模型 · Performer ·

2024 年 5 月 14 日

Can Large Language Models Write Parallel Code?

Daniel Nichols,Joshua H. Davis,Zhaojun Xie,Arjun Rajaram,Abhinav Bhatele

Large language models are increasingly becoming a popular tool for software development. Their ability to model and generate source code has been demonstrated in a variety of contexts, including code completion, summarization, translation, and lookup. However, they often struggle to generate code for complex programs. In this paper, we study the capabilities of state-of-the-art language models to generate parallel code. In order to evaluate language models, we create a benchmark, ParEval, consisting of prompts that represent 420 different coding tasks related to scientific and parallel computing. We use ParEval to evaluate the effectiveness of several state-of-the-art open- and closed-source language models on these tasks. We introduce novel metrics for evaluating the performance of generated code, and use them to explore how well each large language model performs for 12 different computational problem types and six different parallel programming models.

線性的 · 連結 · 代碼 · binary · 相互獨立的 ·

2024 年 5 月 14 日

When Do Low-Rate Concatenated Codes Approach The Gilbert-Varshamov Bound?

Dean Doron,Jonathan Mosheiff,Mary Wootters

The Gilbert--Varshamov (GV) bound is a classical existential result in coding theory. It implies that a random linear binary code of rate $\epsilon^2$ has relative distance at least $\frac{1}{2} - O(\epsilon)$ with high probability. However, it is a major challenge to construct explicit codes with similar parameters. One hope to derandomize the Gilbert--Varshamov construction is with code concatenation: We begin with a (hopefully explicit) outer code ${C}_\mathrm{out}$ over a large alphabet, and concatenate that with a small binary random linear code ${C}_\mathrm{in}$. It is known that when we use \emph{independent} small codes for each coordinate, then the result lies on the GV bound with high probability, but this still uses a lot of randomness. In this paper, we consider the question of whether code concatenation with a single random linear inner code ${C}_\mathrm{in}$ can lie on the GV bound; and if so what conditions on ${C}_\mathrm{out}$ are sufficient for this. We show that first, there do exist linear outer codes ${C}_\mathrm{out}$ that are "good" for concatenation in this sense (in fact, most linear codes codes are good). We also provide two sufficient conditions for ${C}_\mathrm{out}$, so that if ${C}_\mathrm{out}$ satisfies these, ${C}_\mathrm{out}\circ {C}_\mathrm{in}$ will likely lie on the GV bound. We hope that these conditions may inspire future work towards constructing explicit codes ${C}_\mathrm{out}$.

Facebook AI Research · MoDELS · 有偏 · 決策樹 · 訓練數據 ·

2024 年 5 月 14 日

Are Models Trained on Indian Legal Data Fair?

Sahil Girhepuje,Anmol Goel,Gokul S Krishnan,Shreya Goyal,Satyendra Pandey,Ponnurangam Kumaraguru,Balaraman Ravindran

from arxiv, Presented at the Symposium on AI and Law (SAIL) 2023

Recent advances and applications of language technology and artificial intelligence have enabled much success across multiple domains like law, medical and mental health. AI-based Language Models, like Judgement Prediction, have recently been proposed for the legal sector. However, these models are strife with encoded social biases picked up from the training data. While bias and fairness have been studied across NLP, most studies primarily locate themselves within a Western context. In this work, we present an initial investigation of fairness from the Indian perspective in the legal domain. We highlight the propagation of learnt algorithmic biases in the bail prediction task for models trained on Hindi legal documents. We evaluate the fairness gap using demographic parity and show that a decision tree model trained for the bail prediction task has an overall fairness disparity of 0.237 between input features associated with Hindus and Muslims. Additionally, we highlight the need for further research and studies in the avenues of fairness/bias in applying AI in the legal sector with a specific focus on the Indian context.

Better · tuning · 類別 · Prompt · Performer ·

2024 年 5 月 13 日

Can Better Text Semantics in Prompt Tuning Improve VLM Generalization?

Hari Chandana Kuchibhotla,Sai Srinivas Kancheti,Abbavaram Gowtham Reddy,Vineeth N Balasubramanian

Going beyond mere fine-tuning of vision-language models (VLMs), learnable prompt tuning has emerged as a promising, resource-efficient alternative. Despite their potential, effectively learning prompts faces the following challenges: (i) training in a low-shot scenario results in overfitting, limiting adaptability and yielding weaker performance on newer classes or datasets; (ii) prompt-tuning's efficacy heavily relies on the label space, with decreased performance in large class spaces, signaling potential gaps in bridging image and class concepts. In this work, we ask the question if better text semantics can help address these concerns. In particular, we introduce a prompt-tuning method that leverages class descriptions obtained from large language models (LLMs). Our approach constructs part-level description-guided views of both image and text features, which are subsequently aligned to learn more generalizable prompts. Our comprehensive experiments, conducted across 11 benchmark datasets, outperform established methods, demonstrating substantial improvements.

知識 (knowledge) · MoDELS · 樣例 · 語言模型化 · 大語言模型 ·

2024 年 5 月 13 日

Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?

Zorik Gekhman,Gal Yona,Roee Aharoni,Matan Eyal,Amir Feder,Roi Reichart,Jonathan Herzig

When large language models are aligned via supervised fine-tuning, they may encounter new factual information that was not acquired through pre-training. It is often conjectured that this can teach the model the behavior of hallucinating factually incorrect responses, as the model is trained to generate facts that are not grounded in its pre-existing knowledge. In this work, we study the impact of such exposure to new knowledge on the capability of the fine-tuned model to utilize its pre-existing knowledge. To this end, we design a controlled setup, focused on closed-book QA, where we vary the proportion of the fine-tuning examples that introduce new knowledge. We demonstrate that large language models struggle to acquire new factual knowledge through fine-tuning, as fine-tuning examples that introduce new knowledge are learned significantly slower than those consistent with the model's knowledge. However, we also find that as the examples with new knowledge are eventually learned, they linearly increase the model's tendency to hallucinate. Taken together, our results highlight the risk in introducing new factual knowledge through fine-tuning, and support the view that large language models mostly acquire factual knowledge through pre-training, whereas fine-tuning teaches them to use it more efficiently.

MoDELS · 語言模型化 · Performer · SimPLe · 值域 ·

2024 年 5 月 13 日

Can Language Models Explain Their Own Classification Behavior?

Dane Sherburn,Bilal Chughtai,Owain Evans

Large language models (LLMs) perform well at a myriad of tasks, but explaining the processes behind this performance is a challenge. This paper investigates whether LLMs can give faithful high-level explanations of their own internal processes. To explore this, we introduce a dataset, ArticulateRules, of few-shot text-based classification tasks generated by simple rules. Each rule is associated with a simple natural-language explanation. We test whether models that have learned to classify inputs competently (both in- and out-of-distribution) are able to articulate freeform natural language explanations that match their classification behavior. Our dataset can be used for both in-context and finetuning evaluations. We evaluate a range of LLMs, demonstrating that articulation accuracy varies considerably between models, with a particularly sharp increase from GPT-3 to GPT-4. We then investigate whether we can improve GPT-3's articulation accuracy through a range of methods. GPT-3 completely fails to articulate 7/10 rules in our test, even after additional finetuning on correct explanations. We release our dataset, ArticulateRules, which can be used to test self-explanation for LLMs trained either in-context or by finetuning.

置信度 · MoDELS · Extensibility · 圖 · entity ·

2019 年 2 月 26 日

Embedding Uncertain Knowledge Graphs

Xuelu Chen,Muhao Chen,Weijia Shi,Yizhou Sun,Carlo Zaniolo

Embedding models for deterministic Knowledge Graphs (KG) have been extensively studied, with the purpose of capturing latent semantic relations between entities and incorporating the structured knowledge into machine learning. However, there are many KGs that model uncertain knowledge, which typically model the inherent uncertainty of relations facts with a confidence score, and embedding such uncertain knowledge represents an unresolved challenge. The capturing of uncertain knowledge will benefit many knowledge-driven applications such as question answering and semantic search by providing more natural characterization of the knowledge. In this paper, we propose a novel uncertain KG embedding model UKGE, which aims to preserve both structural and uncertainty information of relation facts in the embedding space. Unlike previous models that characterize relation facts with binary classification techniques, UKGE learns embeddings according to the confidence scores of uncertain relation facts. To further enhance the precision of UKGE, we also introduce probabilistic soft logic to infer confidence scores for unseen relation facts during training. We propose and evaluate two variants of UKGE based on different learning objectives. Experiments are conducted on three real-world uncertain KGs via three tasks, i.e. confidence prediction, relation fact ranking, and relation fact classification. UKGE shows effectiveness in capturing uncertain knowledge by achieving promising results on these tasks, and consistently outperforms baselines on these tasks.