日本三级网站在线播放,亚日韩中文无码视频,女性潮喷黄色网站在线浏览,国产精品一二三区免费

Automatic fact-checking plays a crucial role in combating the spread of misinformation. Large Language Models (LLMs) and Instruction-Following variants, such as InstructGPT and Alpaca, have shown remarkable performance in various natural language processing tasks. However, their knowledge may not always be up-to-date or sufficient, potentially leading to inaccuracies in fact-checking. To address this limitation, we propose combining the power of instruction-following language models with external evidence retrieval to enhance fact-checking performance. Our approach involves leveraging search engines to retrieve relevant evidence for a given input claim. This external evidence serves as valuable supplementary information to augment the knowledge of the pretrained language model. Then, we instruct-tune an open-sourced language model, called LLaMA, using this evidence, enabling it to predict the veracity of the input claim more accurately. To evaluate our method, we conducted experiments on two widely used fact-checking datasets: RAWFC and LIAR. The results demonstrate that our approach achieves state-of-the-art performance in fact-checking tasks. By integrating external evidence, we bridge the gap between the model's knowledge and the most up-to-date and sufficient context available, leading to improved fact-checking outcomes. Our findings have implications for combating misinformation and promoting the dissemination of accurate information on online platforms. Our released materials are accessible at: //thcheung.github.io/factllama.

相關內容

語言模型化

關注 9

圖 · MoDELS · 可理解性 · Extensibility · LORA ·

2023 年 10 月 19 日

MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter

Zhiyuan Liu,Sihang Li,Yanchen Luo,Hao Fei,Yixin Cao,Kenji Kawaguchi,Xiang Wang,Tat-Seng Chua

from arxiv, EMNLP main conference. 9 pages

Language Models (LMs) have demonstrated impressive molecule understanding ability on various 1D text-related tasks. However, they inherently lack 2D graph perception - a critical ability of human professionals in comprehending molecules' topological structures. To bridge this gap, we propose MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter. MolCA enables an LM (e.g., Galactica) to understand both text- and graph-based molecular contents via the cross-modal projector. Specifically, the cross-modal projector is implemented as a Q-Former to connect a graph encoder's representation space and an LM's text space. Further, MolCA employs a uni-modal adapter (i.e., LoRA) for the LM's efficient adaptation to downstream tasks. Unlike previous studies that couple an LM with a graph encoder via cross-modal contrastive learning, MolCA retains the LM's ability of open-ended text generation and augments it with 2D graph information. To showcase its effectiveness, we extensively benchmark MolCA on tasks of molecule captioning, IUPAC name prediction, and molecule-text retrieval, on which MolCA significantly outperforms the baselines. Our codes and checkpoints can be found at //github.com/acharkq/MolCA.

TSC · Extensibility · 數據增強 · Taxonomy · Analysis ·

2023 年 10 月 19 日

Data Augmentation for Time-Series Classification: An Extensive Empirical Study and Comprehensive Survey

Zijun Gao,Lingbo Li,Tianhua Xu

Data Augmentation (DA) has emerged as an indispensable strategy in Time Series Classification (TSC), primarily due to its capacity to amplify training samples, thereby bolstering model robustness, diversifying datasets, and curtailing overfitting. However, the current landscape of DA in TSC is plagued with fragmented literature reviews, nebulous methodological taxonomies, inadequate evaluative measures, and a dearth of accessible, user-oriented tools. In light of these challenges, this study embarks on an exhaustive dissection of DA methodologies within the TSC realm. Our initial approach involved an extensive literature review spanning a decade, revealing that contemporary surveys scarcely capture the breadth of advancements in DA for TSC, prompting us to meticulously analyze over 100 scholarly articles to distill more than 60 unique DA techniques. This rigorous analysis precipitated the formulation of a novel taxonomy, purpose-built for the intricacies of DA in TSC, categorizing techniques into five principal echelons: Transformation-Based, Pattern-Based, Generative, Decomposition-Based, and Automated Data Augmentation. Our taxonomy promises to serve as a robust navigational aid for scholars, offering clarity and direction in method selection. Addressing the conspicuous absence of holistic evaluations for prevalent DA techniques, we executed an all-encompassing empirical assessment, wherein upwards of 15 DA strategies were subjected to scrutiny across 8 UCR time-series datasets, employing ResNet and a multi-faceted evaluation paradigm encompassing Accuracy, Method Ranking, and Residual Analysis, yielding a benchmark accuracy of 88.94 +- 11.83%. Our investigation underscored the inconsistent efficacies of DA techniques, with...

Prompt · 小樣本學習 · 語言模型化 · 知識 (knowledge) · 基 ·

2023 年 10 月 19 日

Prompting Large Language Models with Chain-of-Thought for Few-Shot Knowledge Base Question Generation

Yuanyuan Liang,Jianing Wang,Hanlun Zhu,Lei Wang,Weining Qian,Yunshi Lan

from arxiv, Accepted by EMNLP 2023 main conference

The task of Question Generation over Knowledge Bases (KBQG) aims to convert a logical form into a natural language question. For the sake of expensive cost of large-scale question annotation, the methods of KBQG under low-resource scenarios urgently need to be developed. However, current methods heavily rely on annotated data for fine-tuning, which is not well-suited for few-shot question generation. The emergence of Large Language Models (LLMs) has shown their impressive generalization ability in few-shot tasks. Inspired by Chain-of-Thought (CoT) prompting, which is an in-context learning strategy for reasoning, we formulate KBQG task as a reasoning problem, where the generation of a complete question is splitted into a series of sub-question generation. Our proposed prompting method KQG-CoT first retrieves supportive logical forms from the unlabeled data pool taking account of the characteristics of the logical form. Then, we write a prompt to explicit the reasoning chain of generating complicated questions based on the selected demonstrations. To further ensure prompt quality, we extend KQG-CoT into KQG-CoT+ via sorting the logical forms by their complexity. We conduct extensive experiments over three public KBQG datasets. The results demonstrate that our prompting method consistently outperforms other prompting baselines on the evaluated datasets. Remarkably, our KQG-CoT+ method could surpass existing few-shot SoTA results of the PathQuestions dataset by 18.25, 10.72, and 10.18 absolute points on BLEU-4, METEOR, and ROUGE-L, respectively.

INTERACT · 數據集 · HTTPS · 可約的 · Extensibility ·

2023 年 10 月 18 日

InViG: Benchmarking Interactive Visual Grounding with 500K Human-Robot Interactions

Hanbo Zhang,Jie Xu,Yuchen Mo,Tao Kong

from arxiv, 8 pages, 9 figures, 3 tables, under review

Ambiguity is ubiquitous in human communication. Previous approaches in Human-Robot Interaction (HRI) have often relied on predefined interaction templates, leading to reduced performance in realistic and open-ended scenarios. To address these issues, we present a large-scale dataset, \invig, for interactive visual grounding under language ambiguity. Our dataset comprises over 520K images accompanied by open-ended goal-oriented disambiguation dialogues, encompassing millions of object instances and corresponding question-answer pairs. Leveraging the \invig dataset, we conduct extensive studies and propose a set of baseline solutions for end-to-end interactive visual disambiguation and grounding, achieving a 45.6\% success rate during validation. To the best of our knowledge, the \invig dataset is the first large-scale dataset for resolving open-ended interactive visual grounding, presenting a practical yet highly challenging benchmark for ambiguity-aware HRI. Codes and datasets are available at: \href{//openivg.github.io}{//openivg.github.io}.

INFORMS · 信息抽取 · Performer · MoDELS · 命名實體識別 ·

2023 年 10 月 18 日

RexUIE: A Recursive Method with Explicit Schema Instructor for Universal Information Extraction

Chengyuan Liu,Fubang Zhao,Yangyang Kang,Jingyuan Zhang,Xiang Zhou,Changlong Sun,Kun Kuang,Fei Wu

from arxiv, Findings of EMNLP 2023

Universal Information Extraction (UIE) is an area of interest due to the challenges posed by varying targets, heterogeneous structures, and demand-specific schemas. However, previous works have only achieved limited success by unifying a few tasks, such as Named Entity Recognition (NER) and Relation Extraction (RE), which fall short of being authentic UIE models particularly when extracting other general schemas such as quadruples and quintuples. Additionally, these models used an implicit structural schema instructor, which could lead to incorrect links between types, hindering the model's generalization and performance in low-resource scenarios. In this paper, we redefine the authentic UIE with a formal formulation that encompasses almost all extraction schemas. To the best of our knowledge, we are the first to introduce UIE for any kind of schemas. In addition, we propose RexUIE, which is a Recursive Method with Explicit Schema Instructor for UIE. To avoid interference between different types, we reset the position ids and attention mask matrices. RexUIE shows strong performance under both full-shot and few-shot settings and achieves State-of-the-Art results on the tasks of extracting complex schemas.

語言模型化 · MoDELS · Performer · Branch · 層 ·

2023 年 10 月 18 日

From CLIP to DINO: Visual Encoders Shout in Multi-modal Large Language Models

Dongsheng Jiang,Yuchen Liu,Songlin Liu,Xiaopeng Zhang,Jin Li,Hongkai Xiong,Qi Tian

from arxiv, The paper is under the corporation's legal review

Multi-modal Large Language Models (MLLMs) have made significant strides in expanding the capabilities of Large Language Models (LLMs) through the incorporation of visual perception interfaces. Despite the emergence of exciting applications and the availability of diverse instruction tuning data, existing approaches often rely on CLIP or its variants as the visual branch, and merely extract features from the deep layers. However, these methods lack a comprehensive analysis of the visual encoders in MLLMs. In this paper, we conduct an extensive investigation into the effectiveness of different vision encoders within MLLMs. Our findings reveal that the shallow layer features of CLIP offer particular advantages for fine-grained tasks such as grounding and region understanding. Surprisingly, the vision-only model DINO, which is not pretrained with text-image alignment, demonstrates promising performance as a visual branch within MLLMs. By simply equipping it with an MLP layer for alignment, DINO surpasses CLIP in fine-grained related perception tasks. Building upon these observations, we propose a simple yet effective feature merging strategy, named COMM, that integrates CLIP and DINO with Multi-level features Merging, to enhance the visual capabilities of MLLMs. We evaluate COMM through comprehensive experiments on a wide range of benchmarks, including image captioning, visual question answering, visual grounding, and object hallucination. Experimental results demonstrate the superior performance of COMM compared to existing methods, showcasing its enhanced visual capabilities within MLLMs. Code will be made available at //github.com/YuchenLiu98/COMM.

SimPLe · 預測器/決策函數 · 多樣性 · entity · 知識 (knowledge) ·

2023 年 10 月 17 日

Query2Triple: Unified Query Encoding for Answering Diverse Complex Queries over Knowledge Graphs

Yao Xu,Shizhu He,Cunguang Wang,Li Cai,Kang Liu,Jun Zhao

from arxiv, Accepted by EMNLP 2023 findings

Complex Query Answering (CQA) is a challenge task of Knowledge Graph (KG). Due to the incompleteness of KGs, query embedding (QE) methods have been proposed to encode queries and entities into the same embedding space, and treat logical operators as neural set operators to obtain answers. However, these methods train KG embeddings and neural set operators concurrently on both simple (one-hop) and complex (multi-hop and logical) queries, which causes performance degradation on simple queries and low training efficiency. In this paper, we propose Query to Triple (Q2T), a novel approach that decouples the training for simple and complex queries. Q2T divides the training into two stages: (1) Pre-training a neural link predictor on simple queries to predict tail entities based on the head entity and relation. (2) Training a query encoder on complex queries to encode diverse complex queries into a unified triple form that can be efficiently solved by the pretrained neural link predictor. Our proposed Q2T is not only efficient to train, but also modular, thus easily adaptable to various neural link predictors that have been studied well. Extensive experiments demonstrate that, even without explicit modeling for neural set operators, Q2T still achieves state-of-the-art performance on diverse complex queries over three public benchmarks.

Agent · 有偏 · 可約的 · HCI · INTERACT ·

2023 年 10 月 17 日

RAH! RecSys-Assistant-Human: A Human-Centered Recommendation Framework with LLM Agents

Yubo Shu,Haonan Zhang,Hansu Gu,Peng Zhang,Tun Lu,Dongsheng Li,Ning Gu

The rapid evolution of the web has led to an exponential growth in content. Recommender systems play a crucial role in Human-Computer Interaction (HCI) by tailoring content based on individual preferences. Despite their importance, challenges persist in balancing recommendation accuracy with user satisfaction, addressing biases while preserving user privacy, and solving cold-start problems in cross-domain situations. This research argues that addressing these issues is not solely the recommender systems' responsibility, and a human-centered approach is vital. We introduce the RAH Recommender system, Assistant, and Human) framework, an innovative solution with LLM-based agents such as Perceive, Learn, Act, Critic, and Reflect, emphasizing the alignment with user personalities. The framework utilizes the Learn-Act-Critic loop and a reflection mechanism for improving user alignment. Using the real-world data, our experiments demonstrate the RAH framework's efficacy in various recommendation domains, from reducing human burden to mitigating biases and enhancing user control. Notably, our contributions provide a human-centered recommendation framework that partners effectively with various recommendation models.

Networking · E2E · 泛函 · SimPLe · 最優化 ·

2023 年 10 月 17 日

Slicenet: a Simple and Scalable Flow-Level Simulator for Network Slice Provisioning and Management

Viswanath KumarSkandPriya,Abdulhalim Dandoush,Gladys Diaz

Network slicing plays a crucial role in the progression of 5G and beyond, facilitating dedicated logical networks to meet diverse and specific service requirements. The principle of End-to-End (E2E) slice includes not only a service chain of physical or virtual functions for the radio and core of 5G/6G networks but also the full path to the application servers that might be running at some edge computing or at central cloud. Nonetheless, the development and optimization of E2E network slice management systems necessitate a reliable simulation tool for evaluating different aspects at large-scale network topologies such as resource allocation and function placement models. This paper introduces Slicenet, a mininetlike simulator crafted for E2E network slicing experimentation at the flow level. Slicenet aims at facilitating the investigation of a wide range of slice optimization techniques, delivering measurable, reproducible results without the need for physical resources or complex integration tools. It provides a well-defined process for conducting experiments, which includes the creation and implementation of policies for various components such as edge and central cloud resources, network functions of multiple slices of different characteristics. Furthermore, Slicenet effortlessly produces meaningful visualizations from simulation results, aiding in comprehensive understanding. Utilizing Slicenet, service providers can derive invaluable insights into resource optimization, capacity planning, Quality of Service (QoS) assessment, cost optimization, performance comparison, risk mitigation, and Service Level Agreement (SLA) compliance, thereby fortifying network resource management and slice orchestration.

級聯 · Learning · 可辨認的 · 可約的 · Extensibility ·

2023 年 10 月 17 日

NICE: Improving Panoptic Narrative Detection and Segmentation with Cascading Collaborative Learning

Haowei Wang,Jiayi Ji,Tianyu Guo,Yilong Yang,Yiyi Zhou,Xiaoshuai Sun,Rongrong Ji

from arxiv, 18 pages. 9 figures, 9 tables

Panoptic Narrative Detection (PND) and Segmentation (PNS) are two challenging tasks that involve identifying and locating multiple targets in an image according to a long narrative description. In this paper, we propose a unified and effective framework called NICE that can jointly learn these two panoptic narrative recognition tasks. Existing visual grounding tasks use a two-branch paradigm, but applying this directly to PND and PNS can result in prediction conflict due to their intrinsic many-to-many alignment property. To address this, we introduce two cascading modules based on the barycenter of the mask, which are Coordinate Guided Aggregation (CGA) and Barycenter Driven Localization (BDL), responsible for segmentation and detection, respectively. By linking PNS and PND in series with the barycenter of segmentation as the anchor, our approach naturally aligns the two tasks and allows them to complement each other for improved performance. Specifically, CGA provides the barycenter as a reference for detection, reducing BDL's reliance on a large number of candidate boxes. BDL leverages its excellent properties to distinguish different instances, which improves the performance of CGA for segmentation. Extensive experiments demonstrate that NICE surpasses all existing methods by a large margin, achieving 4.1% for PND and 2.9% for PNS over the state-of-the-art. These results validate the effectiveness of our proposed collaborative learning strategy. The project of this work is made publicly available at //github.com/Mr-Neko/NICE.