
Query-focused summarization (QFS) aims to provide a summary of a document that satisfies the information need of a given query and is useful in various IR applications, such as abstractive snippet generation. Current QFS approaches typically involve injecting additional information, e.g., query-answer relevance or fine-grained token-level interaction between a query and document, into a finetuned large language model. However, these approaches often require extra parameters and training, and generalize poorly to new dataset distributions. To mitigate this, we propose leveraging a recently developed constrained generation method, Neurological Decoding (NLD), as an alternative to current QFS regimes that rely on additional sub-architectures and training. We first construct lexical constraints by identifying important tokens in the document using a lightweight gradient attribution model, and then force the generated summary to satisfy these constraints by directly manipulating the final vocabulary likelihood. This lightweight approach requires no additional parameters or finetuning, as it uses both an off-the-shelf neural retrieval model to construct the constraints and a standard generative language model to produce the QFS. We demonstrate the efficacy of this approach on two public QFS collections, achieving near parity with the state-of-the-art model at substantially reduced complexity.
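A minimal sketch of this constraint-then-decode recipe, under simplifying assumptions: constraint words are chosen with a simple query-overlap heuristic (standing in for the gradient-attribution scorer), and the constraints are enforced with Hugging Face constrained beam search (force_words_ids) rather than the NLD procedure itself. The checkpoint, heuristic, and helper names are illustrative, not the authors' implementation.

```python
# Hypothetical sketch: lexically constrained query-focused summarization.
# Constraint selection uses a query-overlap heuristic as a stand-in for the
# gradient-attribution scorer; constrained beam search stands in for NLD.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/bart-large-cnn"   # assumed off-the-shelf summarizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def select_constraints(query: str, document: str, k: int = 3) -> list[str]:
    """Pick up to k document words that overlap with the query (toy scorer)."""
    query_terms = {w.lower().strip(".,?!") for w in query.split()}
    constraints, seen = [], set()
    for w in (t.strip(".,?!") for t in document.split()):
        if w.lower() in query_terms and len(w) > 3 and w.lower() not in seen:
            seen.add(w.lower())
            constraints.append(w)
    return constraints[:k]

def constrained_summary(query: str, document: str) -> str:
    constraints = select_constraints(query, document)
    force_words_ids = [
        tokenizer(w, add_special_tokens=False).input_ids for w in constraints
    ]
    inputs = tokenizer(document, return_tensors="pt", truncation=True)
    output = model.generate(
        **inputs,
        force_words_ids=force_words_ids or None,  # constrained beam search
        num_beams=4,
        max_length=60,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

doc = ("Coral bleaching occurs when ocean temperatures rise and corals expel "
       "the symbiotic algae living in their tissues, turning them white.")
print(constrained_summary("What causes coral bleaching?", doc))
```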

Related content

The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. Its participants come from diverse backgrounds, including researchers, academics, engineers, and industry professionals. MODELS 2019 is a forum in which participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to present innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability.
June 8, 2023

Code comment generation aims at generating natural language descriptions for a code snippet to facilitate developers' program comprehension activities. Although it has been studied for a long time, a bottleneck of existing approaches is that, given a code snippet, they can generate only one comment, while developers usually need information from diverse perspectives, such as what the functionality of the code snippet is and how to use it. To tackle this limitation, this study empirically investigates the feasibility of utilizing large language models (LLMs) to generate comments that can fulfill developers' diverse intents. Our intuition is based on the facts that (1) code and its paired comments are used during the pre-training of LLMs to build the semantic connection between natural language and programming language, and (2) comments in real-world projects, which are collected for pre-training, usually reflect different developers' intents. We thus postulate that LLMs can already understand code from different perspectives after pre-training. Indeed, experiments on two large-scale datasets demonstrate the rationale of our insights: by adopting the in-context learning paradigm and giving adequate prompts to the LLM (e.g., providing it with ten or more examples), the LLM can significantly outperform a state-of-the-art supervised learning approach on generating comments with multiple intents. Results also show that customized strategies for constructing the prompts and post-processing strategies for reranking the results can both boost the LLM's performance, which sheds light on future research directions for using LLMs to achieve comment generation.
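A rough illustration of the in-context learning setup described above: a few-shot prompt pairs code snippets with comments written for a specific intent, then asks for a comment on new code. The intents, example snippets, and the render_prompt helper are invented for this sketch; the resulting prompt can be sent to any instruction-following LLM.

```python
# Illustrative few-shot prompt construction for intent-aware comment generation.
EXAMPLES = [
    {
        "intent": "what",
        "code": "def area(r):\n    return 3.14159 * r * r",
        "comment": "Computes the area of a circle with radius r.",
    },
    {
        "intent": "usage",
        "code": "def area(r):\n    return 3.14159 * r * r",
        "comment": "Call area(2.0) to get the area of a circle of radius 2.",
    },
]

def render_prompt(target_code: str, intent: str, examples=EXAMPLES) -> str:
    parts = [f"Write a code comment that addresses the '{intent}' intent."]
    for ex in examples:
        if ex["intent"] == intent:  # keep only demonstrations of this intent
            parts.append(f"Code:\n{ex['code']}\nComment: {ex['comment']}")
    parts.append(f"Code:\n{target_code}\nComment:")
    return "\n\n".join(parts)

prompt = render_prompt("def clamp(x, lo, hi):\n    return max(lo, min(x, hi))", "what")
print(prompt)
```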

Large language models generate fluent texts and can follow natural language instructions to solve a wide range of tasks without task-specific training. Nevertheless, it is notoriously difficult to control their generation to satisfy the various constraints required by different applications. In this work, we present InstructCTG, a controlled text generation framework that incorporates different constraints by conditioning on natural language descriptions and demonstrations of the constraints. In particular, we first extract the underlying constraints of natural texts through a combination of off-the-shelf NLP tools and simple heuristics. We then verbalize the constraints into natural language instructions to form weakly supervised training data. By prepending natural language descriptions of the constraints and a few demonstrations, we fine-tune a pre-trained language model to incorporate various types of constraints. Compared to existing search-based or score-based methods, InstructCTG is more flexible to different constraint types and has a much smaller impact on the generation quality and speed because it does not modify the decoding procedure. Additionally, InstructCTG allows the model to adapt to new constraints without re-training through the use of few-shot task generalization and in-context learning abilities of instruction-tuned language models.
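A toy sketch of the "extract then verbalize" step this abstract describes, assuming keyword and length constraints extracted with simple heuristics; the templates and function names below are illustrative, not InstructCTG's actual tooling.

```python
# Toy pipeline: pull simple keyword and length constraints out of a text, then
# turn them into a natural language instruction that would be prepended to the
# text to form weakly supervised training data.
import re

def extract_constraints(text: str) -> dict:
    words = re.findall(r"[A-Za-z']+", text)
    keywords = sorted(set(words), key=len, reverse=True)[:3]   # longest words
    n_sentences = max(1, sum(text.count(p) for p in ".!?"))
    return {"keywords": keywords, "sentences": n_sentences}

def verbalize(constraints: dict) -> str:
    return (
        f"Write a text of about {constraints['sentences']} sentences "
        f"that includes the words: {', '.join(constraints['keywords'])}."
    )

sample = ("The committee approved the renewable energy proposal. "
          "Construction starts in May.")
print(verbalize(extract_constraints(sample)))
```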

Multi-document summarization (MDS) refers to the task of summarizing the text in multiple documents into a concise summary. The generated summary can save the time of reading many documents by providing the important content in the form of a few sentences. Abstractive MDS aims to generate a coherent and fluent summary for multiple documents using natural language generation techniques. In this paper, we consider the unsupervised abstractive MDS setting, where only documents are available and no ground-truth summaries are provided, and we propose Absformer, a new Transformer-based method for unsupervised abstractive summary generation. Our method consists of a first step, where we pretrain a Transformer-based encoder using the masked language modeling (MLM) objective as the pretraining task in order to cluster the documents into semantically similar groups, and a second step, where we train a Transformer-based decoder to generate abstractive summaries for the clusters of documents. To our knowledge, we are the first to successfully apply a Transformer-based model to the unsupervised abstractive MDS task. We evaluate our approach using three real-world datasets from different domains, and we demonstrate both substantial improvements in terms of evaluation metrics over state-of-the-art abstractive methods and generalization to datasets from different domains.
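A minimal sketch of the clustering stage, assuming an MLM-pretrained encoder from Hugging Face and scikit-learn's k-means as the grouping step; Absformer's actual pretraining and decoder training are not reproduced here, and the checkpoint and example documents are illustrative.

```python
# Mean-pool encoder states from an MLM-pretrained encoder, then group the
# documents with k-means. Each cluster would then be summarized by a decoder.
import torch
from sklearn.cluster import KMeans
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(docs: list[str]):
    batch = tokenizer(docs, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state        # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)           # (B, T, 1)
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()  # mean pooling

docs = [
    "The central bank raised interest rates again this quarter.",
    "Inflation pressure pushed the bank toward another rate hike.",
    "The new midfielder scored twice in the season opener.",
    "Fans celebrated the club's opening-day victory.",
]
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embed(docs))
print(labels)
```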

This paper presents Z-Code++, a new pre-trained language model optimized for abstractive text summarization. The model extends the state-of-the-art encoder-decoder architecture using three techniques. First, we use a two-phase pre-training process to improve the model's performance on low-resource summarization tasks. The model is first pre-trained on text corpora for language understanding, and then continually pre-trained on summarization corpora for grounded text generation. Second, we replace the self-attention layers in the encoder with disentangled attention layers, where each word is represented using two vectors that encode its content and position, respectively. Third, we use fusion-in-encoder, a simple yet effective method of encoding long sequences in a hierarchical manner. Z-Code++ creates a new state of the art on 9 out of 13 text summarization tasks across 5 languages. Our model is parameter-efficient in that it outperforms the 600x larger PaLM-540B on XSum, and the finetuned 200x larger GPT3-175B on SAMSum. In zero-shot and few-shot settings, our model substantially outperforms the competing models.
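A rough sketch of the fusion-in-encoder idea under simplified assumptions: the long input is split into fixed-size chunks, each chunk is encoded independently, and the chunk encodings are concatenated into one memory a decoder could cross-attend to. The checkpoint and chunk size are illustrative, and Z-Code++'s actual fusion layers are more involved.

```python
# Encode fixed-size chunks of a long input independently, then concatenate the
# local chunk encodings into one long "memory" (a global attention pass over
# this memory would follow in the full hierarchical model).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def fusion_in_encoder(long_text: str, chunk_tokens: int = 128) -> torch.Tensor:
    ids = tokenizer(long_text, add_special_tokens=False).input_ids
    chunks = [ids[i:i + chunk_tokens] for i in range(0, len(ids), chunk_tokens)]
    states = []
    with torch.no_grad():
        for chunk in chunks:                                 # local encoding
            hidden = encoder(input_ids=torch.tensor([chunk])).last_hidden_state
            states.append(hidden.squeeze(0))                 # (T_chunk, H)
    return torch.cat(states, dim=0)                          # (T_total, H)

memory = fusion_in_encoder("A very long quarterly report paragraph. " * 200)
print(memory.shape)
```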

SeDuMi and SDPT3 are two solvers for Semi-definite Programming (SDP) or Linear Matrix Inequality (LMI) problems. This paper undertakes a computational performance comparison of the two on the stability of continuous-time linear systems, focusing mainly on computational times and memory requirements across different problem scales. To implement and compare the two solvers on a set of well-posed problems, we employ YALMIP, a widely used toolbox for modeling and optimization in MATLAB. The primary goal of this study is to provide an empirical assessment of the relative computational efficiency of SeDuMi and SDPT3 under varying problem conditions. Our evaluation indicates that SDPT3 performs much better in large-scale, high-precision calculations.
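The LMI behind this comparison is the continuous-time Lyapunov condition: the system x' = Ax is stable iff there exists P ≻ 0 with AᵀP + PA ≺ 0. Below is an analogous formulation in Python with cvxpy (the paper itself models the problems in YALMIP under MATLAB with SeDuMi and SDPT3; the example system and the SCS solver choice are assumptions).

```python
# Lyapunov-stability LMI: find a symmetric P > 0 with A'P + PA < 0.
import cvxpy as cp
import numpy as np

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])   # stable test system, eigenvalues -1 and -2

n = A.shape[0]
P = cp.Variable((n, n), symmetric=True)
eps = 1e-6
constraints = [
    P >> eps * np.eye(n),                    # P positive definite
    A.T @ P + P @ A << -eps * np.eye(n),     # A'P + PA negative definite
]
problem = cp.Problem(cp.Minimize(0), constraints)   # pure feasibility problem
problem.solve(solver=cp.SCS)                        # any SDP-capable solver works
print(problem.status)
print(P.value)
```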

Language generation models have been an increasingly powerful enabler for many applications. Many such models offer free or affordable API access, which makes them potentially vulnerable to model extraction attacks through distillation. To protect intellectual property (IP) and ensure fair use of these models, various techniques such as lexical watermarking and synonym replacement have been proposed. However, these methods can be nullified by obvious countermeasures such as "synonym randomization". To address this issue, we propose GINSEW, a novel method to protect text generation models from being stolen through distillation. The key idea of our method is to inject secret signals into the probability vector of the decoding steps for each target token. We can then detect the secret message by probing a suspect model to tell if it is distilled from the protected one. Experimental results show that GINSEW can effectively identify instances of IP infringement with minimal impact on the generation quality of protected APIs. Our method demonstrates an absolute improvement of 19 to 29 points on mean average precision (mAP) in detecting suspects compared to previous methods against watermark removal attacks.
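A toy illustration, not GINSEW's actual scheme, of what "injecting a secret signal into the probability vector" could look like: a keyed, per-token perturbation is added to the output distribution, and detection checks whether a suspect model's deviation correlates with that keyed signal. All values below are invented for demonstration.

```python
# Keyed perturbation of one decoding step's distribution plus a toy detector.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, secret_key = 1000, 42

def secret_signal(token_id: int, amplitude: float = 0.02) -> float:
    # Deterministic pseudo-random sign per token, keyed by the secret.
    return amplitude * (1.0 if hash((secret_key, token_id)) % 2 else -1.0)

def watermark(probs: np.ndarray) -> np.ndarray:
    signal = np.array([secret_signal(t) for t in range(len(probs))])
    perturbed = np.clip(probs * (1.0 + signal), 1e-9, None)
    return perturbed / perturbed.sum()

probs = rng.dirichlet(np.ones(vocab_size))   # one decoding step's distribution
wm = watermark(probs)
signal = np.array([secret_signal(t) for t in range(vocab_size)])
# A model distilled from the watermarked API inherits this bias; an unrelated
# model should show near-zero correlation with the keyed signal.
print(float(np.corrcoef(wm - probs, signal * probs)[0, 1]))
```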

Sparsely activated neural networks with conditional computation learn to route their inputs through different "expert" subnetworks, providing a form of modularity that densely activated models lack. Despite their possible benefits, models with learned routing often underperform their parameter-matched densely activated counterparts as well as models that use non-learned heuristic routing strategies. In this paper, we hypothesize that these shortcomings stem from the gradient estimation techniques used to train sparsely activated models that use non-differentiable discrete routing decisions. To address this issue, we introduce Soft Merging of Experts with Adaptive Routing (SMEAR), which avoids discrete routing by using a single "merged" expert constructed via a weighted average of all of the experts' parameters. By routing activations through a single merged expert, SMEAR does not incur a significant increase in computational costs and enables standard gradient-based training. We empirically validate that models using SMEAR outperform models that route based on metadata or learn sparse routing through gradient estimation. Furthermore, we provide qualitative analysis demonstrating that the experts learned via SMEAR exhibit a significant amount of specialization. All of the code used in our experiments is publicly available.
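A compact sketch of the parameter-merging idea with single linear-layer experts: per-example routing weights average the experts' parameters into one merged expert, so training stays fully differentiable. Layer sizes and names are illustrative, not the authors' SMEAR code.

```python
import torch
import torch.nn as nn

class SoftMergedExperts(nn.Module):
    """One layer whose expert weights are soft-merged by the router's output."""
    def __init__(self, d_model: int = 16, n_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.expert_weight = nn.Parameter(0.02 * torch.randn(n_experts, d_model, d_model))
        self.expert_bias = nn.Parameter(torch.zeros(n_experts, d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:               # x: (B, d)
        route = torch.softmax(self.router(x), dim=-1)                  # (B, E)
        w = torch.einsum("be,eoi->boi", route, self.expert_weight)     # merged weights
        b = route @ self.expert_bias                                   # merged bias
        return torch.einsum("boi,bi->bo", w, x) + b

layer = SoftMergedExperts()
out = layer(torch.randn(8, 16))
out.sum().backward()   # gradients reach the router: no discrete routing involved
print(out.shape)
```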

Previous work in phonetically-grounded language generation has mainly focused on domains such as lyrics and poetry. In this paper, we present work on the generation of tongue twisters - a form of language that is required to be phonetically conditioned to maximise sound overlap, whilst maintaining semantic consistency with an input topic, and still being grammatically correct. We present \textbf{TwistList}, a large annotated dataset of tongue twisters, consisting of 2.1K+ human-authored examples. We additionally present several benchmark systems (referred to as TwisterMisters) for the proposed task of tongue twister generation, including models that both do and do not require training on in-domain data. We present the results of automatic and human evaluation to demonstrate the performance of existing mainstream pre-trained models in this task with limited (or no) task specific training and data, and no explicit phonetic knowledge. We find that the task of tongue twister generation is challenging for models under these conditions, yet some models are still capable of generating acceptable examples of this language type.
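As a small illustration of the sound-overlap property the task targets, here is a toy scorer that approximates phonetic overlap by the shared initial letters of consecutive words; a real evaluation would use phoneme-level information, e.g. from a pronunciation dictionary.

```python
# Toy scorer: fraction of consecutive word pairs that share an initial letter,
# as a crude proxy for shared initial phonemes in a tongue twister.
def alliteration_score(text: str) -> float:
    words = [w.lower() for w in text.split() if w and w[0].isalpha()]
    if len(words) < 2:
        return 0.0
    shared = sum(1 for a, b in zip(words, words[1:]) if a[0] == b[0])
    return shared / (len(words) - 1)

print(alliteration_score("She sells sea shells by the sea shore"))       # high overlap
print(alliteration_score("The quick brown fox jumps over the lazy dog")) # low overlap
```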

Software specifications are essential for ensuring the reliability of software systems. Existing specification extraction approaches, however, suffer from limited generalizability and require manual effort. We study the effectiveness of Large Language Models (LLMs) in generating software specifications from software documentation, utilizing Few-Shot Learning (FSL) to enable LLMs to generalize from a small number of examples. We compare the performance of LLMs with FSL to that of state-of-the-art specification extraction techniques and study the impact of prompt construction strategies on LLM performance. In addition, we conduct a comprehensive analysis of the symptoms and root causes of the failures to understand the pros and cons of LLMs and existing approaches. We also compare 11 LLMs' performance, cost, and response time for generating software specifications. Our findings include that (1) the best-performing LLM outperforms existing approaches by 9.1--13.7% with a few similar examples, (2) the two dominant root causes combined (ineffective prompts and missing domain knowledge) account for 57--60% of LLM failures, and (3) most of the 11 LLMs achieve better or comparable performance compared to traditional techniques. Our study offers valuable insights for future research to improve specification generation.

Recently pre-trained language representation models such as BERT have shown great success when fine-tuned on downstream tasks including information retrieval (IR). However, pre-training objectives tailored for ad-hoc retrieval have not been well explored. In this paper, we propose Pre-training with Representative wOrds Prediction (PROP) for ad-hoc retrieval. PROP is inspired by the classical statistical language model for IR, specifically the query likelihood model, which assumes that the query is generated as the piece of text representative of the "ideal" document. Based on this idea, we construct the representative words prediction (ROP) task for pre-training. Given an input document, we sample a pair of word sets according to the document language model, where the set with higher likelihood is deemed as more representative of the document. We then pre-train the Transformer model to predict the pairwise preference between the two word sets, jointly with the Masked Language Model (MLM) objective. By further fine-tuning on a variety of representative downstream ad-hoc retrieval tasks, PROP achieves significant improvements over baselines without pre-training or with other pre-training methods. We also show that PROP can achieve exciting performance under both the zero- and low-resource IR settings. The code and pre-trained models are available at //github.com/Albert-Ma/PROP.
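A small sketch of the representative words prediction (ROP) pairing step, assuming a unigram document language model sampled by word frequency: two word sets are drawn, and the set with higher likelihood is treated as the more representative, preferred one. The set size and sampling with replacement are simplifications of the paper's setup, and the full pre-training also includes the MLM objective.

```python
# Sample two word sets from a unigram document language model and prefer the
# set with higher log-likelihood (the pre-training preference target).
import math
import random
from collections import Counter

random.seed(0)

def doc_language_model(document: str) -> dict:
    counts = Counter(document.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def sample_word_set(lm: dict, size: int = 5):
    words, probs = zip(*lm.items())
    sampled = random.choices(words, weights=probs, k=size)
    log_likelihood = sum(math.log(lm[w]) for w in sampled)
    return sampled, log_likelihood

doc = ("information retrieval models rank documents by estimating how likely "
       "a query is generated from each document language model")
lm = doc_language_model(doc)
(set_a, ll_a), (set_b, ll_b) = sample_word_set(lm), sample_word_set(lm)
print("preferred:", set_a if ll_a > ll_b else set_b)
```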
