Large Language Models (LLMs), exemplified by ChatGPT, have significantly reshaped text generation, particularly in the realm of writing assistance. While ethical considerations underscore the importance of transparently acknowledging LLM use, especially in scientific communication, genuine acknowledgment remains infrequent. A potential avenue to encourage accurate acknowledgment of LLM-assisted writing involves employing automated detectors. Our evaluation of four cutting-edge LLM-generated text detectors reveals their suboptimal performance compared to a simple ad-hoc detector designed to identify abrupt writing style changes around the time of LLM proliferation. We contend that the development of specialized detectors exclusively dedicated to LLM-assisted writing detection is necessary. Such detectors could play a crucial role in fostering more authentic recognition of LLM involvement in scientific communication, addressing the current challenges in acknowledgment practices.

Related content

Large language models

Large language models are deep learning models trained on massive amounts of text data. They can not only generate natural-language text but also understand its meaning in depth and handle a variety of natural language tasks, such as text summarization, question answering, and translation. In 2023, large language models and their applications in artificial intelligence became a focus of technology research worldwide. Their growth in scale is especially striking: parameter counts have jumped from just over a billion at the outset to one trillion today. The increase in parameters lets models capture the subtleties of human language more finely and understand its complexity more deeply. Over the past year, large language models have improved markedly in absorbing new knowledge, decomposing complex tasks, and aligning images with text. As the technology matures, its range of applications will keep expanding, providing more intelligent and personalized services and further improving how people live and work.

Graph · Neural Networks · Networks · MoDELS · GNN ·

March 11, 2024

Uncertainty in Graph Neural Networks: A Survey

Fangxin Wang,Yuqing Liu,Kay Liu,Yibo Wang,Sourav Medya,Philip S. Yu

from arxiv, 13 main pages, 3 figures, 1 table. Under review

Graph Neural Networks (GNNs) have been extensively used in various real-world applications. However, the predictive uncertainty of GNNs stemming from diverse sources such as inherent randomness in data and model training errors can lead to unstable and erroneous predictions. Therefore, identifying, quantifying, and utilizing uncertainty are essential to enhance the performance of the model for the downstream tasks as well as the reliability of the GNN predictions. This survey aims to provide a comprehensive overview of the GNNs from the perspective of uncertainty with an emphasis on its integration in graph learning. We compare and summarize existing graph uncertainty theory and methods, alongside the corresponding downstream tasks. Thereby, we bridge the gap between theory and practice, meanwhile connecting different GNN communities. Moreover, our work provides valuable insights into promising directions in this field.

MoDELS · Model evaluation · Datasets · Long short-term memory networks · Networking ·

March 11, 2024

LSTM-Based Text Generation: A Study on Historical Datasets

Mustafa Abbas Hussein Hussein,Serkan Savaş

This paper presents an exploration of Long Short-Term Memory (LSTM) networks in the realm of text generation, focusing on the utilization of historical datasets for Shakespeare and Nietzsche. LSTMs, known for their effectiveness in handling sequential data, are applied here to model complex language patterns and structures inherent in historical texts. The study demonstrates that LSTM-based models, when trained on historical datasets, can not only generate text that is linguistically rich and contextually relevant but also provide insights into the evolution of language patterns over time. On the works of Nietzsche, the model predicts text with an accuracy of 0.9521 and a loss of 0.2518 after 100 training iterations. On the works of Shakespeare, the model reaches an accuracy of 0.9125, also after 100 training iterations. These results demonstrate the effectiveness of the model design and training methodology, especially when handling complex literary texts. This research contributes to the field of natural language processing by showcasing the versatility of LSTM networks in text generation and offering a pathway for future explorations in historical linguistics and beyond.

MoDELS · CASES · Datasets · AI · INFORMS ·

March 11, 2024

On the Consideration of AI Openness: Can Good Intent Be Abused?

Yeeun Kim,Eunkyung Choi,Hyunjun Kim,Hongseok Oh,Hyunseo Shin,Wonseok Hwang

from arxiv, 10 pages

Openness is critical for the advancement of science. In particular, recent rapid progress in AI has been made possible only by various open-source models, datasets, and libraries. However, this openness also means that technologies can be freely used for socially harmful purposes. Can open-source models or datasets be used for malicious purposes? If so, how easy is it to adapt technology for such goals? Here, we conduct a case study in the legal domain, a realm where individual decisions can have profound social consequences. To this end, we build EVE, a dataset consisting of 200 examples of questions and corresponding answers about criminal activities based on 200 Korean precedents. We found that a widely accepted open-source LLM, which initially refuses to answer unethical questions, can be easily tuned with EVE to provide unethical and informative answers about criminal activities. This implies that although open-source technologies contribute to scientific progress, some care must be taken to mitigate possible malicious use cases. Warning: This paper contains contents that some may find unethical.

Language modeling · Identifiable · Large language models · Robustness · MoDELS ·

March 11, 2024

DeepTextMark: A Deep Learning-Driven Text Watermarking Approach for Identifying Large Language Model Generated Text

Travis Munyer,Abdullah Tanvir,Arjon Das,Xin Zhong

from arxiv, The paper has been accepted for publication by IEEE Access

The rapid advancement of Large Language Models (LLMs) has significantly enhanced the capabilities of text generators. With the potential for misuse escalating, the importance of discerning whether texts are human-authored or generated by LLMs has become paramount. Several preceding studies have ventured to address this challenge by employing binary classifiers to differentiate between human-written and LLM-generated text. Nevertheless, the reliability of these classifiers has been subject to question. Given that consequential decisions may hinge on the outcome of such classification, it is imperative that text source detection is of high caliber. In light of this, the present paper introduces DeepTextMark, a deep learning-driven text watermarking methodology devised for text source identification. By leveraging Word2Vec and Sentence Encoding for watermark insertion, alongside a transformer-based classifier for watermark detection, DeepTextMark epitomizes a blend of blindness, robustness, imperceptibility, and reliability. As elaborated within the paper, these attributes are crucial for universal text source detection, with a particular emphasis in this paper on text produced by LLMs. DeepTextMark offers a viable "add-on" solution to prevailing text generation frameworks, requiring no direct access or alterations to the underlying text generation mechanism. Experimental evaluations underscore the high imperceptibility, elevated detection accuracy, augmented robustness, reliability, and swift execution of DeepTextMark.

INFORMS · MoDELS · Reviewer · HTTPS · Analysis ·

March 8, 2024

ROUGE-K: Do Your Summaries Have Keywords?

Sotaro Takeshita,Simone Paolo Ponzetto,Kai Eckert

Keywords, that is, content-relevant words in summaries, play an important role in efficient information conveyance, making it critical to assess if system-generated summaries contain such informative words during evaluation. However, existing evaluation metrics for extreme summarization models do not pay explicit attention to keywords in summaries, leaving developers ignorant of their presence. To address this issue, we present a keyword-oriented evaluation metric, dubbed ROUGE-K, which provides a quantitative answer to the question: "How well do summaries include keywords?" Through the lens of this keyword-aware metric, we surprisingly find that a current strong baseline model often misses essential information in its summaries. Our analysis reveals that human annotators indeed find the summaries with more keywords to be more relevant to the source documents. This is an important yet previously overlooked aspect in evaluating summarization systems. Finally, to enhance keyword inclusion, we propose four approaches for incorporating word importance into a transformer-based model and experimentally show that it enables guiding models to include more keywords while keeping the overall quality. Our code is released at //github.com/sobamchan/rougek.
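The abstract does not spell out how ROUGE-K is computed, so the following is only a rough sketch of the general idea of a keyword-oriented score: the fraction of a given keyword list that appears in a summary. The function names `tokenize` and `keyword_recall`, the tokenization rule, and the single-word exact-match criterion are all assumptions made for illustration and are not taken from the paper or its released code.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (an assumed, simplistic rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_recall(summary, keywords):
    """Return the fraction of single-word keywords that occur in the summary.

    Hypothetical illustration only: the real metric may extract and match
    keywords differently.
    """
    unique_keywords = {k.lower() for k in keywords}
    if not unique_keywords:
        return 0.0  # no keywords to look for
    summary_tokens = set(tokenize(summary))
    hits = sum(1 for k in unique_keywords if k in summary_tokens)
    return hits / len(unique_keywords)


if __name__ == "__main__":
    summary = "ROUGE-K checks whether summaries contain keywords"
    print(keyword_recall(summary, ["keywords", "summaries", "transformer"]))
```

In this sketch, two of the three keywords occur in the summary, so the score is 2/3. Multi-word keywords would never match under this tokenization, which is one reason it should be read as a toy illustration and not as the metric itself.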

Performer · RAVEN · Diversity · CLUES · Multi-hop ·

March 8, 2024

How Far Are We from Intelligent Visual Deductive Reasoning?

Yizhe Zhang,He Bai,Ruixiang Zhang,Jiatao Gu,Shuangfei Zhai,Josh Susskind,Navdeep Jaitly

from arxiv, ICLR 2024 AGI workshop. //github.com/apple/ml-rpm-bench

Vision-Language Models (VLMs) such as GPT-4V have recently demonstrated incredible strides on diverse vision language tasks. We dig into vision-based deductive reasoning, a more sophisticated but less explored realm, and find previously unexposed blindspots in the current SOTA VLMs. Specifically, we leverage Raven's Progressive Matrices (RPMs), to assess VLMs' abilities to perform multi-hop relational and deductive reasoning relying solely on visual clues. We perform comprehensive evaluations of several popular VLMs employing standard strategies such as in-context learning, self-consistency, and Chain-of-thoughts (CoT) on three diverse datasets, including the Mensa IQ test, IntelligenceTest, and RAVEN. The results reveal that despite the impressive capabilities of LLMs in text-based reasoning, we are still far from achieving comparable proficiency in visual deductive reasoning. We found that certain standard strategies that are effective when applied to LLMs do not seamlessly translate to the challenges presented by visual reasoning tasks. Moreover, a detailed analysis reveals that VLMs struggle to solve these tasks mainly because they are unable to perceive and comprehend multiple, confounding abstract patterns in RPM examples.

Large language models · Comprehensibility · Language modeling · MoDELS · entity ·

March 8, 2024

Do Large Language Model Understand Multi-Intent Spoken Language?

Shangjian Yin,Peijie Huang,Yuhong Xu,Haojing Huang,Jiatian Chen

This study marks a significant advancement by harnessing Large Language Models (LLMs) for multi-intent spoken language understanding (SLU), proposing a unique methodology that capitalizes on the generative power of LLMs within an SLU context. Our innovative technique reconfigures entity slots specifically for LLM application in multi-intent SLU environments and introduces the concept of Sub-Intent Instruction (SII), enhancing the dissection and interpretation of intricate, multi-intent communication within varied domains. The resultant datasets, dubbed LM-MixATIS and LM-MixSNIPS, are crafted from pre-existing benchmarks. Our research illustrates that LLMs can match and potentially excel beyond the capabilities of current state-of-the-art multi-intent SLU models. It further explores LLM efficacy across various intent configurations and dataset proportions. Moreover, we introduce two pioneering metrics, Entity Slot Accuracy (ESA) and Combined Semantic Accuracy (CSA), to provide an in-depth analysis of LLM proficiency in this complex field.

AdderNet · Neural Networks · Networking · Convolution · Model evaluation ·

December 31, 2019

AdderNet: Do We Really Need Multiplications in Deep Learning?

Hanting Chen,Yunhe Wang,Chunjing Xu,Boxin Shi,Chao Xu,Qi Tian,Chang Xu

Compared with the cheap addition operation, the multiplication operation has much higher computational complexity. The widely used convolutions in deep neural networks are exactly cross-correlations that measure the similarity between input features and convolution filters, which involves massive multiplications between float values. In this paper, we present adder networks (AdderNets) to trade these massive multiplications in deep neural networks, especially convolutional neural networks (CNNs), for much cheaper additions to reduce computation costs. In AdderNets, we take the $\ell_1$-norm distance between filters and input feature as the output response. The influence of this new similarity measure on the optimization of neural networks has been thoroughly analyzed. To achieve better performance, we develop a special back-propagation approach for AdderNets by investigating the full-precision gradient. We then propose an adaptive learning rate strategy to enhance the training procedure of AdderNets according to the magnitude of each neuron's gradient. As a result, the proposed AdderNets can achieve 74.9% Top-1 accuracy and 91.7% Top-5 accuracy using ResNet-50 on the ImageNet dataset without any multiplication in the convolution layers.
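To make the idea of an $\ell_1$-distance output response concrete, here is a minimal one-dimensional sketch. It slides a filter over a signal and, at each position, returns the negated sum of absolute differences instead of a sum of products. The negation (so that a larger value means a closer match), the 1-D setting, and the function names `adder_response` and `adder_conv1d` are assumptions made for illustration; this is not the authors' implementation and it omits their training procedure entirely.

```python
def adder_response(patch, kernel):
    """Negated L1 distance between an input patch and a filter of equal length.

    Uses only subtraction, absolute value, and addition (no multiplication).
    """
    if len(patch) != len(kernel):
        raise ValueError("patch and kernel must have the same length")
    return -sum(abs(x - f) for x, f in zip(patch, kernel))


def adder_conv1d(signal, kernel):
    """Slide the filter over the signal (stride 1, no padding) and collect responses."""
    k = len(kernel)
    return [adder_response(signal[i:i + k], kernel)
            for i in range(len(signal) - k + 1)]


if __name__ == "__main__":
    # The first window [1, 2] matches the filter exactly, so its response is 0.
    print(adder_conv1d([1, 2, 3, 4], [1, 2]))  # [0, -2, -4]
```

The response is 0 where the window equals the filter and grows more negative as the window moves away from it, which is how a distance can stand in for the similarity that cross-correlation normally measures.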

Synsets · Knowledge base · Base · INFORMS · Minimum point ·

December 4, 2019

Towards Building a Multilingual Sememe Knowledge Base: Predicting Sememes for BabelNet Synsets

Fanchao Qi,Liang Chang,Maosong Sun,Sicong Ouyang,Zhiyuan Liu

from arxiv, Accepted by AAAI Conference on Artificial Intelligence 2020 for oral presentation

A sememe is defined as the minimum semantic unit of human languages. Sememe knowledge bases (KBs), which contain words annotated with sememes, have been successfully applied to many NLP tasks. However, existing sememe KBs are built on only a few languages, which hinders their widespread utilization. To address the issue, we propose to build a unified sememe KB for multiple languages based on BabelNet, a multilingual encyclopedic dictionary. We first build a dataset serving as the seed of the multilingual sememe KB. It manually annotates sememes for over 15 thousand synsets (the entries of BabelNet). Then, we present a novel task of automatic sememe prediction for synsets, aiming to expand the seed dataset into a usable KB. We also propose two simple and effective models, which exploit different information of synsets. Finally, we conduct quantitative and qualitative analyses to explore important factors and difficulties in the task. All the source code and data of this work can be obtained on //github.com/thunlp/BabelNet-Sememe-Prediction.

Learned · Deep learning · Identifiable · MoDELS · Object tracking ·

July 31, 2019

Deep Learning in Video Multi-Object Tracking: A Survey

Gioele Ciaparrone,Francisco Luque Sánchez,Siham Tabik,Luigi Troiano,Roberto Tagliaferri,Francisco Herrera

from arxiv, New in v2: corrected typos and various minor mistakes. Submitted to Neurocomputing. Main text: 25 pages, 5 figures, 6 tables. Summary table in appendix at the end of the paper

The problem of Multiple Object Tracking (MOT) consists in following the trajectory of different objects in a sequence, usually a video. In recent years, with the rise of Deep Learning, the algorithms that provide a solution to this problem have benefited from the representational power of deep models. This paper provides a comprehensive survey on works that employ Deep Learning models to solve the task of MOT on single-camera videos. Four main steps in MOT algorithms are identified, and an in-depth review of how Deep Learning was employed in each one of these stages is presented. A complete experimental comparison of the presented works on the three MOTChallenge datasets is also provided, identifying a number of similarities among the top-performing methods and presenting some possible future research directions.