
from arxiv, This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Recently, the development and progress of Large Language Models (LLMs) have amazed the entire Artificial Intelligence community. Benefiting from their emergent abilities, LLMs have attracted more and more researchers to study their capabilities and performance on various downstream Natural Language Processing (NLP) tasks. While marveling at LLMs' incredible performance on all kinds of tasks, we notice that they also have excellent multilingual processing capabilities, including for Chinese. To explore the Chinese processing ability of LLMs, we focus on Chinese Text Correction, a fundamental and challenging Chinese NLP task. Specifically, we evaluate various representative LLMs on the Chinese Grammatical Error Correction (CGEC) and Chinese Spelling Check (CSC) tasks, which are two main Chinese Text Correction scenarios. Additionally, we fine-tune LLMs for Chinese Text Correction to better observe the potential capabilities of LLMs. From extensive analyses and comparisons with previous state-of-the-art small models, we empirically find that the LLMs currently have both amazing performance and unsatisfactory behavior for Chinese Text Correction. We believe our findings will promote the practical deployment and application of LLMs in the Chinese NLP community.

Related content

Large Language Models


Large language models are deep learning models trained on massive amounts of text data. They can not only generate natural-language text but also deeply understand its meaning and handle a variety of natural language tasks, such as text summarization, question answering, and translation. In 2023, large language models and their applications in artificial intelligence became a focus of technology research worldwide. Their growth in scale has been especially striking, with parameter counts jumping from just over a billion at the outset to one trillion today. The increase in parameter count allows models to capture the subtleties of human language more finely and to understand its complexity more deeply. Over the past year, large language models have improved markedly in absorbing new knowledge, decomposing complex tasks, and aligning images with text. As the technology matures, its range of applications will keep expanding, providing people with more intelligent and personalized services and further improving how people live and work.

Language modeling · Large language models · MoDELS · Inference · Reducible

February 1, 2024

Enhancing Ethical Explanations of Large Language Models through Iterative Symbolic Refinement

Xin Quan,Marco Valentino,Louise A. Dennis,André Freitas

from arxiv, Camera-ready for EACL 2024

An increasing amount of research in Natural Language Inference (NLI) focuses on the application and evaluation of Large Language Models (LLMs) and their reasoning capabilities. Despite their success, however, LLMs are still prone to factual errors and inconsistencies in their explanations, offering limited control and interpretability for inference in complex domains. In this paper, we focus on ethical NLI, investigating how hybrid neuro-symbolic techniques can enhance the logical validity and alignment of ethical explanations produced by LLMs. Specifically, we present an abductive-deductive framework named Logic-Explainer, which integrates LLMs with an external backward-chaining solver to refine step-wise natural language explanations and jointly verify their correctness, reduce incompleteness and minimise redundancy. An extensive empirical analysis demonstrates that Logic-Explainer can improve explanations generated via in-context learning methods and Chain-of-Thought (CoT) on challenging ethical NLI tasks, while, at the same time, producing formal proofs describing and supporting models' reasoning. As ethical NLI requires commonsense reasoning to identify underlying moral violations, our results suggest the effectiveness of neuro-symbolic methods for multi-step NLI more broadly, opening new opportunities to enhance the logical consistency, reliability, and alignment of LLMs.

Language modeling · Large language models · MoDELS · LLaMA · Data preprocessing

February 1, 2024

Comparative Study of Large Language Model Architectures on Frontier

Junqi Yin,Avishek Bose,Guojing Cong,Isaac Lyngaas,Quentin Anthony

Large language models (LLMs) have garnered significant attention in both the AI community and beyond. Among these, the Generative Pre-trained Transformer (GPT) has emerged as the dominant architecture, spawning numerous variants. However, these variants have undergone pre-training under diverse conditions, including variations in input data, data preprocessing, and training methodologies, resulting in a lack of controlled comparative studies. Here we meticulously examine two prominent open-sourced GPT architectures, GPT-NeoX and LLaMA, leveraging the computational power of Frontier, the world's first Exascale supercomputer. Employing the same materials science text corpus and a comprehensive end-to-end pipeline, we conduct a comparative analysis of their training and downstream performance. Our efforts culminate in achieving state-of-the-art performance on a challenging materials science benchmark. Furthermore, we investigate the computation and energy efficiency, and propose a computationally efficient method for architecture design. To our knowledge, these pre-trained models represent the largest available for materials science. Our findings provide practical guidance for building LLMs on HPC platforms.

SAT · Constraints · Classes · SimPLe · Linear

February 1, 2024

Hardness of Random Reordered Encodings of Parity for Resolution and CDCL

Leroy Chew,Alexis de Colnet,Friedrich Slivovsky,Stefan Szeider

Parity reasoning is challenging for Conflict-Driven Clause Learning (CDCL) SAT solvers. This has been observed even for simple formulas encoding two contradictory parity constraints with different variable orders (Chew and Heule 2020). We provide an analytical explanation for their hardness by showing that they require exponential resolution refutations with high probability when the variable order is chosen at random. We obtain this result by proving that these formulas, which are known to be Tseitin formulas, have Tseitin graphs of linear treewidth with high probability. Since such Tseitin formulas require exponential resolution proofs, our result follows. We generalize this argument to a new class of formulas that capture a basic form of parity reasoning involving a sum of two random parity constraints with random orders. Even when the variable order for the sum is chosen favorably, these formulas remain hard for resolution. In contrast, we prove that they have short DRAT refutations. We show experimentally that the running time of CDCL SAT solvers on both classes of formulas grows exponentially with their treewidth.
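To make the formulas in this abstract concrete, the sketch below builds one plausible CNF encoding of two contradictory parity constraints over the same variables in different orders, then confirms by brute force that the combination is unsatisfiable. The chained auxiliary-variable encoding, the function names, and the two fixed variable orders are illustrative assumptions; this is not necessarily the exact encoding used by Chew and Heule (2020), and the brute-force check says nothing about resolution proof size.

```python
from itertools import product

def xor_clauses(variables, rhs):
    """CNF clauses forcing the XOR of `variables` to equal rhs (0 or 1)."""
    clauses = []
    for bits in product([0, 1], repeat=len(variables)):
        if sum(bits) % 2 != rhs:
            # Forbid this assignment: negate a variable where its bit is 1.
            clauses.append([-v if b else v for v, b in zip(variables, bits)])
    return clauses

def parity_chain(order, rhs, first_aux):
    """Encode XOR of the variables in `order` = rhs with a chain of
    auxiliary variables; returns (clauses, next unused variable index)."""
    clauses = []
    acc, aux = order[0], first_aux
    for v in order[1:-1]:
        clauses += xor_clauses([acc, v, aux], 0)  # aux = acc XOR v
        acc, aux = aux, aux + 1
    clauses += xor_clauses([acc, order[-1]], rhs)  # close the chain
    return clauses, aux

def satisfiable(clauses, num_vars):
    """Exhaustive check; only feasible for very small formulas."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

# x1 XOR x2 XOR x3 XOR x4 = 0, read in the natural order (aux vars 5, 6)
even, next_aux = parity_chain([1, 2, 3, 4], 0, 5)
# the same four variables in a different order, constrained to parity 1
odd, total_vars_plus_one = parity_chain([3, 1, 4, 2], 1, next_aux)

print(satisfiable(even, 6))        # True: one constraint alone is fine
print(satisfiable(even + odd, 8))  # False: together they contradict
```

Each constraint is satisfiable on its own, so the contradiction only emerges when a solver relates the two differently ordered chains, which is the step the abstract identifies as hard for resolution.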

DNS · WEB · Scenario · Continuity · Amazon

January 30, 2024

Measuring the Consolidation of DNS and Web Hosting Providers

Synthia Wang,Kyle MacMillan,Brennan Schaffner,Nick Feamster,Marshini Chetty

Despite the Internet's continued growth, it increasingly depends on a small set of service providers to support Domain Name System (DNS) and web content hosting. This trend poses many potential threats including susceptibility to outages, failures, and potential censorship by providers. This paper aims to quantify consolidation in terms of popular domains' reliance on a small set of organizations for both DNS and web hosting. We highlight the extent to which a set of relatively few platforms host the authoritative name servers and web content for the top million websites. Our results show that both DNS and web hosting are concentrated, with Cloudflare and Amazon hosting over 30% of the domains for both services. With the addition of Akamai, Fastly, and Google, these five organizations host 60% of index pages in the Tranco top 10K, as well as the majority of external page resources. These trends are consistent across six different global vantage points, indicating that consolidation is happening globally and popular organizations can influence users' online experience across the world.

Performer · Datasets · Mahalanobis distance · Reducible · Commentator

January 30, 2024

Evaluation of Out-of-Distribution Detection Performance on Autonomous Driving Datasets

Jens Henriksson,Christian Berger,Stig Ursing,Markus Borg

from arxiv, Preprint to 2023 IEEE International Conference On Artificial Intelligence Testing

Safety measures need to be systematically investigated to determine the extent to which they evaluate the intended performance of Deep Neural Networks (DNNs) for critical applications. Due to a lack of verification methods for high-dimensional DNNs, a trade-off is needed between accepted performance and handling of out-of-distribution (OOD) samples. This work evaluates rejecting outputs from semantic segmentation DNNs by applying a Mahalanobis distance (MD) based on the most probable class-conditional Gaussian distribution for the predicted class as an OOD score. The evaluation follows three DNNs trained on the Cityscapes dataset and tested on four automotive datasets and finds that classification risk can be drastically reduced at the cost of pixel coverage, even when applied on unseen datasets. The applicability of our findings will support legitimizing safety measures and motivate their usage when arguing for safe usage of DNNs in automotive perception.
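As a rough illustration of the scoring idea in this abstract, the sketch below fits one Gaussian per class and uses the Mahalanobis distance to the predicted class's Gaussian as an OOD score, rejecting a prediction when the score exceeds a threshold. The 2-D toy features, the class name, the threshold value, and all function names are assumptions made for illustration; the paper's actual feature extraction and thresholding are not described here.

```python
from math import sqrt

def fit_gaussian(points):
    """Sample mean and 2x2 sample covariance of a list of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def mahalanobis(x, mean, cov):
    """sqrt((x - mean)^T cov^{-1} (x - mean)) for a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return sqrt((d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det)

def ood_score(feature, predicted_class, gaussians):
    """Distance of the feature to the Gaussian of the predicted class."""
    mean, cov = gaussians[predicted_class]
    return mahalanobis(feature, mean, cov)

# Hypothetical training features for a single class called "road".
gaussians = {"road": fit_gaussian([(0, 0), (2, 0), (0, 2), (2, 2)])}
THRESHOLD = 1.5  # assumed value; in practice tuned on held-out data

for feature in [(1, 1), (3, 1)]:
    score = ood_score(feature, "road", gaussians)
    print(feature, round(score, 3), "reject" if score > THRESHOLD else "accept")
```

Raising the threshold keeps more predictions but admits more risky ones, which mirrors the trade-off between classification risk and pixel coverage that the abstract reports.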

Language modeling · Large language models · Engineering · MoDELS · Code

January 30, 2024

Novel Preprocessing Technique for Data Embedding in Engineering Code Generation Using Large Language Model

Yu-Chen Lin,Akhilesh Kumar,Norman Chang,Wenliang Zhang,Muhammad Zakir,Rucha Apte,Haiyang He,Chao Wang,Jyh-Shing Roger Jang

We present four main contributions to enhance the performance of Large Language Models (LLMs) in generating domain-specific code: (i) utilizing LLM-based data splitting and data renovation techniques to improve the semantic representation of embeddings' space; (ii) introducing the Chain of Density for Renovation Credibility (CoDRC), driven by LLMs, and the Adaptive Text Renovation (ATR) algorithm for assessing data renovation reliability; (iii) developing the Implicit Knowledge Expansion and Contemplation (IKEC) Prompt technique; and (iv) effectively refactoring existing scripts to generate new and high-quality scripts with LLMs. By using engineering simulation software RedHawk-SC as a case study, we demonstrate the effectiveness of our data pre-processing method for expanding and categorizing scripts. When combined with IKEC, these techniques enhance the Retrieval-Augmented Generation (RAG) method in retrieving more relevant information, ultimately achieving a 73.33% "Percentage of Correct Lines" for code generation problems in MapReduce applications.

TOOLS · AI · Language modeling · Large language models · React

January 30, 2024

Perceptions and Detection of AI Use in Manuscript Preparation for Academic Journals

Nir Chemaya,Daniel Martin

The emergent abilities of Large Language Models (LLMs), which power tools like ChatGPT and Bard, have produced both excitement and worry about how AI will impact academic writing. In response to rising concerns about AI use, authors of academic publications may decide to voluntarily disclose any AI tools they use to revise their manuscripts, and journals and conferences could begin mandating disclosure and/or turn to using detection services, as many teachers have done with student writing in class settings. Given these looming possibilities, we investigate whether academics view it as necessary to report AI use in manuscript preparation and how detectors react to the use of AI in academic writing.

Automator · Regular · MoDELS · Diversity · Generative models

January 28, 2024

PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Models

Simin Chen,Xiaoning Feng,Xiaohong Han,Cong Liu,Wei Yang

from arxiv, This paper has been accepted to The ACM International Conference on the Foundations of Software Engineering FSE 2024

In recent times, a plethora of Large Code Generation Models (LCGMs) have been proposed, showcasing significant potential in assisting developers with complex programming tasks. Benchmarking LCGMs necessitates the creation of a set of diverse programming problems, and each problem comprises the prompt (including the task description), canonical solution, and test inputs. The existing methods for constructing such a problem set can be categorized into two main types: manual methods and perturbation-based methods. However, manual methods demand high effort and lack scalability, while also risking data integrity due to LCGMs' potentially contaminated data collection, and perturbation-based approaches mainly generate semantically homogeneous problems with the same canonical solutions and introduce typos that can be easily auto-corrected by an IDE, making them ineffective and unrealistic. In this work, we propose the idea of programming problem merging (PPM) and provide two implementations of this idea. We utilize our tool on two widely-used datasets and compare it against nine baseline methods using eight code generation models. The results demonstrate the effectiveness of our tool in generating more challenging, diverse, and natural programming problems compared to the baselines.

search engine · Engineering · Biased · Lecture notes · Performer

January 27, 2024

Navigating the Post-API Dilemma: Search Engine Results Pages Present a Biased View of Social Media Data

Amrit Poudel,Tim Weninger

Recent decisions to discontinue access to social media APIs are having detrimental effects on Internet research and the field of computational social science as a whole. This lack of access to data has been dubbed the Post-API era of Internet research. Fortunately, popular search engines have the means to crawl, capture, and surface social media data on their Search Engine Results Pages (SERP) if provided the proper search query, and may provide a solution to this dilemma. In the present work we ask: does SERP provide a complete and unbiased sample of social media data? Is SERP a viable alternative to direct API-access? To answer these questions, we perform a comparative analysis between (Google) SERP results and nonsampled data from Reddit and Twitter/X. We find that SERP results are highly biased in favor of popular posts; against political, pornographic, and vulgar posts; are more positive in their sentiment; and have large topical gaps. Overall, we conclude that SERP is not a viable alternative to social media API access.

Graph · Networking · Learned · Performer · Deep learning

October 9, 2020

Temporal Graph Networks for Deep Learning on Dynamic Graphs

Emanuele Rossi,Ben Chamberlain,Fabrizio Frasca,Davide Eynard,Federico Monti,Michael Bronstein

Graph Neural Networks (GNNs) have recently become increasingly popular due to their ability to learn complex systems of relations or interactions arising in a broad spectrum of problems ranging from biology and particle physics to social networks and recommendation systems. Despite the plethora of different models for deep learning on graphs, few approaches have been proposed thus far for dealing with graphs that present some sort of dynamic nature (e.g. evolving features or connectivity over time). In this paper, we present Temporal Graph Networks (TGNs), a generic, efficient framework for deep learning on dynamic graphs represented as sequences of timed events. Thanks to a novel combination of memory modules and graph-based operators, TGNs are able to significantly outperform previous approaches while at the same time being more computationally efficient. We furthermore show that several previous models for learning on dynamic graphs can be cast as specific instances of our framework. We perform a detailed ablation study of different components of our framework and devise the best configuration that achieves state-of-the-art performance on several transductive and inductive prediction tasks for dynamic graphs.