This paper presents a comprehensive comparative analysis of Large Language Models (LLMs) for generating code documentation. Code documentation is an essential part of the software writing process. The paper evaluates GPT-3.5, GPT-4, Bard, Llama 2, and StarChat on parameters such as Accuracy, Completeness, Relevance, Understandability, Readability, and Time Taken for different levels of code documentation. Our evaluation employs a checklist-based system to minimize subjectivity, providing a more objective assessment. We find that, barring StarChat, all LLMs consistently outperform the original documentation. Notably, the closed-source models GPT-3.5, GPT-4, and Bard exhibit superior performance across various parameters compared to the open-source/source-available LLMs, namely Llama 2 and StarChat. In terms of generation time, GPT-4 took the longest, followed by Llama 2 and Bard, while ChatGPT and StarChat had comparable generation times. Additionally, file-level documentation performed considerably worse across all parameters (except time taken) than inline and function-level documentation.

Related content

Large Language Models

Large language models are deep learning models trained on massive amounts of text data. They can not only generate natural-language text but also understand its meaning in depth and handle a variety of natural language tasks such as summarization, question answering, and translation. In 2023, large language models and their applications in artificial intelligence became a focus of global technology research. Their growth in scale has been especially striking, with parameter counts rising from just over a billion at first to as many as a trillion today. Larger parameter counts let the models capture the subtleties of human language more finely and understand its complexity more deeply. Over the past year, large language models have improved markedly in absorbing new knowledge, decomposing complex tasks, and aligning images with text. As the technology matures, its range of applications will keep expanding, providing more intelligent and personalized services and further improving how people live and work.

Scaling · BLEU · Performer · Language modeling · Score

February 6, 2024

Scaling Laws for Downstream Task Performance of Large Language Models

Berivan Isik,Natalia Ponomareva,Hussein Hazimeh,Dimitris Paparas,Sergei Vassilvitskii,Sanmi Koyejo

Scaling laws provide important insights that can guide the design of large language models (LLMs). Existing work has primarily focused on studying scaling laws for pretraining (upstream) loss. However, in transfer learning settings, in which LLMs are pretrained on an unsupervised dataset and then finetuned on a downstream task, we often also care about the downstream performance. In this work, we study the scaling behavior in a transfer learning setting, where LLMs are finetuned for machine translation tasks. Specifically, we investigate how the choice of the pretraining data and its size affect downstream performance (translation quality) as judged by two metrics: downstream cross-entropy and BLEU score. Our experiments indicate that the size of the finetuning dataset and the distribution alignment between the pretraining and downstream data significantly influence the scaling behavior. With sufficient alignment, both downstream cross-entropy and BLEU score improve monotonically with more pretraining data. In such cases, we show that it is possible to predict the downstream BLEU score with good accuracy using a log-law. However, there are also cases where moderate misalignment causes the BLEU score to fluctuate or get worse with more pretraining, whereas downstream cross-entropy monotonically improves. By analyzing these observations, we provide new practical insights for choosing appropriate pretraining data.
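The idea of predicting a downstream score from pretraining data size can be illustrated with a small curve fit. The sketch below assumes a deliberately simplified form, score ≈ a + b · ln(D), which is not necessarily the functional form used in the paper; the function names and the data points are hypothetical and made up for illustration.

```python
import math

def fit_log_law(sizes, scores):
    """Ordinary least-squares fit of score ~ a + b * ln(size).

    Assumption: this simplified two-parameter form stands in for the
    paper's log-law, whose exact form is not given in the abstract.
    """
    xs = [math.log(s) for s in sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def predict(a, b, size):
    """Predicted score at a given pretraining data size."""
    return a + b * math.log(size)

# Hypothetical data generated exactly from a = 2.0, b = 1.5 (not real BLEU scores).
sizes = [1e6, 1e7, 1e8, 1e9]
scores = [2.0 + 1.5 * math.log(s) for s in sizes]
a, b = fit_log_law(sizes, scores)
```

A fit like this is only meaningful in the well-aligned regime the abstract describes; where BLEU fluctuates or degrades with more pretraining data, a monotone curve would be the wrong model.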

Integration · Code · INFORMS · TOOLS · Identifiable

February 6, 2024

Enhancing LLM-Based Coding Tools through Native Integration of IDE-Derived Static Context

Yichen Li,Yun Peng,Yintong Huo,Michael R. Lyu

Large Language Models (LLMs) have achieved remarkable success in code completion, as evidenced by their essential roles in developing code assistant services such as Copilot. Being trained on in-file contexts, current LLMs are quite effective in completing code for single source files. However, it is challenging for them to conduct repository-level code completion for large software projects that require cross-file information. Existing research on LLM-based repository-level code completion identifies and integrates cross-file contexts, but it suffers from low accuracy and the limited context length of LLMs. In this paper, we argue that Integrated Development Environments (IDEs) can provide direct, accurate, and real-time cross-file information for repository-level code completion. We propose IDECoder, a practical framework that leverages IDE-native static contexts for cross-context construction and diagnosis results for self-refinement. IDECoder utilizes the rich cross-context information available in IDEs to enhance the capabilities of LLMs in repository-level code completion. We conducted preliminary experiments to validate the performance of IDECoder and observed that this synergy represents a promising trend for future exploration.

ChatGPT · Code · Readability · Model evaluation · Processing (programming language)

February 5, 2024

User-Centric Evaluation of ChatGPT Capability of Generating R Program Code

Tanha Miah,Hong Zhu

from arxiv, The paper has been submitted to the journal Electronics for consideration for publication. It is currently under review.

This paper reports an evaluation of ChatGPT's capability of generating R programming language code from natural language input. A dataset specially designed for generating R program code was constructed with metadata to support scenario-based testing and evaluation of code generation capabilities in various usage scenarios of different levels of difficulty and different types of programs. The evaluation follows a multiple-attempt process in which the tester tries to complete the code generation task through a number of attempts until a satisfactory solution is obtained or gives up after a fixed maximum number of attempts. In each attempt, the tester formulates a natural language input to ChatGPT based on the previous results and the task to be completed. In addition to the average number of attempts and the average amount of time taken to complete the tasks, the final generated solutions are assessed on a number of quality attributes, including accuracy, completeness, conciseness, readability, well-structuredness, logic clarity, depth of explanation, and coverage of parameters. Our experiments demonstrated that ChatGPT is in general highly capable of generating high-quality R program code as well as textual explanations, although it may fail on hard programming tasks. The experimental data also show that human developers can hardly learn naturally from experience to improve their skill in using ChatGPT to generate code.

Controller · Denoising · Scenario · Forward · Processing (programming language)

February 3, 2024

Denoising Diffusion-Based Control of Nonlinear Systems

Karthik Elamvazhuthi,Darshan Gadginmath,Fabio Pasqualetti

We propose a novel approach based on Denoising Diffusion Probabilistic Models (DDPMs) to control nonlinear dynamical systems. DDPMs are state-of-the-art generative models that have achieved success in a wide variety of sampling tasks. In our framework, we pose the feedback control problem as a generative task of drawing samples from a target set under control system constraints. The forward process of DDPMs constructs trajectories originating from a target set by adding noise. We learn to control a dynamical system in reverse such that the terminal state belongs to the target set. For control-affine systems without drift, we prove that the control system can exactly track the trajectory of the forward process in reverse whenever the Lie bracket based condition for controllability holds. We numerically study our approach on various nonlinear systems and verify our theoretical results. We also conduct numerical experiments on a physics engine for cases beyond our theoretical results.

Particle swarm optimization · Optimizer · Smoothing · Feasible · Generative methods

February 2, 2024

Efficient and Interaction-Aware Trajectory Planning for Autonomous Vehicles with Particle Swarm Optimization

Lin Song,David Isele,Naira Hovakimyan,Sangjae Bae

This paper introduces a novel numerical approach to achieving smooth lane-change trajectories in autonomous driving scenarios. Our trajectory generation approach leverages particle swarm optimization (PSO) techniques, incorporating Neural Network (NN) predictions for trajectory refinement. The generation of smooth and dynamically feasible trajectories for the lane change maneuver is facilitated by combining polynomial curve fitting with particle propagation, which can account for vehicle dynamics. The proposed planning algorithm is capable of determining feasible trajectories with real-time computation capability. We conduct comparative analyses with two baseline methods for lane changing, involving analytic solutions and heuristic techniques in numerical simulations. The simulation results validate the efficacy and effectiveness of our proposed approach.
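For readers unfamiliar with particle swarm optimization, a generic textbook-style PSO loop can be sketched as follows. This is not the paper's planner: it omits the neural network predictions, polynomial curve fitting, and vehicle dynamics, and the toy cost function, parameter values, and all names are assumptions chosen only for illustration.

```python
import random

def pso_minimize(objective, dim, bounds, n_particles=30, n_iters=200,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Generic PSO: each particle is pulled toward its own best position
    and toward the best position found by the whole swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]               # per-particle best positions
    best_val = [objective(p) for p in pos]       # per-particle best costs
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]   # swarm-wide best
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Clamp to the search box so particles stay feasible.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val

def toy_cost(x):
    """Hypothetical stand-in for a trajectory cost; minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

best_x, best_cost = pso_minimize(toy_cost, dim=2, bounds=(-5.0, 5.0))
```

In a planning setting, the decision vector would encode trajectory parameters and the objective would score smoothness and feasibility, but those details are specific to the paper and are not reproduced here.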

Normalized · Analysis · Scenario · Transformation · MoDELS

February 2, 2024

Two Approaches to Diachronic Normalization of Polish Texts

Kacper Dudzic,Filip Graliński,Krzysztof Jassem,Marek Kubis,Piotr Wierzchoń

from arxiv, Accepted to the LaTeCH-CLfL 2024 workshop

This paper discusses two approaches to the diachronic normalization of Polish texts: a rule-based solution that relies on a set of handcrafted patterns, and a neural normalization model based on the text-to-text transfer transformer architecture. The training and evaluation data prepared for the task are discussed in detail, along with experiments conducted to compare the proposed normalization solutions. A quantitative and qualitative analysis is made. It is shown that at the current stage of inquiry into the problem, the rule-based solution outperforms the neural one on 3 out of 4 variants of the prepared dataset, although in practice both approaches have distinct advantages and disadvantages.

GPU · INTERACT · Neural Networks · Graph · Networking

December 22, 2020

A Hierarchical Reasoning Graph Neural Network for The Automatic Scoring of Answer Transcriptions in Video Job Interviews

Kai Chen,Meng Niu,Qingcai Chen

from arxiv, 9 pages, 2 figures

We address the task of automatically scoring the competency of candidates based on textual features from the automatic speech recognition (ASR) transcriptions in asynchronous video job interviews (AVI). The key challenge is how to construct the dependency relation between questions and answers and conduct semantic-level interaction for each question-answer (QA) pair. Most recent studies in AVI focus on how to represent questions and answers better but ignore the dependency information and interaction between them, which is critical for QA evaluation. In this work, we propose a Hierarchical Reasoning Graph Neural Network (HRGNN) for the automatic assessment of question-answer pairs. Specifically, we construct a sentence-level relational graph neural network to capture the dependency information of sentences in or between the question and the answer. Based on these graphs, we employ a semantic-level reasoning graph attention network to model the interaction states of the current QA session. Finally, we propose a gated recurrent unit encoder to represent the temporal question-answer pairs for the final prediction. Empirical results on CHNAT (a real-world dataset) validate that our proposed model significantly outperforms text-matching based benchmark models. Ablation studies and experimental results with 10 random seeds also show the effectiveness and stability of our models.

Graph · Link prediction · Orthogonal · Knowledge graph · Better

April 15, 2020

Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding

Yun Tang,Jing Huang,Guangtao Wang,Xiaodong He,Bowen Zhou

from arxiv, Accepted by ACL 2020

Translational distance-based knowledge graph embedding has shown progressive improvements on the link prediction task, from TransE to the latest state-of-the-art RotatE. However, N-1, 1-N and N-N predictions remain challenging. In this work, we propose a novel translational distance-based approach for knowledge graph link prediction. The proposed method is two-fold. First, we extend RotatE from the 2D complex domain to a high-dimensional space with orthogonal transforms to model relations with greater capacity. Second, the graph context is explicitly modeled via two directed context representations. These context representations are used as part of the distance scoring function to measure the plausibility of the triples during training and inference. The proposed approach effectively improves prediction accuracy on the difficult N-1, 1-N and N-N cases for the knowledge graph link prediction task. The experimental results show that it achieves better performance on two benchmark data sets compared to the baseline RotatE, especially on the data set (FB15k-237) with many high in-degree connection nodes.
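As background for the baseline mentioned above, RotatE models each relation as an element-wise rotation in the complex plane, so a triple is plausible when rotating the head embedding lands close to the tail embedding. The sketch below illustrates that baseline distance only, not the orthogonal-transform extension or the graph context terms proposed in the paper; summing per-dimension moduli is one common choice and is an assumption here, as are the function name and toy embeddings.

```python
import cmath
import math

def rotate_distance(head, phases, tail):
    """RotatE-style distance between a rotated head and a tail.

    head, tail: lists of complex numbers (entity embeddings).
    phases: list of angles in radians; each relation component is the
    unit-modulus complex number exp(i * phase).
    Smaller distance means a more plausible triple.
    """
    total = 0.0
    for h, theta, t in zip(head, phases, tail):
        rotated = h * cmath.exp(1j * theta)  # rotate h by angle theta
        total += abs(rotated - t)            # modulus of the residual
    return total

# Toy embeddings (hypothetical): a quarter-turn maps this head onto this tail.
head = [1 + 0j, 0 + 1j]
phases = [math.pi / 2, math.pi / 2]
good_tail = [0 + 1j, -1 + 0j]
bad_tail = [1 + 0j, 0 + 1j]
good = rotate_distance(head, phases, good_tail)
bad = rotate_distance(head, phases, bad_tail)
```

Because each relation component has unit modulus, the rotation preserves the norm of the head embedding; an orthogonal transform is the natural higher-dimensional analogue of that property.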

Named entity recognition · entity · Learning · Deep learning · Identifiable

March 13, 2020

A Survey on Deep Learning for Named Entity Recognition

Jing Li,Aixin Sun,Jianglei Han,Chenliang Li

from arxiv, 20 pages, 12 figures, 3 tables. arXiv admin note: text overlap with arXiv:1702.02098, arXiv:1904.10503 by other authors

Named entity recognition (NER) is the task of identifying text spans that mention named entities and classifying them into predefined categories such as person, location, and organization. NER serves as the basis for a variety of natural language applications such as question answering, text summarization, and machine translation. Although early NER systems are successful in producing decent recognition accuracy, they often require much human effort in carefully designing rules or features. In recent years, deep learning, empowered by continuous real-valued vector representations and semantic composition through nonlinear processing, has been employed in NER systems, yielding state-of-the-art performance. In this paper, we provide a comprehensive review of existing deep learning techniques for NER. We first introduce NER resources, including tagged NER corpora and off-the-shelf NER tools. Then, we systematically categorize existing works based on a taxonomy along three axes: distributed representations for input, context encoder, and tag decoder. Next, we survey the most representative methods for recently applied deep learning techniques in new NER problem settings and applications. Finally, we present readers with the challenges faced by NER systems and outline future directions in this area.

contrastive · Contrastive learning · Learning · SimPLe · SimCLR

February 13, 2020

A Simple Framework for Contrastive Learning of Visual Representations

Ting Chen,Simon Kornblith,Mohammad Norouzi,Geoffrey Hinton

This paper presents SimCLR: a simple framework for contrastive learning of visual representations. We simplify recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framework. We show that (1) composition of data augmentations plays a critical role in defining effective predictive tasks, (2) introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and (3) contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning. By combining these findings, we are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet. A linear classifier trained on self-supervised representations learned by SimCLR achieves 76.5% top-1 accuracy, which is a 7% relative improvement over previous state-of-the-art, matching the performance of a supervised ResNet-50. When fine-tuned on only 1% of the labels, we achieve 85.8% top-5 accuracy, outperforming AlexNet with 100X fewer labels.
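The contrastive objective at the heart of SimCLR, the normalized temperature-scaled cross-entropy (NT-Xent) loss, can be sketched in a few lines of plain Python. This is a minimal illustration on tiny hand-written vectors, assuming embeddings are stored so that indices 2k and 2k+1 hold the two augmented views of the same image; that layout, the function names, and the toy inputs are assumptions, and a real implementation would use a tensor library with batched operations.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nt_xent(z, temperature=0.5):
    """Average NT-Xent loss over 2N embeddings.

    Assumption: z[2k] and z[2k + 1] are two views of the same image,
    so the positive partner of index i is i ^ 1.
    """
    n = len(z)
    total = 0.0
    for i in range(n):
        j = i ^ 1
        pos = cosine(z[i], z[j]) / temperature
        # Denominator covers every other embedding, positive included.
        logits = [cosine(z[i], z[k]) / temperature for k in range(n) if k != i]
        log_denom = math.log(sum(math.exp(l) for l in logits))
        total += log_denom - pos
    return total / n

# Toy embeddings (hypothetical): views agree in `aligned`, disagree in `mismatched`.
aligned = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
mismatched = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
loss_aligned = nt_xent(aligned)
loss_mismatched = nt_xent(mismatched)
```

The loss is lower when the two views of each image map to similar embeddings and different images map apart, which is the behavior the learnable projection head and the data augmentations are trained to produce.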