Machine Learning · Language Modeling · Learning · Code · MoDELS ·

November 16, 2023

ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks

Yuliang Liu,Xiangru Tang,Zefan Cai,Junjie Lu,Yichi Zhang,Yanjun Shao,Zexuan Deng,Helan Hu,Zengxian Yang,Kaikai An,Ruijun Huang,Shuzheng Si,Sheng Chen,Haozhe Zhao,Zhengliang Li,Liang Chen,Yiming Zong,Yan Wang,Tianyu Liu,Zhiwei Jiang,Baobao Chang,Yujia Qin,Wangchunshu Zhou,Yilun Zhao,Arman Cohan,Mark Gerstein

Large language models have shown promising performance in code generation benchmarks. However, a considerable divide exists between these benchmark achievements and their practical applicability, primarily attributed to real-world programming's reliance on pre-existing libraries. Instead of evaluating how well LLMs code from scratch, this work proposes a new evaluation setup where LLMs use open-source libraries to finish machine learning tasks. To that end, we propose ML-Bench, an expansive benchmark developed to assess the effectiveness of LLMs in leveraging existing functions in open-source libraries. ML-Bench consists of 10044 samples spanning 130 tasks over 14 notable machine learning GitHub repositories. In this setting, given a specific machine learning task instruction and the accompanying README in a codebase, an LLM is tasked to generate code to accomplish the task. This necessitates the comprehension of long and language-code interleaved documents, as well as the understanding of complex cross-file code structures, introducing new challenges. Notably, while GPT-4 exhibits remarkable improvement over other LLMs, it manages to accomplish only 39.73% of the tasks, leaving a huge space for improvement. We address these challenges by proposing ML-Agent, designed to effectively navigate the codebase, locate documentation, retrieve code, and generate executable code. Empirical results demonstrate that ML-Agent, built upon GPT-4, results in further improvements. Code, data, and models are available at //ml-bench.github.io/.

Related content

Machine Learning

Machine Learning is an international forum for research on computational approaches to learning. The journal publishes articles reporting substantive results on a wide range of learning methods applied to a variety of learning problems. Its featured papers describe research on problems and methods, applications research, and issues of research methodology. Papers about learning problems or methods provide solid support through empirical studies, theoretical analysis, or comparison to psychological phenomena. Applications papers show how to apply learning methods to solve important application problems. Research methodology papers improve how machine learning research is conducted. All papers describe their supporting evidence in ways that other researchers can verify or replicate. Papers also detail the learning component and discuss assumptions regarding knowledge representation and the performance task. Official website:

Cluster · Language Modeling · MoDELS · Pseudo-labeling · Text Classification ·

January 8, 2024

IDoFew: Intermediate Training Using Dual-Clustering in Language Models for Few Labels Text Classification

Abdullah Alsuhaibani,Hamad Zogan,Imran Razzak,Shoaib Jameel,Guandong Xu

from arxiv, Published in The 17th ACM International Conference on Web Search and Data Mining

Language models such as Bidirectional Encoder Representations from Transformers (BERT) have been very effective in various Natural Language Processing (NLP) and text mining tasks including text classification. However, some tasks still pose challenges for these models, including text classification with limited labels. This can result in a cold-start problem. Although some approaches have attempted to address this problem through single-stage clustering as an intermediate training step coupled with a pre-trained language model, which generates pseudo-labels to improve classification, these methods are often error-prone due to the limitations of the clustering algorithms. To overcome this, we have developed a novel two-stage intermediate clustering with subsequent fine-tuning that models the pseudo-labels reliably, resulting in reduced prediction errors. The key novelty in our model, IDoFew, is that the two-stage clustering coupled with two different clustering algorithms helps exploit the advantages of the complementary algorithms that reduce the errors in generating reliable pseudo-labels for fine-tuning. Our approach has shown significant improvements compared to strong comparative models.

MoDELS · Robustness · Language Modeling · Inference · Performer ·

January 7, 2024

ROIC-DM: Robust Text Inference and Classification via Diffusion Model

Shilong Yuan,Wei Yuan,Tieke HE

from arxiv, AAAI 2024

While language models have made many milestones in text inference and classification tasks, they remain susceptible to adversarial attacks that can lead to unforeseen outcomes. Existing works alleviate this problem by equipping language models with defense patches. However, these defense strategies often rely on impractical assumptions or entail substantial sacrifices in model performance. Consequently, enhancing the resilience of the target model using such defense mechanisms is a formidable challenge. This paper introduces an innovative model for robust text inference and classification, built upon diffusion models (ROIC-DM). Benefiting from its training involving denoising stages, ROIC-DM inherently exhibits greater robustness compared to conventional language models. Moreover, ROIC-DM can attain comparable, and in some cases, superior performance to language models, by effectively incorporating them as advisory components. Extensive experiments conducted with several strong textual adversarial attacks on three datasets demonstrate that (1) ROIC-DM outperforms traditional language models in robustness, even when the latter are fortified with advanced defense mechanisms; (2) ROIC-DM can achieve comparable and even better performance than traditional language models by using them as advisors.

MoDELS · Cluster · Processing (programming language) · Galaxy · Performer ·

January 6, 2024

Poisson Cluster Process Models for Detecting Ultra-Diffuse Galaxies

Dayi Li,Alex Stringer,Patrick E. Brown,Gwendolyn M. Eadie,Roberto G. Abraham

from arxiv, 58 pages, 8 figures, 3 tables; submitted to AoAS, comments are welcome

We propose a novel set of Poisson Cluster Process (PCP) models to detect Ultra-Diffuse Galaxies (UDGs), a class of extremely faint, enigmatic galaxies of substantial interest in modern astrophysics. We model the unobserved UDG locations as parent points in a PCP, and infer their positions based on the observed spatial point patterns of their old star cluster systems. Many UDGs have somewhere from a few to hundreds of these old star clusters, which we treat as offspring points in our models. We also present a new framework to construct a marked PCP model using the marks of star clusters. The marked PCP model may enhance the detection of UDGs and offers broad applicability to problems in other disciplines. To assess the overall model performance, we design an innovative assessment tool for spatial prediction problems where only point-referenced ground truth is available, overcoming the limitation of standard ROC analyses where spatial Boolean reference maps are required. We construct a bespoke blocked Gibbs adaptive spatial birth-death-move MCMC algorithm to infer the locations of UDGs using real data from a Hubble Space Telescope imaging survey. Based on our performance assessment tool, our novel models significantly outperform existing approaches using the Log-Gaussian Cox Process. We also obtained preliminary evidence that the marked PCP model improves UDG detection performance compared to the model without marks. Furthermore, we find evidence of a potential new "dark galaxy" that was not detected by previous methods.
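
To make the parent/offspring structure concrete, the sketch below simulates a simple Poisson cluster process on the unit square. It is only an illustration of the generative idea, not the authors' model or their MCMC inference: the Gaussian displacement of offspring (a Thomas-type process), the function names, and all parameter values are assumptions chosen for the example.

```python
import math
import random


def sample_poisson(rng, mean):
    """Draw one Poisson(mean) variate with Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_cluster_process(rng, parent_rate, mean_offspring, spread):
    """Simulate a Thomas-type Poisson cluster process on the unit square.

    Parents (hypothetical galaxy centres) are uniform on the square. Each
    parent gets a Poisson number of offspring (hypothetical star clusters)
    displaced by isotropic Gaussian noise. Offspring may fall outside the
    square; this sketch does not clip them.
    """
    n_parents = sample_poisson(rng, parent_rate)
    parents = [(rng.random(), rng.random()) for _ in range(n_parents)]
    offspring = []
    for parent_index, (px, py) in enumerate(parents):
        for _ in range(sample_poisson(rng, mean_offspring)):
            point = (rng.gauss(px, spread), rng.gauss(py, spread), parent_index)
            offspring.append(point)
    return parents, offspring


rng = random.Random(0)
parents, offspring = simulate_cluster_process(
    rng, parent_rate=5, mean_offspring=20, spread=0.02
)
print(len(parents), "parents,", len(offspring), "offspring")
```

In the detection setting described in the abstract, only the offspring locations would be observed, and the parent locations are what the model tries to infer.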

Code · MoDELS · Understandability · Exact · Score ·

January 5, 2024

AST-T5: Structure-Aware Pretraining for Code Generation and Understanding

Linyuan Gong,Mostafa Elhoushi,Alvin Cheung

Large language models (LLMs) have made significant advancements in code-related tasks, yet many LLMs treat code as simple sequences, neglecting its structured nature. We introduce AST-T5, a novel pretraining paradigm that leverages the Abstract Syntax Tree (AST) for enhanced code generation, transpilation, and understanding. Using dynamic programming, our AST-Aware Segmentation retains code structure, while our AST-Aware Span Corruption objective equips the model to reconstruct various code structures. Unlike other models, AST-T5 avoids intricate program analyses or architectural changes, so it integrates seamlessly with any encoder-decoder Transformer. Evaluations show that AST-T5 consistently outperforms similar-sized LMs across various code-related tasks. Structure-awareness makes AST-T5 particularly powerful in code-to-code tasks, surpassing CodeT5 by 2 points in exact match score for the Bugs2Fix task and by 3 points in exact match score for Java-C# Transpilation in CodeXGLUE. Our code and model are publicly available at //github.com/gonglinyuan/ast_t5.
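
As a small illustration of what "structured nature" means here, the sketch below uses Python's standard `ast` module to list source spans that correspond to whole syntax-tree subtrees. This is not AST-T5's segmentation or span-corruption algorithm; the helper name `subtree_spans` and the choice of node types are assumptions made for the example. It only shows that subtrees map to contiguous text spans, which is the kind of unit a structure-aware objective could mask instead of arbitrary token runs.

```python
import ast

SOURCE = '''\
def add(a, b):
    return a + b

def scale(values, factor):
    return [v * factor for v in values]
'''


def subtree_spans(source, node_types=(ast.FunctionDef, ast.Return, ast.BinOp)):
    """Return (node type name, source text) for each matching AST subtree."""
    tree = ast.parse(source)
    spans = []
    # ast.walk visits every node; the visiting order is not specified.
    for node in ast.walk(tree):
        if isinstance(node, node_types):
            text = ast.get_source_segment(source, node)
            spans.append((type(node).__name__, text))
    return spans


for kind, text in subtree_spans(SOURCE):
    print(kind, "->", repr(text))
```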

NeRF · 3D · Latent · Independent · state-of-the-art ·

January 5, 2024

FED-NeRF: Achieve High 3D Consistency and Temporal Coherence for Face Video Editing on Dynamic NeRF

Hao Zhang,Yu-Wing Tai,Chi-Keung Tang

from arxiv, Our code will be available at: //github.com/ZHANG1023/FED-NeRF

The success of the GAN-NeRF structure has enabled face editing on NeRF to maintain 3D view consistency. However, simultaneously achieving multi-view consistency and temporal coherence while editing video sequences remains a formidable challenge. This paper proposes a novel face video editing architecture built upon the dynamic face GAN-NeRF structure, which effectively utilizes video sequences to restore the latent code and 3D face geometry. By editing the latent code, multi-view consistent editing on the face can be ensured, as validated by multiview stereo reconstruction on the resulting edited images in our dynamic NeRF. As the estimation of face geometries occurs on a frame-by-frame basis, this may introduce a jittering issue. We propose a stabilizer that maintains temporal coherence by preserving smooth changes of face expressions in consecutive frames. Quantitative and qualitative analyses reveal that our method, as the pioneering 4D face video editor, achieves state-of-the-art performance in comparison to existing 2D or 3D-based approaches independently addressing identity and motion. Codes will be released.

Large Language Models · Analysis · MoDELS · Extensibility · Performer ·

January 4, 2024

T-Eval: Evaluating the Tool Utilization Capability Step by Step

Zehui Chen,Weihua Du,Wenwei Zhang,Kuikun Liu,Jiangning Liu,Miao Zheng,Jingming Zhuo,Songyang Zhang,Dahua Lin,Kai Chen,Feng Zhao

from arxiv, Code: //github.com/open-compass/T-Eval; Website: //open-compass.github.io/T-Eval

Large language models (LLMs) have achieved remarkable performance on various NLP tasks and are augmented by tools for broader applications. Yet, how to evaluate and analyze the tool-utilization capability of LLMs is still under-explored. In contrast to previous works that evaluate models holistically, we comprehensively decompose the tool utilization into multiple sub-processes, including instruction following, planning, reasoning, retrieval, understanding, and review. Based on that, we further introduce T-Eval to evaluate the tool utilization capability step by step. T-Eval disentangles the tool utilization evaluation into several sub-domains along model capabilities, facilitating the inner understanding of both holistic and isolated competency of LLMs. We conduct extensive experiments on T-Eval and in-depth analysis of various LLMs. T-Eval not only exhibits consistency with the outcome-oriented evaluation but also provides a more fine-grained analysis of the capabilities of LLMs, providing a new perspective in LLM evaluation on tool-utilization ability. The benchmark will be available at //github.com/open-compass/T-Eval.

Language Modeling · Taxonomy · MoDELS · motivation · Reviewer ·

May 31, 2023

Beyond One-Model-Fits-All: A Survey of Domain Specialization for Large Language Models

Chen Ling,Xujiang Zhao,Jiaying Lu,Chengyuan Deng,Can Zheng,Junxiang Wang,Tanmoy Chowdhury,Yun Li,Hejie Cui,Xuchao Zhang,Tianjiao Zhao,Amit Panalkar,Wei Cheng,Haoyu Wang,Yanchi Liu,Zhengzhang Chen,Haifeng Chen,Chris White,Quanquan Gu,Carl Yang,Liang Zhao

Large language models (LLMs) have significantly advanced the field of natural language processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of applications. The great promise of LLMs as general task solvers has motivated people to extend their functionality well beyond that of a "chatbot", and to use them as assistants to, or even replacements for, domain experts and tools in specific domains such as healthcare, finance, and education. However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles, caused by the heterogeneity of domain data, the sophistication of domain knowledge, the uniqueness of domain objectives, and the diversity of the constraints (e.g., various social norms, cultural conformity, religious beliefs, and ethical standards in the domain applications). To fill this gap, a rapidly growing body of research and practice on the domain specialization of LLMs has emerged in very recent years, which in turn calls for a comprehensive and systematic review to better summarize and guide this promising area. In this survey paper, first, we propose a systematic taxonomy that categorizes the LLM domain-specialization techniques based on the accessibility to LLMs and summarizes the framework for all the subcategories as well as their relations and differences to each other. We also present a comprehensive taxonomy of critical application domains that can benefit from specialized LLMs, discussing their practical significance and open challenges. Furthermore, we offer insights into the current research status and future trends in this area.

Language Modeling · Performer · Agent · MoDELS · Learning ·

May 19, 2023

Introspective Tips: Large Language Model for In-Context Decision Making

Liting Chen,Lu Wang,Hang Dong,Yali Du,Jie Yan,Fangkai Yang,Shuang Li,Pu Zhao,Si Qin,Saravan Rajmohan,Qingwei Lin,Dongmei Zhang

from arxiv, 22 pages, 4 figures

The emergence of large language models (LLMs) has substantially influenced natural language processing, demonstrating exceptional results across various tasks. In this study, we employ "Introspective Tips" to facilitate LLMs in self-optimizing their decision-making. By introspectively examining trajectories, the LLM refines its policy by generating succinct and valuable tips. Our method enhances the agent's performance in both few-shot and zero-shot learning situations by considering three essential scenarios: learning from the agent's past experiences, integrating expert demonstrations, and generalizing across diverse games. Importantly, we accomplish these improvements without fine-tuning the LLM parameters; rather, we adjust the prompt to generalize insights from the three aforementioned situations. Our framework not only supports but also emphasizes the advantage of employing LLMs in in-context decision-making. Experiments involving over 100 games in TextWorld illustrate the superior performance of our approach.

Pyramid · MoDELS · Extensibility · state-of-the-art · Performer ·

December 1, 2022

Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis

Wan-Cyuan Fan,Yen-Chun Chen,Dongdong Chen,Yu Cheng,Lu Yuan,Yu-Chiang Frank Wang

from arxiv, AAAI 2023

Diffusion models (DMs) have shown great potential for high-quality image synthesis. However, when it comes to producing images with complex scenes, how to properly describe both image global structures and object details remains a challenging task. In this paper, we present Frido, a Feature Pyramid Diffusion model performing a multi-scale coarse-to-fine denoising process for image synthesis. Our model decomposes an input image into scale-dependent vector quantized features, followed by a coarse-to-fine gating for producing image output. During the above multi-scale representation learning stage, additional input conditions like text, scene graph, or image layout can be further exploited. Thus, Frido can also be applied for conditional or cross-modality image synthesis. We conduct extensive experiments over various unconditioned and conditional image generation tasks, ranging from text-to-image synthesis, layout-to-image, scene-graph-to-image, to label-to-image. More specifically, we achieved state-of-the-art FID scores on five benchmarks, namely layout-to-image on COCO and OpenImages, scene-graph-to-image on COCO and Visual Genome, and label-to-image on COCO. Code is available at //github.com/davidhalladay/Frido.

Language Modeling · MoDELS · Position Embedding · Autoencoder · Mask ·

February 28, 2020

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

Hangbo Bao,Li Dong,Furu Wei,Wenhui Wang,Nan Yang,Xiaodong Liu,Yu Wang,Songhao Piao,Jianfeng Gao,Ming Zhou,Hsiao-Wuen Hon

from arxiv, 11 pages

We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM). Given an input text with masked tokens, we rely on conventional masks to learn inter-relations between corrupted tokens and context via autoencoding, and pseudo masks to learn intra-relations between masked spans via partially autoregressive modeling. With well-designed position embeddings and self-attention masks, the context encodings are reused to avoid redundant computation. Moreover, conventional masks used for autoencoding provide global masking information, so that all the position embeddings are accessible in partially autoregressive language modeling. In addition, the two tasks pre-train a unified language model as a bidirectional encoder and a sequence-to-sequence decoder, respectively. Our experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks across several widely used benchmarks.
