
The emergence of accurate open large language models (LLMs) has led to a race towards quantization techniques that enable such models to run on end-user devices. In this paper, we revisit the problem of "extreme" LLM compression, defined as targeting extremely low bit counts such as 2 to 3 bits per parameter, from the point of view of classic methods in Multi-Codebook Quantization (MCQ). Our work builds on Additive Quantization, a classic algorithm from the MCQ family, and adapts it to the quantization of language models. The resulting algorithm advances the state of the art in LLM compression, outperforming all recently proposed techniques in terms of accuracy at a given compression budget. For instance, when compressing Llama 2 models to 2 bits per parameter, our algorithm quantizes the 7B model to 6.93 perplexity (a 1.29 improvement relative to the best prior work, and 1.81 points from FP16), the 13B model to 5.70 perplexity (a 0.36 improvement) and the 70B model to 3.94 perplexity (a 0.22 improvement) on WikiText2. We release our implementation of Additive Quantization for Language Models (AQLM) as a baseline to facilitate future research in LLM quantization.
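The multi-codebook idea behind Additive Quantization can be illustrated with a small sketch: each group of weights is stored as a tuple of codes and decoded as the sum of one codeword per codebook. This is a toy under stated assumptions, not the released AQLM implementation. The function names, the tiny codebooks, and the bit-count formula (which ignores the storage cost of the codebooks themselves and of any scales) are all illustrative.

```python
from typing import List, Sequence


def decode_group(codebooks: Sequence[Sequence[Sequence[float]]],
                 codes: Sequence[int]) -> List[float]:
    """Reconstruct one weight group as the sum of one codeword per codebook."""
    group_size = len(codebooks[0][0])
    out = [0.0] * group_size
    for codebook, code in zip(codebooks, codes):
        for i, value in enumerate(codebook[code]):
            out[i] += value
    return out


def bits_per_parameter(num_codebooks: int, bits_per_code: int,
                       group_size: int) -> float:
    """Code bits spent per weight (simplified: codebook storage is ignored)."""
    return num_codebooks * bits_per_code / group_size


# Hypothetical toy setup: 2 codebooks, 2 codewords each (1-bit codes),
# groups of 2 weights.
toy_codebooks = [
    [[0.0, 0.0], [1.0, -1.0]],   # codebook 0
    [[0.5, 0.5], [-0.5, 0.25]],  # codebook 1
]
approx_weights = decode_group(toy_codebooks, codes=[1, 0])
toy_bits = bits_per_parameter(num_codebooks=2, bits_per_code=1, group_size=2)
```

Under this simplified accounting, a single codebook with 16-bit codes over groups of 8 weights would come to 16 / 8 = 2 bits per parameter.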

Related content

Large language models


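Large language models are deep learning models trained on massive amounts of text data. They can not only generate natural-language text but also understand its meaning in depth and handle a variety of natural language tasks, such as text summarization, question answering, and translation. In 2023, large language models and their applications in artificial intelligence became a focus of technology research worldwide. Their growth in scale has been especially striking, with parameter counts rising from a little over one billion at the outset to one trillion today. The larger parameter counts let models capture the subtleties of human language more finely and understand its complexity more deeply. Over the past year, large language models have improved markedly in absorbing new knowledge, decomposing complex tasks, and aligning images with text. As the technology matures, they will keep broadening their range of applications, providing more intelligent and personalized services and further improving how people live and work.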

Robustness · Sample complexity · Linear · Markov · Processing (programming language)

March 19, 2024

Sample Complexity of Offline Distributionally Robust Linear Markov Decision Processes

He Wang,Laixi Shi,Yuejie Chi

from arxiv, under review

In offline reinforcement learning (RL), the absence of active exploration calls for attention to model robustness in order to tackle the sim-to-real gap, where the discrepancy between the simulated and deployed environments can significantly undermine the performance of the learned policy. To endow the learned policy with robustness in a sample-efficient manner in the presence of a high-dimensional state-action space, this paper considers the sample complexity of distributionally robust linear Markov decision processes (MDPs) with an uncertainty set characterized by the total variation distance, using offline data. We develop a pessimistic model-based algorithm and establish its sample complexity bound under minimal data coverage assumptions, which outperforms prior art by at least $\tilde{O}(d)$, where $d$ is the feature dimension. We further improve the performance guarantee of the proposed algorithm by incorporating a carefully designed variance estimator.

Processing (programming language) · MoDELS · Exchangeable · Functional · Kernelization

March 18, 2024

Limit Order Book Dynamics and Order Size Modelling Using Compound Hawkes Process

Konark Jain,Nick Firoozye,Jonathan Kochems,Philip Treleaven

from arxiv, Presented at Market Microstructure 2023, Accepted at Quantitative Finance Workshop 2024. To be submitted for publication to a journal

The Hawkes process has been used to model Limit Order Book (LOB) dynamics in several ways in the literature; however, the focus has been limited to capturing the inter-event times, while the order size is usually assumed to be constant. We propose a novel methodology that uses a Compound Hawkes Process for the LOB, where each event has an order size sampled from a calibrated distribution. The process is formulated in a novel way such that the spread of the process always remains positive. Further, we condition the model parameters on time of day to support empirical observations. We make use of an enhanced non-parametric method to calibrate the Hawkes kernels and allow for inhibitory cross-excitation kernels. We showcase the results and quality of fits for an equity stock's LOB on the NASDAQ exchange and compare them against several baselines. Finally, we conduct a market impact study of the simulator and show that the empirically observed concave market impact function is indeed replicated.
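To make the "compound" idea concrete, here is a minimal sketch of a marked Hawkes process: event times come from a self-exciting intensity, and each event carries an order size drawn from a separate distribution. It is heavily simplified relative to the abstract. It assumes a single event type, an exponential excitation kernel simulated with Ogata-style thinning, and a made-up size distribution, whereas the paper describes multivariate, non-parametrically calibrated kernels with inhibition. All names and parameter values are hypothetical.

```python
import math
import random
from typing import Callable, List, Tuple


def simulate_compound_hawkes(
    mu: float,
    alpha: float,
    beta: float,
    horizon: float,
    size_sampler: Callable[[random.Random], int],
    seed: int = 0,
) -> List[Tuple[float, int]]:
    """Simulate (time, order_size) events with intensity
    lambda(t) = mu + sum_i alpha * exp(-beta * (t - t_i)), for mu > 0, alpha >= 0."""
    rng = random.Random(seed)
    events: List[Tuple[float, int]] = []
    t = 0.0
    excitation = 0.0  # current value of the summed exponential kernels
    while True:
        # Between events the intensity only decays, so its current value
        # is a valid upper bound for the thinning step.
        upper = mu + excitation
        wait = rng.expovariate(upper)
        excitation *= math.exp(-beta * wait)
        t += wait
        if t >= horizon:
            break
        # Accept the candidate with probability lambda(t) / upper.
        if rng.random() * upper <= mu + excitation:
            events.append((t, size_sampler(rng)))
            excitation += alpha  # each accepted event excites future arrivals
    return events


# Hypothetical order sizes: whole lots of 100 shares, 1 to 5 lots per order.
def lot_sizes(rng: random.Random) -> int:
    return 100 * rng.randint(1, 5)


events = simulate_compound_hawkes(
    mu=0.5, alpha=0.8, beta=1.2, horizon=100.0, size_sampler=lot_sizes
)
```

In this sketch the order sizes do not feed back into the intensity. Keeping alpha / beta below 1 keeps the process from exploding.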

Optimizer · Basis vectors · Kronecker product · Vectorization · Generalization theory

March 18, 2024

Compositional Kronecker Context Optimization for Vision-Language Models

Kun Ding,Xiaohui Li,Qiang Yu,Ying Wang,Haojian Zhang,Shiming Xiang

Context Optimization (CoOp) has emerged as a simple yet effective technique for adapting CLIP-like vision-language models to downstream image recognition tasks. Nevertheless, learning compact context with satisfactory base-to-new, domain and cross-task generalization ability while adapting to new tasks is still a challenge. To tackle this challenge, we propose a lightweight yet generalizable approach termed Compositional Kronecker Context Optimization (CK-CoOp). Technically, the prompt's context words in CK-CoOp are learnable vectors, which are crafted by linearly combining base vectors sourced from a dictionary. These base vectors consist of a non-learnable component obtained by quantizing the weights in the token embedding layer, and a learnable component constructed by applying the Kronecker product to several tiny learnable matrices. Intuitively, the compositional structure mitigates the risk of overfitting on training data by retaining more pre-trained knowledge. Meanwhile, the Kronecker product breaks the non-learnable restrictions of the dictionary, thereby enhancing representation ability with minimal additional parameters. Extensive experiments confirm that CK-CoOp not only achieves state-of-the-art performance under base-to-new, domain and cross-task generalization evaluation, but also requires fewer learnable parameters and offers efficient training and inference.
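The parameter saving from the Kronecker product can be seen in a short sketch: two tiny matrices of shapes (a x b) and (c x d) expand into an (ac x bd) matrix while holding only ab + cd learnable values. The code below is illustrative only. The abstract does not say how the frozen and learnable components are merged, so adding them elementwise is an assumption made here, as are all function names and shapes.

```python
from typing import List

Matrix = List[List[float]]


def kron(a: Matrix, b: Matrix) -> Matrix:
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [x * y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


def context_vectors(coeffs: Matrix, frozen: Matrix,
                    factor_a: Matrix, factor_b: Matrix) -> Matrix:
    """Linearly combine base vectors, where each base vector is a frozen
    dictionary row plus a learnable Kronecker-structured row.
    ASSUMPTION: the two components are merged by elementwise addition."""
    learnable = kron(factor_a, factor_b)
    base = [
        [f + l for f, l in zip(frozen_row, learnable_row)]
        for frozen_row, learnable_row in zip(frozen, learnable)
    ]
    dim = len(base[0])
    return [
        [sum(c * base[k][j] for k, c in enumerate(row)) for j in range(dim)]
        for row in coeffs
    ]


# Hypothetical toy sizes: two 2x2 factors give a 4x4 learnable matrix
# from only 8 parameters instead of 16.
factor_a = [[1, 2], [3, 4]]
factor_b = [[0, 1], [1, 0]]
frozen_dictionary = [[0, 0, 0, 0] for _ in range(4)]  # stand-in for quantized embeddings
context = context_vectors([[1, 0, 0, 1]], frozen_dictionary, factor_a, factor_b)
```

With more or larger factors the gap widens quickly: two 16 x 16 factors hold 512 values yet span a 256 x 256 matrix.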

Language modeling · Large language models · Code · MoDELS · ChatGPT

March 16, 2024

Can ChatGPT Support Developers? An Empirical Evaluation of Large Language Models for Code Generation

Kailun Jin,Chung-Yu Wang,Hung Viet Pham,Hadi Hemmati

from arxiv, 4 pages, 3 figures, 21st International Conference on Mining Software Repositories (MSR '24), April 15-16, 2024, Lisbon, Portugal

Large language models (LLMs) have demonstrated notable proficiency in code generation, with numerous prior studies showing their promising capabilities in various development scenarios. However, these studies mainly provide evaluations in research settings, which leaves a significant gap in understanding how effectively LLMs can support developers in real-world settings. To address this, we conducted an empirical analysis of conversations in DevGPT, a dataset collected from developers' conversations with ChatGPT (captured with the Share Link feature on platforms such as GitHub). Our empirical findings indicate that the current practice of using LLM-generated code is typically limited to either demonstrating high-level concepts or providing examples in documentation, rather than using it as production-ready code. These findings indicate that much future work is needed to improve LLMs in code generation before they can become integral parts of modern software development.

Large language models · MoDELS · INFORMS · Output · Estimation/estimator

March 15, 2024

Logits of API-Protected LLMs Leak Proprietary Information

Matthew Finlayson,Xiang Ren,Swabha Swayamdipta

The commercialization of large language models (LLMs) has led to the common practice of high-level API-only access to proprietary models. In this work, we show that even with a conservative assumption about the model architecture, it is possible to learn a surprisingly large amount of non-public information about an API-protected LLM from a relatively small number of API queries (e.g., costing under $1,000 for OpenAI's gpt-3.5-turbo). Our findings are centered on one key observation: most modern LLMs suffer from a softmax bottleneck, which restricts the model outputs to a linear subspace of the full output space. We show that this lends itself to a model image or a model signature, which unlocks several capabilities at affordable cost: efficiently discovering the LLM's hidden size, obtaining full-vocabulary outputs, detecting and disambiguating different model updates, identifying the source LLM given a single full LLM output, and even estimating the output layer parameters. Our empirical investigations show the effectiveness of our methods, which allow us to estimate the embedding size of OpenAI's gpt-3.5-turbo to be about 4,096. Lastly, we discuss ways that LLM providers can guard against these attacks, as well as how these capabilities can be viewed as a feature (rather than a bug) by allowing for greater transparency and accountability.
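The linear-subspace observation can be checked on a toy model. If every logit vector is an output matrix applied to a hidden state of size d, then any collection of logit vectors has rank at most d, so stacking enough outputs and measuring their rank reveals d. The sketch below uses exact rational arithmetic on a made-up 8-token, 3-dimensional model. It is not the paper's procedure, which must work from noisy API outputs; the matrices and function names are hypothetical.

```python
from fractions import Fraction
from typing import List, Sequence


def matrix_rank(rows: Sequence[Sequence[int]]) -> int:
    """Exact rank via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            if factor:
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def logits(output_matrix: Sequence[Sequence[int]],
           hidden: Sequence[int]) -> List[int]:
    """One logit per vocabulary entry: output_matrix @ hidden."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in output_matrix]


# Hypothetical toy model: vocabulary of 8 tokens, hidden size 3.
output_matrix = [
    [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0],
    [0, 1, 1], [1, 0, 1], [1, 1, 1], [2, -1, 0],
]
hidden_states = [[1, 2, 3], [0, 1, -1], [2, 0, 1], [1, 1, 1], [3, -2, 0]]

# An observer sees only the 8-dimensional outputs, never the hidden states.
observed = [logits(output_matrix, h) for h in hidden_states]
estimated_hidden_size = matrix_rank(observed)
```

With real floating-point outputs the rank is not exact, so one would expect to look for a sharp drop in singular values rather than an exact zero pivot.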

Learning · AI · MoDELS · AIM · Commentator

September 21, 2023

A Comprehensive Review on Financial Explainable AI

Wei Jie Yeo,Wihan van der Heever,Rui Mao,Erik Cambria,Ranjan Satapathy,Gianmarco Mengaldo

The success of artificial intelligence (AI), and deep learning models in particular, has led to their widespread adoption across various industries due to their ability to process huge amounts of data and learn complex patterns. However, due to their lack of explainability, there are significant concerns regarding their use in critical sectors, such as finance and healthcare, where decision-making transparency is of paramount importance. In this paper, we provide a comparative survey of methods that aim to improve the explainability of deep learning models within the context of finance. We categorize the collection of explainable AI methods according to their corresponding characteristics, and we review the concerns and challenges of adopting explainable AI methods, together with the future directions we deem appropriate and important.

Language modeling · MoDELS · Generalization theory · Identifiable · Continuity

July 12, 2023

A Comprehensive Overview of Large Language Models

Humza Naveed,Asad Ullah Khan,Shi Qiu,Muhammad Saqib,Saeed Anwar,Muhammad Usman,Nick Barnes,Ajmal Mian

Large Language Models (LLMs) have shown excellent generalization capabilities that have led to the development of numerous models. These models propose new architectures, tweak existing ones with refined training strategies, increase context length, use high-quality training data, and increase training time to outperform baselines. Analyzing new developments is crucial for identifying changes that enhance training stability and improve generalization in LLMs. This survey paper comprehensively analyses LLM architectures and their categorization, training strategies, training datasets, and performance evaluations, and discusses future research directions. Moreover, the paper also discusses the basic building blocks and concepts behind LLMs, followed by a complete overview of LLMs, including their important features and functions. Finally, the paper summarizes significant findings from LLM research and consolidates essential architectural and training strategies for developing advanced LLMs. Given the continuous advancements in LLMs, we intend to regularly update this paper by incorporating new sections and featuring the latest LLMs.

Moralization · Minimum point · Agent · Continuity · MoDELS

July 2, 2023

Minimum Levels of Interpretability for Artificial Moral Agents

Avish Vijayaraghavan,Cosmin Badea

As artificial intelligence (AI) models continue to scale up, they are becoming more capable and integrated into various forms of decision-making systems. For models involved in moral decision-making, also known as artificial moral agents (AMA), interpretability provides a way to trust and understand the agent's internal reasoning mechanisms for effective use and error correction. In this paper, we provide an overview of this rapidly evolving sub-field of AI interpretability, introduce the concept of the Minimum Level of Interpretability (MLI), and recommend an MLI for various types of agents to aid their safe deployment in real-world settings.

GNN · Graphics processing unit (GPU) · Neural Networks · Networking · Optimizer

November 11, 2022

A Comprehensive Survey on Distributed Training of Graph Neural Networks

Haiyang Lin,Mingyu Yan,Xiaochun Ye,Dongrui Fan,Shirui Pan,Wenguang Chen,Yuan Xie

from arxiv, 30 pages, double column, 10 figures, 10 tables

Graph neural networks (GNNs) have been demonstrated to be a powerful algorithmic model in broad application fields for their effectiveness in learning over graphs. To scale GNN training up for large-scale and ever-growing graphs, the most promising solution is distributed training, which distributes the workload of training across multiple computing nodes. However, the workflows, computational patterns, communication patterns, and optimization techniques of distributed GNN training remain only preliminarily understood. In this paper, we provide a comprehensive survey of distributed GNN training by investigating various optimization techniques used in distributed GNN training. First, distributed GNN training is classified into several categories according to their workflows. In addition, their computational patterns and communication patterns, as well as the optimization techniques proposed by recent work, are introduced. Second, the software frameworks and hardware platforms of distributed GNN training are also introduced for a deeper understanding. Third, distributed GNN training is compared with distributed training of deep neural networks, emphasizing the uniqueness of distributed GNN training. Finally, interesting issues and opportunities in this field are discussed.

Image segmentation · Cost · Performer · SCAN · Better

January 31, 2018

Improved Image Segmentation via Cost Minimization of Multiple Hypotheses

Marc Bosch,Christopher M. Gifford,Austin G. Dress,Clare W. Lau,Jeffrey G. Skibo,Gordon A. Christie

from arxiv, Accepted BMVC 17

Image segmentation is an important component of many image understanding systems. It aims to group pixels in a spatially and perceptually coherent manner. Typically, segmentation algorithms have a collection of parameters that control the degree of over-segmentation produced. It remains a challenge to properly select such parameters for human-like perceptual grouping. In this work, we exploit the diversity of segments produced by different choices of parameters. We scan the segmentation parameter space and generate a collection of image segmentation hypotheses (from highly over-segmented to under-segmented). These are fed into a cost minimization framework that produces the final segmentation by selecting segments that: (1) better describe the natural contours of the image, and (2) are more stable and persistent among all the segmentation hypotheses. We compare our algorithm's performance with state-of-the-art algorithms, showing that we can achieve improved results. We also show that our framework is robust to the choice of segmentation kernel that produces the initial set of hypotheses.