姑娘日本电影免费观看全集中文-唯美清纯另类亚洲一区二区

We present a simple functional programming language, called Dual PCF, that implements forward mode automatic differentiation using dual numbers in the framework of exact real number computation. The main new feature of this language is the ability to evaluate correctly up to the precision specified by the user -- in a simple and direct way -- the directional derivative of functionals as well as first order functions. In contrast to other comparable languages, Dual PCF also includes the recursive operator for defining functions and functionals. We provide a wide range of examples of Lipschitz functions and functionals that can be defined in Dual PCF. We use domain theory both to give a denotational semantics to the language and to prove the correctness of the new derivative operator using logical relations. To be able to differentiate functionals -- including on function spaces equipped with their compact-open topology that do not admit a norm -- we develop a domain-theoretic directional derivative that is Scott continuous and extends Clarke's subgradient of real-valued locally Lipschitz maps on Banach spaces to real-valued continuous maps on Hausdorff topological vector spaces. Finally, we show that we can express arbitrary computable linear functionals in Dual PCF.

相關內容

泛函

關注 0

大語言模型 · MoDELS · 語言模型化 · Perplexity · Llama 2 ·

2024 年 1 月 11 日

Extreme Compression of Large Language Models via Additive Quantization

Vage Egiazarian,Andrei Panferov,Denis Kuznedelev,Elias Frantar,Artem Babenko,Dan Alistarh

from arxiv, Preprint

The emergence of accurate open large language models (LLMs) has led to a race towards quantization techniques for such models enabling execution on end-user devices. In this paper, we revisit the problem of "extreme" LLM compression--defined as targeting extremely low bit counts, such as 2 to 3 bits per parameter, from the point of view of classic methods in Multi-Codebook Quantization (MCQ). Our work builds on top of Additive Quantization, a classic algorithm from the MCQ family, and adapts it to the quantization of language models. The resulting algorithm advances the state-of-the-art in LLM compression, outperforming all recently-proposed techniques in terms of accuracy at a given compression budget. For instance, when compressing Llama 2 models to 2 bits per parameter, our algorithm quantizes the 7B model to 6.93 perplexity (a 1.29 improvement relative to the best prior work, and 1.81 points from FP16), the 13B model to 5.70 perplexity (a .36 improvement) and the 70B model to 3.94 perplexity (a .22 improvement) on WikiText2. We release our implementation of Additive Quantization for Language Models AQLM as a baseline to facilitate future research in LLM quantization.

MoDELS · 語言模型化 · 大語言模型 · 表示 · INFORMS ·

2024 年 1 月 11 日

Patchscope: A Unifying Framework for Inspecting Hidden Representations of Language Models

Asma Ghandeharioun,Avi Caciularu,Adam Pearce,Lucas Dixon,Mor Geva

Inspecting the information encoded in hidden representations of large language models (LLMs) can explain models' behavior and verify their alignment with human values. Given the capabilities of LLMs in generating human-understandable text, we propose leveraging the model itself to explain its internal representations in natural language. We introduce a framework called Patchscopes and show how it can be used to answer a wide range of research questions about an LLM's computation. We show that prior interpretability methods based on projecting representations into the vocabulary space and intervening on the LLM computation, can be viewed as special instances of this framework. Moreover, several of their shortcomings such as failure in inspecting early layers or lack of expressivity can be mitigated by a Patchscope. Beyond unifying prior inspection techniques, Patchscopes also opens up new possibilities such as using a more capable model to explain the representations of a smaller model, and unlocks new applications such as self-correction in multi-hop reasoning.

MoDELS · 語言模型化 · 大語言模型 · 最優化 · Learning ·

2024 年 1 月 11 日

Secrets of RLHF in Large Language Models Part II: Reward Modeling

Binghai Wang,Rui Zheng,Lu Chen,Yan Liu,Shihan Dou,Caishuang Huang,Wei Shen,Senjie Jin,Enyu Zhou,Chenyu Shi,Songyang Gao,Nuo Xu,Yuhao Zhou,Xiaoran Fan,Zhiheng Xi,Jun Zhao,Xiao Wang,Tao Ji,Hang Yan,Lixing Shen,Zhan Chen,Tao Gui,Qi Zhang,Xipeng Qiu,Xuanjing Huang,Zuxuan Wu,Yu-Gang Jiang

Reinforcement Learning from Human Feedback (RLHF) has become a crucial technology for aligning language models with human values and intentions, enabling models to produce more helpful and harmless responses. Reward models are trained as proxies for human preferences to drive reinforcement learning optimization. While reward models are often considered central to achieving high performance, they face the following challenges in practical applications: (1) Incorrect and ambiguous preference pairs in the dataset may hinder the reward model from accurately capturing human intent. (2) Reward models trained on data from a specific distribution often struggle to generalize to examples outside that distribution and are not suitable for iterative RLHF training. In this report, we attempt to address these two issues. (1) From a data perspective, we propose a method to measure the strength of preferences within the data, based on a voting mechanism of multiple reward models. Experimental results confirm that data with varying preference strengths have different impacts on reward model performance. We introduce a series of novel methods to mitigate the influence of incorrect and ambiguous preferences in the dataset and fully leverage high-quality preference data. (2) From an algorithmic standpoint, we introduce contrastive learning to enhance the ability of reward models to distinguish between chosen and rejected responses, thereby improving model generalization. Furthermore, we employ meta-learning to enable the reward model to maintain the ability to differentiate subtle differences in out-of-distribution samples, and this approach can be utilized for iterative RLHF optimization.

TransAct · 相互獨立的 · INTERACT · Guidance · Analysis ·

2024 年 1 月 11 日

The Concept of the Tactile Signature System for Individuals with Visual Impairments

Anatoliy Kremenchutskiy,Galymzhan Gabdreshov

from arxiv, The principles of connection between the interaction of a blind individual with the tactile field, when he uses touch, drawing various figures and forms, and the resulting prompts to the user to create a valid signature, have not been disclosed

The lack of an accessible and effective system for blind individuals to create handwritten signatures presents a significant barrier to their independence and full participation in various aspects of life. This research introduces the Tactile Signature System, a groundbreaking approach that empowers individuals with visual impairments to form their unique handwritten signatures. Key features of the system include: Personalized customization: Through tactile interaction and voice algorithmic guidance, individuals create signatures reflecting their preferences and natural writing style. Real-time feedback: AI-powered voice prompts and analysis ensure accuracy and consistency in signature formation. Accessibility: Installation in local service centers provides a secure and supervised environment for signature creation. The system's impact reaches beyond the individual level: Promotes inclusivity and independence: Blind individuals can engage in legal and financial transactions without relying on others. Empowers and fosters equal opportunities: Participation in education, employment, and civic engagement becomes more accessible. Aligns with international conventions: Upholds the right of persons with disabilities to participate fully in society. The Tactile Signature System represents a significant step towards an inclusive and accessible future for individuals with visual impairments.

Prompt · 語言模型化 · 大語言模型 · Analysis · MoDELS ·

2024 年 1 月 10 日

Prompting Large Language Models for Recommender Systems: A Comprehensive Framework and Empirical Analysis

Lanling Xu,Junjie Zhang,Bingqian Li,Jinpeng Wang,Mingchen Cai,Wayne Xin Zhao,Ji-Rong Wen

from arxiv, 40 pages, under review

Recently, large language models such as ChatGPT have showcased remarkable abilities in solving general tasks, demonstrating the potential for applications in recommender systems. To assess how effectively LLMs can be used in recommendation tasks, our study primarily focuses on employing LLMs as recommender systems through prompting engineering. We propose a general framework for utilizing LLMs in recommendation tasks, focusing on the capabilities of LLMs as recommenders. To conduct our analysis, we formalize the input of LLMs for recommendation into natural language prompts with two key aspects, and explain how our framework can be generalized to various recommendation scenarios. As for the use of LLMs as recommenders, we analyze the impact of public availability, tuning strategies, model architecture, parameter scale, and context length on recommendation results based on the classification of LLMs. As for prompt engineering, we further analyze the impact of four important components of prompts, \ie task descriptions, user interest modeling, candidate items construction and prompting strategies. In each section, we first define and categorize concepts in line with the existing literature. Then, we propose inspiring research questions followed by experiments to systematically analyze the impact of different factors on two public datasets. Finally, we summarize promising directions to shed lights on future research.

任務對話系統 · Analysis · Attention · 得分 · 論文 ·

2024 年 1 月 10 日

An Analysis of User Behaviours for Objectively Evaluating Spoken Dialogue Systems

Koji Inoue,Divesh Lala,Keiko Ochi,Tatsuya Kawahara,Gabriel Skantze

from arxiv, This paper has been accepted for presentation at International Workshop on Spoken Dialogue Systems Technology 2024 (IWSDS 2024) and represents the author's version of the work

Establishing evaluation schemes for spoken dialogue systems is important, but it can also be challenging. While subjective evaluations are commonly used in user experiments, objective evaluations are necessary for research comparison and reproducibility. To address this issue, we propose a framework for indirectly but objectively evaluating systems based on users' behaviours. In this paper, to this end, we investigate the relationship between user behaviours and subjective evaluation scores in social dialogue tasks: attentive listening, job interview, and first-meeting conversation. The results reveal that in dialogue tasks where user utterances are primary, such as attentive listening and job interview, indicators like the number of utterances and words play a significant role in evaluation. Observing disfluency also can indicate the effectiveness of formal tasks, such as job interview. On the other hand, in dialogue tasks with high interactivity, such as first-meeting conversation, behaviours related to turn-taking, like average switch pause length, become more important. These findings suggest that selecting appropriate user behaviours can provide valuable insights for objective evaluation in each social dialogue task.

劃分 · 類別 · 線性的 · 論文 · 數值分析 ·

2024 年 1 月 10 日

A New Class of Runge-Kutta Methods for Nonlinearly Partitioned Systems

Tommaso Buvoli,Ben S. Southworth

This work introduces a new class of Runge-Kutta methods for solving nonlinearly partitioned initial value problems. These new methods, named nonlinearly partitioned Runge-Kutta (NPRK), generalize existing additive and component-partitioned Runge-Kutta methods, and allow one to distribute different types of implicitness within nonlinear terms. The paper introduces the NPRK framework and discusses order conditions, linear stability, and the derivation of implicit-explicit and implicit-implicit NPRK integrators. The paper concludes with numerical experiments that demonstrate the utility of NPRK methods for solving viscous Burger's and the gray thermal radiation transport equations.

特征選擇 · 異常檢測 · PCA · Analysis · 通道 ·

2024 年 1 月 9 日

Empirical Analysis of Anomaly Detection on Hyperspectral Imaging Using Dimension Reduction Methods

Dongeon Kim,YeongHyeon Park

from arxiv, 4 pages, 4 figures, 3 tables

Recent studies try to use hyperspectral imaging (HSI) to detect foreign matters in products because it enables to visualize the invisible wavelengths including ultraviolet and infrared. Considering the enormous image channels of the HSI, several dimension reduction methods-e.g., PCA or UMAP-can be considered to reduce but those cannot ease the fundamental limitations, as follows: (1) latency of HSI capturing. (2) less explanation ability of the important channels. In this paper, to circumvent the aforementioned methods, one of the ways to channel reduction, on anomaly detection proposed HSI. Different from feature extraction methods (i.e., PCA or UMAP), feature selection can sort the feature by impact and show better explainability so we might redesign the task-optimized and cost-effective spectroscopic camera. Via the extensive experiment results with synthesized MVTec AD dataset, we confirm that the feature selection method shows 6.90x faster at the inference phase compared with feature extraction-based approaches while preserving anomaly detection performance. Ultimately, we conclude the advantage of feature selection which is effective yet fast.

道德化 · 極小點 · Agent · Continuity · MoDELS ·

2023 年 7 月 2 日

Minimum Levels of Interpretability for Artificial Moral Agents

Avish Vijayaraghavan,Cosmin Badea

As artificial intelligence (AI) models continue to scale up, they are becoming more capable and integrated into various forms of decision-making systems. For models involved in moral decision-making, also known as artificial moral agents (AMA), interpretability provides a way to trust and understand the agent's internal reasoning mechanisms for effective use and error correction. In this paper, we provide an overview of this rapidly-evolving sub-field of AI interpretability, introduce the concept of the Minimum Level of Interpretability (MLI) and recommend an MLI for various types of agents, to aid their safe deployment in real-world settings.

Machine Translation · NMT · Performer · state-of-the-art · 學成 ·

2018 年 6 月 1 日

A Survey of Domain Adaptation for Neural Machine Translation

Chenhui Chu,Rui Wang

from arxiv, COLING 2018, 16 pages, 9 figures

Neural machine translation (NMT) is a deep learning based approach for machine translation, which yields the state-of-the-art translation performance in scenarios where large-scale parallel corpora are available. Although the high-quality and domain-specific translation is crucial in the real world, domain-specific corpora are usually scarce or nonexistent, and thus vanilla NMT performs poorly in such scenarios. Domain adaptation that leverages both out-of-domain parallel corpora as well as monolingual corpora for in-domain translation, is very important for domain-specific translation. In this paper, we give a comprehensive survey of the state-of-the-art domain adaptation techniques for NMT.