
Many NLP researchers are experiencing an existential crisis triggered by the astonishing success of ChatGPT and other systems based on large language models (LLMs). After such a disruptive change to our understanding of the field, what is left to do? Taking a historical lens, we look for guidance from the first era of LLMs, which began in 2005 with large $n$-gram models for machine translation. We identify durable lessons from the first era, and more importantly, we identify evergreen problems where NLP researchers can continue to make meaningful contributions in areas where LLMs are ascendant. Among these lessons, we discuss the primacy of hardware advancement in shaping the availability and importance of scale, as well as the urgent challenge of quality evaluation, both automated and human. We argue that disparities in scale are transient and that researchers can work to reduce them; that data, rather than hardware, is still a bottleneck for many meaningful applications; that meaningful evaluation informed by actual use is still an open problem; and that there is still room for speculative approaches.

Related content

Language modeling

CoT · Model evaluation · Mathematics · Optimizer · Learning

December 29, 2023

Olapa-MCoT: Enhancing the Chinese Mathematical Reasoning Capability of LLMs

Shaojie Zhu,Zhaobin Wang,Chengxiang Zhuo,Hui Lu,Bo Hu,Zang Li

from arXiv, 10 pages, 1 figure

Chain-of-Thought (CoT) is a way for LLMs to solve reasoning problems. Recently, much research has appeared on improving the CoT capability of LLMs. In this work, we propose Olapa-MCoT, an LLM built on the llama2-13B PLM through finetuning and alignment learning. During alignment training, we propose the SimRRHF algorithm and Incorrect Data Relearning, focusing mainly on optimizing the Chinese mathematical reasoning ability of Olapa-MCoT. The experiments achieved significant results: the accuracy of Chinese mathematical reasoning reached 50%, a 36% rise compared to llama2-13B. In addition, the accuracy of English reasoning increased by nearly 4%.

Ad hoc · Agent · Large language models · on the fly · motivation

December 29, 2023

Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game

Zijing Shi,Meng Fang,Shunfeng Zheng,Shilong Deng,Ling Chen,Yali Du

from arXiv, code will be released soon

Multi-agent collaboration with Large Language Models (LLMs) demonstrates proficiency in basic tasks, yet its efficiency in more complex scenarios remains unexplored. In gaming environments, these agents often face situations without established coordination protocols, requiring them to make intelligent inferences about teammates from limited data. This problem motivates the area of ad hoc teamwork, in which an agent may potentially cooperate with a variety of teammates to achieve a shared goal. Our study focuses on the ad hoc teamwork problem where the agent operates in an environment driven by natural language. Our findings reveal the potential of LLM agents in team collaboration, highlighting issues related to hallucinations in communication. To address this issue, we develop CodeAct, a general agent that equips LLM with enhanced memory and code-driven reasoning, enabling the repurposing of partial information for rapid adaptation to new teammates.

Position encoding · Transformation · Transformer · MoDELS · AIM

December 29, 2023

Length Extrapolation of Transformers: A Survey from the Perspective of Position Encoding

Liang Zhao,Xiaocheng Feng,Xiachong Feng,Bing Qin,Ting Liu

from arXiv, work in progress

The Transformer has taken the natural language processing (NLP) field by storm since its introduction, owing to its superior ability to model complex dependencies in sequences. Despite the great success of Transformer-based pretrained language models (PLMs) across almost all NLP tasks, they all suffer from a preset length limit and thus can hardly extend this success to sequences longer than those seen in training, namely the length extrapolation problem. Length extrapolation has aroused great interest among researchers, as it is a core feature of human language capacity. To enhance the length extrapolation of Transformers, a plethora of methods have been proposed, mostly focusing on extrapolatable position encodings. In this article, we provide an organized and systematic review of these research efforts in a unified notation from a position encoding perspective, aiming to enable the reader to gain a deep understanding of existing methods and to provide stimuli for future research.

Processing (programming language) · Biased · Principle · Statistic · Less

December 28, 2023

The Gatekeeper Effect: The Implications of Pre-Screening, Self-selection, and Bias for Hiring Processes

Moran Koren

We study the problem of screening in decision-making processes under uncertainty, focusing on the impact of adding a screening stage, commonly known as a 'gatekeeper.' While our primary analysis is rooted in the context of job market hiring, the principles and findings are broadly applicable to areas such as educational admissions, healthcare patient selection, and financial loan approvals. The gatekeeper's role is to assess applicants' suitability before significant investments are made. Our study reveals that while gatekeepers are designed to streamline the selection process by filtering out less likely candidates, they can sometimes inadvertently affect the candidates' own decision-making process. We explore the conditions under which the introduction of a gatekeeper can enhance or impede the efficiency of these processes. Additionally, we consider how adjusting gatekeeping strategies might impact the accuracy of selection decisions. Our research also extends to scenarios where gatekeeping is influenced by historical biases, particularly in competitive settings like hiring. We discover that candidates confronted with a statistically biased gatekeeping process are more likely to withdraw from applying, thereby perpetuating those historical biases. The study suggests that measures such as affirmative action can be effective in addressing these biases. While centered on hiring, the insights and methodologies from our study have significant implications for a wide range of fields where screening and gatekeeping are integral.

Continuity · Lipschitz · Global optimization · Performer · Extensibility

December 28, 2023

Cumulative Regret Analysis of the Piyavskii-Shubert Algorithm and Its Variants for Global Optimization

Kaan Gokcesu,Hakan Gokcesu

We study the problem of global optimization, where we analyze the performance of the Piyavskii-Shubert algorithm and its variants. For any given time duration $T$, instead of the extensively studied simple regret (which is the difference of the losses between the best estimate up to $T$ and the global minimum), we study the cumulative regret up to time $T$. For $L$-Lipschitz continuous functions, we show that the cumulative regret is $O(L\log T)$. For $H$-Lipschitz smooth functions, we show that the cumulative regret is $O(H)$. We analytically extend our results to functions with Hölder continuous derivatives, which cover both the Lipschitz continuous and the Lipschitz smooth functions individually. We further show that a simpler variant of the Piyavskii-Shubert algorithm performs just as well as the traditional variants for the Lipschitz continuous or the Lipschitz smooth functions. We further extend our results to broader classes of functions, and show that our algorithm efficiently determines its queries and achieves nearly minimax optimal (up to log factors) cumulative regret for general convex or even concave regularity conditions on the extrema of the objective (which encompass many preceding regularities). We consider further extensions by investigating the performance of the Piyavskii-Shubert variants in scenarios with unknown regularity, noisy evaluation, and a multivariate domain.
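As a rough illustration of the setting in this abstract, the following is a minimal sketch of the classical one-dimensional Piyavskii-Shubert procedure for an $L$-Lipschitz function, not the authors' implementation or any of their variants. The names `piyavskii_shubert` and `objective`, the test function, and the constant `lipschitz=4.0` are all assumptions chosen for the example. Each step queries the minimizer of the piecewise-linear ("sawtooth") lower bound built from the points evaluated so far, and the cumulative regret is the sum of $f(x_t) - f^*$ over all queries.

```python
import bisect
import math


def piyavskii_shubert(f, a, b, lipschitz, n_queries):
    """Sketch of 1-D Piyavskii-Shubert minimization on [a, b].

    Returns the best point, its value, and the evaluated values in query order.
    """
    xs = [a, b]                # evaluated points, kept sorted
    ys = [f(a), f(b)]          # function values at those points
    history = list(ys)         # values in the order they were queried
    for _ in range(n_queries - 2):
        best_bound, best_x = math.inf, None
        for i in range(len(xs) - 1):
            x0, x1, y0, y1 = xs[i], xs[i + 1], ys[i], ys[i + 1]
            # Lowest value of the sawtooth lower bound inside [x0, x1].
            bound = 0.5 * (y0 + y1) - 0.5 * lipschitz * (x1 - x0)
            if bound < best_bound:
                best_bound = bound
                # Point where the two bounding lines of slope -L and +L meet.
                best_x = 0.5 * (x0 + x1) + (y0 - y1) / (2.0 * lipschitz)
        y_new = f(best_x)
        pos = bisect.bisect(xs, best_x)
        xs.insert(pos, best_x)
        ys.insert(pos, y_new)
        history.append(y_new)
    i_min = min(range(len(xs)), key=lambda i: ys[i])
    return xs[i_min], ys[i_min], history


def objective(x):
    # Hypothetical test function; |f'(x)| <= 3.5, so L = 4 is a valid constant.
    return math.sin(3.0 * x) + 0.5 * x


x_best, y_best, history = piyavskii_shubert(objective, 0.0, 4.0, lipschitz=4.0, n_queries=200)
# Global minimizer of this test function on [0, 4], from f'(x) = 0.
x_star = (2.0 * math.pi - math.acos(-1.0 / 6.0)) / 3.0
cumulative_regret = sum(y - objective(x_star) for y in history)
print(x_best, y_best, cumulative_regret)
```

The simple regret corresponds to `y_best - objective(x_star)`, while `cumulative_regret` also charges for every exploratory query, which is the quantity the abstract bounds.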

Learning · Agent · Full · Optimizer · INFORMS

December 28, 2023

Achieving Maximum Utilization in Optimal Time for Learning or Convergence in the Kolkata Paise Restaurant Problem

Aniruddha Biswas,Antika Sinha,Bikas K. Chakrabarti

from arXiv, 9 pages, 6 figures included in manuscript; submitted to Indian Journal of Physics

The objective of the KPR agents is to learn by themselves, in the minimum (learning) time, to achieve the maximum success or utilization probability ($f$). A dictator can easily solve the problem with $f = 1$ in no time, by asking everyone to form a queue and go to the respective restaurant, resulting in no fluctuation and full utilization from the first day (convergence time $\tau = 0$). It has already been shown that if each agent chooses a restaurant at random, $f = 1 - e^{-1} \simeq 0.63$ (where $e \simeq 2.718$ denotes the Euler number) in zero time ($\tau = 0$). With the only available information being yesterday's crowd size in the restaurant visited by the agent (as assumed for the rest of the strategies studied here), the crowd avoiding (CA) strategies can give higher values of $f$ but also of $\tau$. Several numerical studies of modified learning strategies have indicated an increased value of $f = 1 - \alpha$ for $\alpha \to 0$, with $\tau \sim 1/\alpha$. We show here, using the Monte Carlo technique, that a modified Greedy Crowd Avoiding (GCA) strategy can assure full utilization ($f = 1$) in convergence time $\tau \simeq eN$, with of course a non-zero probability of an even larger convergence time. All these observations suggest that strategies with single-step memory of the individuals can never collectively achieve full utilization ($f = 1$) in finite convergence time, and that perhaps the maximum possible utilization that can be achieved is about eighty percent ($f \simeq 0.80$) in an optimal time $\tau$ of order ten, even when $N$, the number of customers or of restaurants, goes to infinity.
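The random-choice baseline quoted in this abstract can be checked with a short sketch. Assuming $N$ agents each pick one of $N$ restaurants uniformly and independently, a given restaurant stays empty with probability $(1 - 1/N)^N \to e^{-1}$, so the expected utilization is $1 - (1 - 1/N)^N \to 1 - e^{-1}$. The function names below are hypothetical, and the simulation covers only this memoryless baseline, not the crowd-avoiding or GCA strategies studied in the paper.

```python
import math
import random


def expected_utilization(n):
    """Expected fraction of occupied restaurants for n agents choosing uniformly at random."""
    return 1.0 - (1.0 - 1.0 / n) ** n


def simulate_utilization(n, days, seed=0):
    """Monte Carlo estimate of the same quantity, averaged over independent days."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(days):
        # Each of the n agents picks a restaurant; count the distinct ones visited.
        visited = {rng.randrange(n) for _ in range(n)}
        total += len(visited) / n
    return total / days


print(expected_utilization(1000), simulate_utilization(1000, days=200), 1.0 - math.exp(-1.0))
```

Both the closed form and the simulation should land close to $0.63$, with no learning time required, which is why the abstract treats this as the $\tau = 0$ reference point.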

Language modeling · Large language models · MoDELS · Scaling · Reducible

December 28, 2023

Revisiting the Reliability of Psychological Scales on Large Language Models

Jen-tse Huang,Wenxuan Wang,Man Ho Lam,Eric John Li,Wenxiang Jiao,Michael R. Lyu

from arXiv, 10 pages. Added more comprehensive experiments and analysis

Recent research has extended beyond assessing the performance of Large Language Models (LLMs) to examining their characteristics from a psychological standpoint, acknowledging the necessity of understanding their behavioral characteristics. The administration of personality tests to LLMs has emerged as a noteworthy area in this context. However, the suitability of employing psychological scales, initially devised for humans, on LLMs is a matter of ongoing debate. Our study aims to determine the reliability of applying personality assessments to LLMs, explicitly investigating whether LLMs demonstrate consistent personality traits. Analyzing responses under 2,500 settings reveals that gpt-3.5-turbo shows consistency in responses to the Big Five Inventory, indicating a high degree of reliability. Furthermore, our research explores the potential of gpt-3.5-turbo to emulate diverse personalities and represent various groups, which is a capability increasingly sought after in social sciences for substituting human participants with LLMs to reduce costs. Our findings reveal that LLMs have the potential to represent different personalities with specific prompt instructions. By shedding light on the personalization of LLMs, our study endeavors to pave the way for future explorations in this field. We have made our experimental results and the corresponding code openly accessible via //github.com/CUHK-ARISE/LLMPersonality.

MoDELS · Networking · Parameter space · Performer · Singular

December 28, 2023

A Geometric Modeling of Occam's Razor in Deep Learning

Ke Sun,Frank Nielsen

from arXiv, this work first appeared under the former title "Lightlike Neuromanifolds, Occam's Razor and Deep Learning"

Why do deep neural networks (DNNs) benefit from very high dimensional parameter spaces? The contrast between their huge parameter complexity and their stunning performance in practice is all the more intriguing and cannot be explained by the standard theory of regular models. In this work, we propose a geometrically flavored information-theoretic approach to study this phenomenon. Namely, we introduce the locally varying dimensionality of the parameter space of neural network models by considering the number of significant dimensions of the Fisher information matrix, and model the parameter space as a manifold using the framework of singular semi-Riemannian geometry. We derive model complexity measures which yield short description lengths for deep neural network models based on their singularity analysis, thus explaining the good performance of DNNs despite their large number of parameters.

Agent · AI Agent · Language modeling · AI · MoDELS

September 14, 2023

The Rise and Potential of Large Language Model Based Agents: A Survey

Zhiheng Xi,Wenxiang Chen,Xin Guo,Wei He,Yiwen Ding,Boyang Hong,Ming Zhang,Junzhe Wang,Senjie Jin,Enyu Zhou,Rui Zheng,Xiaoran Fan,Xiao Wang,Limao Xiong,Qin Liu,Yuhao Zhou,Weiran Wang,Changhao Jiang,Yicheng Zou,Xiangyang Liu,Zhangyue Yin,Shihan Dou,Rongxiang Weng,Wensen Cheng,Qi Zhang,Wenjuan Qin,Yongyan Zheng,Xipeng Qiu,Xuanjing Huan,Tao Gui

from arXiv, 86 pages, 12 figures

For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are artificial entities that sense their environment, make decisions, and take actions. Many efforts have been made to develop intelligent AI agents since the mid-20th century. However, these efforts have mainly focused on advancement in algorithms or training strategies to enhance specific capabilities or performance on particular tasks. What the community lacks is a sufficiently general and powerful model to serve as a starting point for designing AI agents that can adapt to diverse scenarios. Due to the versatile and remarkable capabilities they demonstrate, large language models (LLMs) are regarded as potential sparks for Artificial General Intelligence (AGI), offering hope for building general AI agents. Many research efforts have leveraged LLMs as the foundation to build AI agents and have achieved significant progress. We start by tracing the concept of agents from its philosophical origins to its development in AI, and explain why LLMs are suitable foundations for AI agents. Building upon this, we present a conceptual framework for LLM-based agents comprising three main components: brain, perception, and action; the framework can be tailored to suit different applications. Subsequently, we explore the extensive applications of LLM-based agents in three aspects: single-agent scenarios, multi-agent scenarios, and human-agent cooperation. Following this, we delve into agent societies, exploring the behavior and personality of LLM-based agents, the social phenomena that emerge when they form societies, and the insights they offer for human society. Finally, we discuss a range of key topics and open problems within the field.

Knowledge · Processing (programming language) · Graph · NLP · Knowledge graph

September 30, 2022

A Decade of Knowledge Graphs in Natural Language Processing: A Survey

Phillip Schneider,Tim Schopf,Juraj Vladika,Mikhail Galkin,Elena Simperl,Florian Matthes

from arXiv, accepted to AACL-IJCNLP 2022

In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.