99热日韩这里只有国产中文精品,亚洲国产A精品一区不卡,亚洲国产精品高清久久久1717,人人人妻人人澡人人爽欧洲一区

Transformer models have shown great success in natural language processing; however, their potential remains mostly unexplored for dynamical systems. In this work, we investigate the optimal output estimation problem using transformers, which generate output predictions using all the past ones. Particularly, we train the transformer using various distinct systems and then evaluate the performance on unseen systems with unknown dynamics. Empirically, the trained transformer adapts exceedingly well to different unseen systems and even matches the optimal performance given by the Kalman filter for linear systems. In more complex settings with non-i.i.d. noise, time-varying dynamics, and nonlinear dynamics like a quadrotor system with unknown parameters, transformers also demonstrate promising results. To support our experimental findings, we provide statistical guarantees that quantify the amount of training data required for the transformer to achieve a desired excess risk. Finally, we point out some limitations by identifying two classes of problems that lead to degraded performance, highlighting the need for caution when using transformers for control and estimation.

相關內容

變換

關注 2

Agent · 可理解性 · 大語言模型 · 語言模型化 · 值域 ·

2024 年 2 月 6 日

Can Generative Agents Predict Emotion?

Ciaran Regan,Nanami Iwahashi,Shogo Tanaka,Mizuki Oka

from arxiv, 14 pages, 6 figures

Large Language Models (LLMs) have demonstrated a number of human-like abilities, however the empathic understanding and emotional state of LLMs is yet to be aligned to that of humans. In this work, we investigate how the emotional state of generative LLM agents evolves as they perceive new events, introducing a novel architecture in which new experiences are compared to past memories. Through this comparison, the agent gains the ability to understand new experiences in context, which according to the appraisal theory of emotion is vital in emotion creation. First, the agent perceives new experiences as time series text data. After perceiving each new input, the agent generates a summary of past relevant memories, referred to as the norm, and compares the new experience to this norm. Through this comparison we can analyse how the agent reacts to the new experience in context. The PANAS, a test of affect, is administered to the agent, capturing the emotional state of the agent after the perception of the new event. Finally, the new experience is then added to the agents memory to be used in the creation of future norms. By creating multiple experiences in natural language from emotionally charged situations, we test the proposed architecture on a wide range of scenarios. The mixed results suggests that introducing context can occasionally improve the emotional alignment of the agent, but further study and comparison with human evaluators is necessary. We hope that this paper is another step towards the alignment of generative agents.

語言模型化 · 大語言模型 · CLUES · MoDELS · INFORMS ·

2024 年 2 月 6 日

Can Large Language Models Detect Rumors on Social Media?

Qiang Liu,Xiang Tao,Junfei Wu,Shu Wu,Liang Wang

In this work, we investigate to use Large Language Models (LLMs) for rumor detection on social media. However, it is challenging for LLMs to reason over the entire propagation information on social media, which contains news contents and numerous comments, due to LLMs may not concentrate on key clues in the complex propagation information, and have trouble in reasoning when facing massive and redundant information. Accordingly, we propose an LLM-empowered Rumor Detection (LeRuD) approach, in which we design prompts to teach LLMs to reason over important clues in news and comments, and divide the entire propagation information into a Chain-of-Propagation for reducing LLMs' burden. We conduct extensive experiments on the Twitter and Weibo datasets, and LeRuD outperforms several state-of-the-art rumor detection models by 2.4% to 7.6%. Meanwhile, by applying LLMs, LeRuD requires no data for training, and thus shows more promising rumor detection ability in few-shot or zero-shot scenarios.

模態 · 可約的 · 線性的 · 情景 · MoDELS ·

2024 年 2 月 6 日

When Are Prime Formulae Characteristic?

Luca Aceto,Dario Della Monica,Ignacio Fábregas,Anna Ingólfsdóttir

from arxiv, Updated conference version with the extended version

In the setting of the modal logic that characterizes modal refinement over modal transition systems, Boudol and Larsen showed that the formulae for which model checking can be reduced to preorder checking, that is, the characteristic formulae, are exactly the consistent and prime ones. This paper presents general, sufficient conditions guaranteeing that characteristic formulae are exactly the consistent and prime ones. It is shown that the given conditions apply to various behavioural relations in the literature. In particular, characteristic formulae are exactly the prime and consistent ones for all the semantics in van Glabbeek's linear time-branching time spectrum.

未標記 · 分離的 · 異常點 · Learning · Performer ·

2024 年 2 月 5 日

How Does Unlabeled Data Provably Help Out-of-Distribution Detection?

Xuefeng Du,Zhen Fang,Ilias Diakonikolas,Yixuan Li

from arxiv, ICLR 2024

Using unlabeled data to regularize the machine learning models has demonstrated promise for improving safety and reliability in detecting out-of-distribution (OOD) data. Harnessing the power of unlabeled in-the-wild data is non-trivial due to the heterogeneity of both in-distribution (ID) and OOD data. This lack of a clean set of OOD samples poses significant challenges in learning an optimal OOD classifier. Currently, there is a lack of research on formally understanding how unlabeled data helps OOD detection. This paper bridges the gap by introducing a new learning framework SAL (Separate And Learn) that offers both strong theoretical guarantees and empirical effectiveness. The framework separates candidate outliers from the unlabeled data and then trains an OOD classifier using the candidate outliers and the labeled ID data. Theoretically, we provide rigorous error bounds from the lens of separability and learnability, formally justifying the two components in our algorithm. Our theory shows that SAL can separate the candidate outliers with small error rates, which leads to a generalization guarantee for the learned OOD classifier. Empirically, SAL achieves state-of-the-art performance on common benchmarks, reinforcing our theoretical insights. Code is publicly available at //github.com/deeplearning-wisc/sal.

Automator · state-of-the-art · Performer · DATE · 近似 ·

2024 年 2 月 5 日

Are Sounds Sound for Phylogenetic Reconstruction?

Luise H?user,Gerhard J?ger,Taraka Rama,Johann-Mattis List,Alexandros Stamatakis

from arxiv, Paper accepted for SIGTYP (2024): H\"auser, Luise; J\"ager, Gerhard; List, Johann-Mattis; Rama, Taraka; and Stamatakis, Alexandros (2024): Are sounds sound for phylogenetic reconstruction? In: Proceedings of the 6th Workshop on Research in Computational Linguistic Typology and Multilingual NLP (SIGTYP 2024)

In traditional studies on language evolution, scholars often emphasize the importance of sound laws and sound correspondences for phylogenetic inference of language family trees. However, to date, computational approaches have typically not taken this potential into account. Most computational studies still rely on lexical cognates as major data source for phylogenetic reconstruction in linguistics, although there do exist a few studies in which authors praise the benefits of comparing words at the level of sound sequences. Building on (a) ten diverse datasets from different language families, and (b) state-of-the-art methods for automated cognate and sound correspondence detection, we test, for the first time, the performance of sound-based versus cognate-based approaches to phylogenetic reconstruction. Our results show that phylogenies reconstructed from lexical cognates are topologically closer, by approximately one third with respect to the generalized quartet distance on average, to the gold standard phylogenies than phylogenies reconstructed from sound correspondences.

語言模型化 · 大語言模型 · 相互獨立的 · MoDELS · Learning ·

2024 年 2 月 4 日

Can Large Language Models Learn Independent Causal Mechanisms?

Ga?l Gendron,Bao Trung Nguyen,Alex Yuxuan Peng,Michael Witbrock,Gillian Dobbie

from arxiv, 17 pages, 8 pages for the main paper and 9 pages for references and appendices, 12 figures

Despite impressive performance on language modelling and complex reasoning tasks, Large Language Models (LLMs) fall short on the same tasks in uncommon settings or with distribution shifts, exhibiting some lack of generalisation ability. This issue has usually been alleviated by feeding more training data into the LLM. However, this method is brittle, as the scope of tasks may not be readily predictable or may evolve, and updating the model with new data generally requires extensive additional training. By contrast, systems, such as causal models, that learn abstract variables and causal relationships can demonstrate increased robustness against changes in the distribution. One reason for this success is the existence and use of Independent Causal Mechanisms (ICMs) representing high-level concepts that only sparsely interact. In this work, we apply two concepts from causality to learn ICMs within LLMs. We develop a new LLM architecture composed of multiple sparsely interacting language modelling modules. We introduce a routing scheme to induce specialisation of the network into domain-specific modules. We also present a Mutual Information minimisation objective that trains a separate module to learn abstraction and domain-invariant mechanisms. We show that such causal constraints can improve out-of-distribution performance on abstract and causal reasoning tasks.

推斷 · TOOLS · Machine Learning · Learning · 相互獨立的 ·

2024 年 2 月 2 日

Do We Really Even Need Data?

Kentaro Hoffman,Stephen Salerno,Awan Afiaz,Jeffrey T. Leek,Tyler H. McCormick

As artificial intelligence and machine learning tools become more accessible, and scientists face new obstacles to data collection (e.g. rising costs, declining survey response rates), researchers increasingly use predictions from pre-trained algorithms as outcome variables. Though appealing for financial and logistical reasons, using standard tools for inference can misrepresent the association between independent variables and the outcome of interest when the true, unobserved outcome is replaced by a predicted value. In this paper, we characterize the statistical challenges inherent to this so-called ``inference with predicted data'' problem and elucidate three potential sources of error: (i) the relationship between predicted outcomes and their true, unobserved counterparts, (ii) robustness of the machine learning model to resampling or uncertainty about the training data, and (iii) appropriately propagating not just bias but also uncertainty from predictions into the ultimate inference procedure.

Performer · MoDELS · 代碼 · Boosting（一種模型訓練加速方式） · GPT-4 ·

2024 年 2 月 2 日

Is Self-Repair a Silver Bullet for Code Generation?

Theo X. Olausson,Jeevana Priya Inala,Chenglong Wang,Jianfeng Gao,Armando Solar-Lezama

from arxiv, Accepted to ICLR 2024. Added additional Code Llama experiments and fixed a data processing error harming Code Llama's reported self-repair performance on HumanEval

Large language models have shown remarkable aptitude in code generation, but still struggle to perform complex tasks. Self-repair -- in which the model debugs and repairs its own code -- has recently become a popular way to boost performance in these settings. However, despite its increasing popularity, existing studies of self-repair have been limited in scope; in many settings, its efficacy thus remains poorly understood. In this paper, we analyze Code Llama, GPT-3.5 and GPT-4's ability to perform self-repair on problems taken from HumanEval and APPS. We find that when the cost of carrying out repair is taken into account, performance gains are often modest, vary a lot between subsets of the data, and are sometimes not present at all. We hypothesize that this is because self-repair is bottlenecked by the model's ability to provide feedback on its own code; using a stronger model to artificially boost the quality of the feedback, we observe substantially larger performance gains. Similarly, a small-scale study in which we provide GPT-4 with feedback from human participants suggests that even for the strongest models, self-repair still lags far behind what can be achieved with human-level debugging.

變換 · MoDELS · 王灝 · 語言模型化 · 詞向量表示 ·

2024 年 2 月 2 日

How Powerful are Decoder-Only Transformer Neural Models?

Jesse Roberts

In this article we prove that the general transformer neural model undergirding modern large language models (LLMs) is Turing complete under reasonable assumptions. This is the first work to directly address the Turing completeness of the underlying technology employed in GPT-x as past work has focused on the more expressive, full auto-encoder transformer architecture. From this theoretical analysis, we show that the sparsity/compressibility of the word embedding is an important consideration for Turing completeness to hold. We also show that Transformers are are a variant of B machines studied by Hao Wang.

MoDELS · 語言模型化 · SimPLe · 置換 · 大語言模型 ·

2024 年 2 月 1 日

Can Large Language Models Replace Economic Choice Prediction Labs?

Eilam Shapira,Omer Madmon,Roi Reichart,Moshe Tennenholtz

Economic choice prediction is an essential challenging task, often constrained by the difficulties in acquiring human choice data. Indeed, experimental economics studies had focused mostly on simple choice settings. The AI community has recently contributed to that effort in two ways: considering whether LLMs can substitute for humans in the above-mentioned simple choice prediction settings, and the study through ML lens of more elaborated but still rigorous experimental economics settings, employing incomplete information, repetitive play, and natural language communication, notably language-based persuasion games. This leaves us with a major inspiration: can LLMs be used to fully simulate the economic environment and generate data for efficient human choice prediction, substituting for the elaborated economic lab studies? We pioneer the study of this subject, demonstrating its feasibility. In particular, we show that a model trained solely on LLM-generated data can effectively predict human behavior in a language-based persuasion game, and can even outperform models trained on actual human data.