三级电影一区二区三区_亚洲综合在线观看一区二区三区_亚洲欧美变态国产另类_在线免费播放的黄色视频_亚洲国产精品成人一区二区久久_欧美激性欧美激情A_国产真实视频网址在线观看

A wide range of real-world applications is characterized by their symbolic nature, necessitating a strong capability for symbolic reasoning. This paper investigates the potential application of Large Language Models (LLMs) as symbolic reasoners. We focus on text-based games, significant benchmarks for agents with natural language capabilities, particularly in symbolic tasks like math, map reading, sorting, and applying common sense in text-based worlds. To facilitate these agents, we propose an LLM agent designed to tackle symbolic challenges and achieve in-game objectives. We begin by initializing the LLM agent and informing it of its role. The agent then receives observations and a set of valid actions from the text-based games, along with a specific symbolic module. With these inputs, the LLM agent chooses an action and interacts with the game environments. Our experimental results demonstrate that our method significantly enhances the capability of LLMs as automated agents for symbolic reasoning, and our LLM agent is effective in text-based games involving symbolic tasks, achieving an average performance of 88% across all tasks.

相關內容

大語(yu)言模型

關注 56

大(da)語(yu)(yu)言(yan)模型(xing)(xing)是基于海量文本(ben)(ben)數(shu)據(ju)訓(xun)練的(de)深度學習(xi)模型(xing)(xing)。它(ta)不(bu)僅(jin)能(neng)夠(gou)生成自然(ran)語(yu)(yu)言(yan)文本(ben)(ben)，還能(neng)夠(gou)深入理(li)解(jie)文本(ben)(ben)含義，處(chu)理(li)各種自然(ran)語(yu)(yu)言(yan)任(ren)務，如(ru)文本(ben)(ben)摘要、問答、翻譯等。2023年，大(da)語(yu)(yu)言(yan)模型(xing)(xing)及(ji)其在(zai)人工智(zhi)能(neng)領域的(de)應用已成為全球科技(ji)研究的(de)熱(re)點，其在(zai)規模上的(de)增長(chang)尤(you)為引人注目，參數(shu)量已從最初(chu)的(de)十幾億(yi)躍升(sheng)到如(ru)今的(de)一(yi)(yi)萬億(yi)。參數(shu)量的(de)提升(sheng)使得模型(xing)(xing)能(neng)夠(gou)更(geng)(geng)(geng)加(jia)精(jing)細地(di)捕捉(zhuo)人類語(yu)(yu)言(yan)微妙之處(chu)，更(geng)(geng)(geng)加(jia)深入地(di)理(li)解(jie)人類語(yu)(yu)言(yan)的(de)復(fu)(fu)雜性。在(zai)過去的(de)一(yi)(yi)年里，大(da)語(yu)(yu)言(yan)模型(xing)(xing)在(zai)吸納新(xin)知識、分解(jie)復(fu)(fu)雜任(ren)務以及(ji)圖文對齊等多(duo)方面都有(you)顯著提升(sheng)。隨著技(ji)術(shu)的(de)不(bu)斷成熟，它(ta)將不(bu)斷拓展(zhan)其應用范圍，為人類提供(gong)更(geng)(geng)(geng)加(jia)智(zhi)能(neng)化(hua)和(he)個性化(hua)的(de)服(fu)務，進一(yi)(yi)步改善人們的(de)生活和(he)生產方式。

大語言模型 · 語言模型化 · MoDELS · Performer · 黑盒 ·

2024 年 2 月 28 日

Large Language Models As Evolution Strategies

Robert Tjarko Lange,Yingtao Tian,Yujin Tang

from arxiv, 11 pages, 14 figures

Large Transformer models are capable of implementing a plethora of so-called in-context learning algorithms. These include gradient descent, classification, sequence completion, transformation, and improvement. In this work, we investigate whether large language models (LLMs), which never explicitly encountered the task of black-box optimization, are in principle capable of implementing evolutionary optimization algorithms. While previous works have solely focused on language-based task specification, we move forward and focus on the zero-shot application of LLMs to black-box optimization. We introduce a novel prompting strategy, consisting of least-to-most sorting of discretized population members and querying the LLM to propose an improvement to the mean statistic, i.e. perform a type of black-box recombination operation. Empirically, we find that our setup allows the user to obtain an LLM-based evolution strategy, which we call `EvoLLM', that robustly outperforms baseline algorithms such as random search and Gaussian Hill Climbing on synthetic BBOB functions as well as small neuroevolution tasks. Hence, LLMs can act as `plug-in' in-context recombination operators. We provide several comparative studies of the LLM's model size, prompt strategy, and context construction. Finally, we show that one can flexibly improve EvoLLM's performance by providing teacher algorithm information via instruction fine-tuning on previously collected teacher optimization trajectories.

MoDELS · 可理解性 · 語言模型化 · Learning · 代碼 ·

2024 年 2 月 27 日

Towards Understanding What Code Language Models Learned

Toufique Ahmed,Dian Yu,Chengxuan Huang,Cathy Wang,Prem Devanbu,Kenji Sagae

Pre-trained language models are effective in a variety of natural language tasks, but it has been argued their capabilities fall short of fully learning meaning or understanding language. To understand the extent to which language models can learn some form of meaning, we investigate their ability to capture semantics of code beyond superficial frequency and co-occurrence. In contrast to previous research on probing models for linguistic features, we study pre-trained models in a setting that allows for objective and straightforward evaluation of a model's ability to learn semantics. In this paper, we examine whether such models capture the semantics of code, which is precisely and formally defined. Through experiments involving the manipulation of code fragments, we show that code pre-trained models of code learn a robust representation of the computational semantics of code that goes beyond superficial features of form alone

因子分析 · 分解的 · Analysis · 稀疏 · 觀測變量 ·

2024 年 2 月 27 日

Algebraic Sparse Factor Analysis

Mathias Drton,Alexandros Grosdos,Irem Portakal,Nils Sturma

from arxiv, 24 pages

Factor analysis is a statistical technique that explains correlations among observed random variables with the help of a smaller number of unobserved factors. In traditional full factor analysis, each observed variable is influenced by every factor. However, many applications exhibit interesting sparsity patterns, that is, each observed variable only depends on a subset of the factors. In this paper, we study such sparse factor analysis models from an algebro-geometric perspective. Under mild conditions on the sparsity pattern, we examine the dimension of the set of covariance matrices that corresponds to a given model. Moreover, we study algebraic relations among the covariances in sparse two-factor models. In particular, we identify cases in which a Gr\"obner basis for these relations can be derived via a 2-delightful term order and joins of toric edge ideals.

優化器 · MoDELS · 情景 · Performer · INFORMS ·

2024 年 2 月 27 日

Inverse Transfer Multiobjective Optimization

Jiao Liu,Abhishek Gupta,Yew-Soon Ong

Transfer optimization enables data-efficient optimization of a target task by leveraging experiential priors from related source tasks. This is especially useful in multiobjective optimization settings where a set of trade-off solutions is sought under tight evaluation budgets. In this paper, we introduce a novel concept of inverse transfer in multiobjective optimization. Inverse transfer stands out by employing probabilistic inverse models to map performance vectors in the objective space to population search distributions in task-specific decision space, facilitating knowledge transfer through objective space unification. Building upon this idea, we introduce the first Inverse Transfer Multiobjective Evolutionary Optimizer (invTrEMO). A key highlight of invTrEMO is its ability to harness the common objective functions prevalent in many application areas, even when decision spaces do not precisely align between tasks. This allows invTrEMO to uniquely and effectively utilize information from heterogeneous source tasks as well. Furthermore, invTrEMO yields high-precision inverse models as a significant byproduct, enabling the generation of tailored solutions on-demand based on user preferences. Empirical studies on multi- and many-objective benchmark problems, as well as a practical case study, showcase the faster convergence rate and modelling accuracy of the invTrEMO relative to state-of-the-art evolutionary and Bayesian optimization algorithms. The source code of the invTrEMO is made available at //github.com/LiuJ-2023/invTrEMO.

解碼 · Processing（編程語言） · Learning · AIM · 可理解性 ·

2024 年 2 月 26 日

Towards Decoding Brain Activity During Passive Listening of Speech

Milán András Fodor,Tamás Gábor Csapó,Frigyes Viktor Arthur

from arxiv, 27 pages, 7 figures

The aim of the study is to investigate the complex mechanisms of speech perception and ultimately decode the electrical changes in the brain accruing while listening to speech. We attempt to decode heard speech from intracranial electroencephalographic (iEEG) data using deep learning methods. The goal is to aid the advancement of brain-computer interface (BCI) technology for speech synthesis, and, hopefully, to provide an additional perspective on the cognitive processes of speech perception. This approach diverges from the conventional focus on speech production and instead chooses to investigate neural representations of perceived speech. This angle opened up a complex perspective, potentially allowing us to study more sophisticated neural patterns. Leveraging the power of deep learning models, the research aimed to establish a connection between these intricate neural activities and the corresponding speech sounds. Despite the approach not having achieved a breakthrough yet, the research sheds light on the potential of decoding neural activity during speech perception. Our current efforts can serve as a foundation, and we are optimistic about the potential of expanding and improving upon this work to move closer towards more advanced BCIs, better understanding of processes underlying perceived speech and its relation to spoken speech.

Analysis · Networking · Group Lasso · 近似誤差 · 近似 ·

2024 年 2 月 26 日

Penalized Generative Variable Selection

Tong Wang,Jian Huang,Shuangge Ma

Deep networks are increasingly applied to a wide variety of data, including data with high-dimensional predictors. In such analysis, variable selection can be needed along with estimation/model building. Many of the existing deep network studies that incorporate variable selection have been limited to methodological and numerical developments. In this study, we consider modeling/estimation using the conditional Wasserstein Generative Adversarial networks. Group Lasso penalization is applied for variable selection, which may improve model estimation/prediction, interpretability, stability, etc. Significantly advancing from the existing literature, the analysis of censored survival data is also considered. We establish the convergence rate for variable selection while considering the approximation error, and obtain a more efficient distribution estimation. Simulations and the analysis of real experimental data demonstrate satisfactory practical utility of the proposed analysis.

MoDELS · 學成 · Networking · 動力系統 · Neural Networks ·

2022 年 2 月 4 日

On Neural Differential Equations

Patrick Kidger

from arxiv, Doctoral thesis, Mathematical Institute, University of Oxford. 231 pages

The conjoining of dynamical systems and deep learning has become a topic of great interest. In particular, neural differential equations (NDEs) demonstrate that neural networks and differential equation are two sides of the same coin. Traditional parameterised differential equations are a special case. Many popular neural network architectures, such as residual networks and recurrent networks, are discretisations. NDEs are suitable for tackling generative problems, dynamical systems, and time series (particularly in physics, finance, ...) and are thus of interest to both modern machine learning and traditional mathematical modelling. NDEs offer high-capacity function approximation, strong priors on model space, the ability to handle irregular data, memory efficiency, and a wealth of available theory on both sides. This doctoral thesis provides an in-depth survey of the field. Topics include: neural ordinary differential equations (e.g. for hybrid neural/mechanistic modelling of physical systems); neural controlled differential equations (e.g. for learning functions of irregular time series); and neural stochastic differential equations (e.g. to produce generative models capable of representing complex stochastic dynamics, or sampling from complex high-dimensional distributions). Further topics include: numerical methods for NDEs (e.g. reversible differential equations solvers, backpropagation through differential equations, Brownian reconstruction); symbolic regression for dynamical systems (e.g. via regularised evolution); and deep implicit models (e.g. deep equilibrium models, differentiable optimisation). We anticipate this thesis will be of interest to anyone interested in the marriage of deep learning with dynamical systems, and hope it will provide a useful reference for the current state of the art.

entity · 圖 · 知識圖譜 · MoDELS · 相似度 ·

2019 年 9 月 11 日

Domain Representation for Knowledge Graph Embedding

Cunxiang Wang,Feiliang Ren,Zhichao Lin,Chenxv Zhao,Tian Xie,Yue Zhang

from arxiv, Acceptted by NLPCC2019

Embedding entities and relations into a continuous multi-dimensional vector space have become the dominant method for knowledge graph embedding in representation learning. However, most existing models ignore to represent hierarchical knowledge, such as the similarities and dissimilarities of entities in one domain. We proposed to learn a Domain Representations over existing knowledge graph embedding models, such that entities that have similar attributes are organized into the same domain. Such hierarchical knowledge of domains can give further evidence in link prediction. Experimental results show that domain embeddings give a significant improvement over the most recent state-of-art baseline knowledge graph embedding models.

Performer · 深度強化學習 · 學成 · entity · 強化學習 ·

2018 年 6 月 28 日

Relational Deep Reinforcement Learning

Vinicius Zambaldi,David Raposo,Adam Santoro,Victor Bapst,Yujia Li,Igor Babuschkin,Karl Tuyls,David Reichert,Timothy Lillicrap,Edward Lockhart,Murray Shanahan,Victoria Langston,Razvan Pascanu,Matthew Botvinick,Oriol Vinyals,Peter Battaglia

We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning. It uses self-attention to iteratively reason about the relations between entities in a scene and to guide a model-free policy. Our results show that in a novel navigation and planning task called Box-World, our agent finds interpretable solutions that improve upon baselines in terms of sample complexity, ability to generalize to more complex scenes than experienced during training, and overall performance. In the StarCraft II Learning Environment, our agent achieves state-of-the-art performance on six mini-games -- surpassing human grandmaster performance on four. By considering architectural inductive biases, our work opens new directions for overcoming important, but stubborn, challenges in deep RL.

BLEU · MoDELS · 注意力機制 · Transformer · Networking ·

2017 年 12 月 6 日

Attention Is All You Need

Ashish Vaswani,Noam Shazeer,Niki Parmar,Jakob Uszkoreit,Llion Jones,Aidan N. Gomez,Lukasz Kaiser,Illia Polosukhin

from arxiv, 15 pages, 5 figures

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.