精品亚洲中文一区二区三区,国产欧美日韩精品A在线播放,一区二区三区无码免费高清视频

To what extent can language alone give rise to complex concepts, or is embodied experience essential? Recent advancements in large language models (LLMs) offer fresh perspectives on this question. Although LLMs are trained on restricted modalities, they exhibit human-like performance in diverse psychological tasks. Our study compared representations of 4,442 lexical concepts between humans and ChatGPTs (GPT-3.5 and GPT-4) across multiple dimensions, including five key domains: emotion, salience, mental visualization, sensory, and motor experience. We identify two main findings: 1) Both models strongly align with human representations in non-sensorimotor domains but lag in sensory and motor areas, with GPT-4 outperforming GPT-3.5; 2) GPT-4's gains are associated with its additional visual learning, which also appears to benefit related dimensions like haptics and imageability. These results highlight the limitations of language in isolation, and that the integration of diverse modalities of inputs leads to a more human-like conceptual representation.

相關內容

語言模型化

關注 9

語言模型化 · MoDELS · SimPLe · 詞元分析器 · 解碼 ·

2024 年 1 月 19 日

A Simple Framework to Accelerate Multilingual Language Model for Monolingual Text Generation

Jimin Hong,Gibbeum Lee,Jaewoong Cho

Recent advancements in large language models have facilitated the execution of complex language tasks, not only in English but also in non-English languages. However, the tokenizers of most language models, such as Llama, trained on English-centric corpora, tend to excessively fragment tokens in non-English languages. This issue is especially pronounced in non-roman alphabetic languages, which are often divided at a character or even Unicode level, leading to slower text generation. To address this, our study introduces a novel framework designed to expedite text generation in these languages. This framework predicts larger linguistic units than those of conventional multilingual tokenizers and is specifically tailored to the target language, thereby reducing the number of decoding steps required. Our empirical results demonstrate that the proposed framework increases the generation speed by a factor of 1.9 compared to standard decoding while maintaining the performance of a pre-trained multilingual model on monolingual tasks.

圖像修復 · 語言模型化 · 大語言模型 · 多峰值 · MoDELS ·

2024 年 1 月 18 日

Towards Language-Driven Video Inpainting via Multimodal Large Language Models

Jianzong Wu,Xiangtai Li,Chenyang Si,Shangchen Zhou,Jingkang Yang,Jiangning Zhang,Yining Li,Kai Chen,Yunhai Tong,Ziwei Liu,Chen Change Loy

from arxiv, Project Page: //jianzongwu.github.io/projects/rovi

We introduce a new task -- language-driven video inpainting, which uses natural language instructions to guide the inpainting process. This approach overcomes the limitations of traditional video inpainting methods that depend on manually labeled binary masks, a process often tedious and labor-intensive. We present the Remove Objects from Videos by Instructions (ROVI) dataset, containing 5,650 videos and 9,091 inpainting results, to support training and evaluation for this task. We also propose a novel diffusion-based language-driven video inpainting framework, the first end-to-end baseline for this task, integrating Multimodal Large Language Models to understand and execute complex language-based inpainting requests effectively. Our comprehensive results showcase the dataset's versatility and the model's effectiveness in various language-instructed inpainting scenarios. We will make datasets, code, and models publicly available.

Agent · Performer · Processing（編程語言） · 得分 · 可約的 ·

2024 年 1 月 17 日

Rational Decision-Making Agent with Internalized Utility Judgment

Yining Ye,Xin Cong,Shizuo Tian,Yujia Qin,Chong Liu,Yankai Lin,Zhiyuan Liu,Maosong Sun

from arxiv, Received 8,6,6,6 scores on ICLR 2024

Large language models (LLMs) have demonstrated remarkable advancements and have attracted significant efforts to develop LLMs into agents capable of executing intricate multi-step decision-making tasks beyond traditional NLP applications. Existing approaches to LLM-based decision-making predominantly build upon the manually-designed external performance metrics to guide the decision-making process. However, reliance on the external performance metrics as prior is problematic in real-world scenarios, where such prior may be unavailable, flawed, or even erroneous. For genuine autonomous decision making, it is imperative for the agent to develop its rationality from its posterior experiences to judge decisions independently. Central to the development of rationality is the construction of an internalized utility judgment, capable of assigning numerical utilities to each decision. This paper proposes RadAgent (Rational Decision-Making Agent), which fosters the development of its rationality through an iterative framework involving Experience Exploration and Utility Learning. Within this framework, Elo-based Utility Construction is devised to assign Elo scores to individual decision steps to judge their utilities via pairwise comparisons. Consequently, these Elo scores guide the decision-making process to derive optimal outcomes. Experimental results on the ToolBench dataset demonstrate RadAgent's superiority over baselines, achieving over 10% improvement in Pass Rate on diverse tasks. It offers higher-quality solutions and reduces costs (ChatGPT API calls), highlighting its effectiveness and efficiency.

語言模型化 · 大語言模型 · GPT-4 · MoDELS · Performer ·

2024 年 1 月 16 日

Large Language Model Displays Emergent Ability to Interpret Novel Literary Metaphors

Nicholas Ichien,Du?an Stamenkovi?,Keith J. Holyoak

Recent advances in the performance of large language models (LLMs) have sparked debate over whether, given sufficient training, high-level human abilities emerge in such generic forms of artificial intelligence (AI). Despite the exceptional performance of LLMs on a wide range of tasks involving natural language processing and reasoning, there has been sharp disagreement as to whether their abilities extend to more creative human abilities. A core example is the ability to interpret novel metaphors. Given the enormous and non curated text corpora used to train LLMs, a serious obstacle to designing tests is the requirement of finding novel yet high quality metaphors that are unlikely to have been included in the training data. Here we assessed the ability of GPT4, a state of the art large language model, to provide natural-language interpretations of novel literary metaphors drawn from Serbian poetry and translated into English. Despite exhibiting no signs of having been exposed to these metaphors previously, the AI system consistently produced detailed and incisive interpretations. Human judges, blind to the fact that an AI model was involved, rated metaphor interpretations generated by GPT4 as superior to those provided by a group of college students. In interpreting reversed metaphors, GPT4, as well as humans, exhibited signs of sensitivity to the Gricean cooperative principle. In addition, for several novel English poems GPT4 produced interpretations that were rated as excellent or good by a human literary critic. These results indicate that LLMs such as GPT4 have acquired an emergent ability to interpret complex metaphors, including those embedded in novel poems.

估計/估計量 · contrastive · INFORMS · 互信息 · 表示學習 ·

2021 年 6 月 25 日

Decomposed Mutual Information Estimation for Contrastive Representation Learning

Alessandro Sordoni,Nouha Dziri,Hannes Schulz,Geoff Gordon,Phil Bachman,Remi Tachet

from arxiv, ICML 2021

Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. E.g., we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring modest chunks of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and learns better representations in a vision domain and for dialogue generation.

圖卷積神經網絡/圖卷積網絡 · 圖 · entity · 圖卷積 · 卷積 ·

2021 年 4 月 23 日

Knowledge Embedding Based Graph Convolutional Network

Donghan Yu,Yiming Yang,Ruohong Zhang,Yuexin Wu

from arxiv, WWW 2021

Recently, a considerable literature has grown up around the theme of Graph Convolutional Network (GCN). How to effectively leverage the rich structural information in complex graphs, such as knowledge graphs with heterogeneous types of entities and relations, is a primary open challenge in the field. Most GCN methods are either restricted to graphs with a homogeneous type of edges (e.g., citation links only), or focusing on representation learning for nodes only instead of jointly propagating and updating the embeddings of both nodes and edges for target-driven objectives. This paper addresses these limitations by proposing a novel framework, namely the Knowledge Embedding based Graph Convolutional Network (KE-GCN), which combines the power of GCNs in graph-based belief propagation and the strengths of advanced knowledge embedding (a.k.a. knowledge graph embedding) methods, and goes beyond. Our theoretical analysis shows that KE-GCN offers an elegant unification of several well-known GCN methods as specific cases, with a new perspective of graph convolution. Experimental results on benchmark datasets show the advantageous performance of KE-GCN over strong baseline methods in the tasks of knowledge graph alignment and entity classification.

可辨認的 · 目標檢測 · 可約的 · 示例 · 學成 ·

2021 年 3 月 3 日

Towards Open World Object Detection

K J Joseph,Salman Khan,Fahad Shahbaz Khan,Vineeth N Balasubramanian

from arxiv, To appear in CVPR 2021 as an ORAL paper. Code is available in //github.com/JosephKJ/OWOD

Humans have a natural instinct to identify unknown object instances in their environments. The intrinsic curiosity about these unknown instances aids in learning about them, when the corresponding knowledge is eventually available. This motivates us to propose a novel computer vision problem called: `Open World Object Detection', where a model is tasked to: 1) identify objects that have not been introduced to it as `unknown', without explicit supervision to do so, and 2) incrementally learn these identified unknown categories without forgetting previously learned classes, when the corresponding labels are progressively received. We formulate the problem, introduce a strong evaluation protocol and provide a novel solution, which we call ORE: Open World Object Detector, based on contrastive clustering and energy based unknown identification. Our experimental evaluation and ablation studies analyze the efficacy of ORE in achieving Open World objectives. As an interesting by-product, we find that identifying and characterizing unknown instances helps to reduce confusion in an incremental object detection setting, where we achieve state-of-the-art performance, with no extra methodological effort. We hope that our work will attract further research into this newly identified, yet crucial research direction.

圖卷積神經網絡/圖卷積網絡 · 圖卷積 · 圖 · Networking · 可約的 ·

2019 年 2 月 19 日

Simplifying Graph Convolutional Networks

Felix Wu,Tianyi Zhang,Amauri Holanda de Souza Jr.,Christopher Fifty,Tao Yu,Kilian Q. Weinberger

from arxiv, Code available at //github.com/Tiiiger/SGC

Graph Convolutional Networks (GCNs) and their variants have experienced significant attention and have become the de facto methods for learning graph representations. GCNs derive inspiration primarily from recent deep learning approaches, and as a result, may inherit unnecessary complexity and redundant computation. In this paper, we reduce this excess complexity through successively removing nonlinearities and collapsing weight matrices between consecutive layers. We theoretically analyze the resulting linear model and show that it corresponds to a fixed low-pass filter followed by a linear classifier. Notably, our experimental evaluation demonstrates that these simplifications do not negatively impact accuracy in many downstream applications. Moreover, the resulting model scales to larger datasets, is naturally interpretable, and yields up to two orders of magnitude speedup over FastGCN.

矩陣微積分 · 可理解性 · 學成 · Neural Networks · Networks ·

2018 年 7 月 2 日

The Matrix Calculus You Need For Deep Learning

Terence Parr,Jeremy Howard

from arxiv, PDF version of mobile/web friendly version //explained.ai/matrix-calculus/index.html

This paper is an attempt to explain all the matrix calculus you need in order to understand the training of deep neural networks. We assume no math knowledge beyond what you learned in calculus 1, and provide links to help you refresh the necessary math where needed. Note that you do not need to understand this material before you start learning to train and use deep learning in practice; rather, this material is for those who are already familiar with the basics of neural networks, and wish to deepen their understanding of the underlying math. Don't worry if you get stuck at some point along the way---just go back and reread the previous section, and try writing down and working through some examples. And if you're still stuck, we're happy to answer your questions in the Theory category at forums.fast.ai. Note: There is a reference section at the end of the paper summarizing all the key matrix calculus rules and terminology discussed here. See related articles at //explained.ai

Neural Networks · Networking · 卷積 · 卷積神經網絡 · Extensibility ·

2018 年 6 月 27 日

Bayesian Convolutional Neural Networks

Felix Laumann,Kumar Shridhar,Adrian Llopart Maurin

We propose a Bayesian convolutional neural network built upon Bayes by Backprop and elaborate how this known method can serve as the fundamental construct of our novel, reliable variational inference method for convolutional neural networks. First, we show how Bayes by Backprop can be applied to convolutional layers where weights in filters have probability distributions instead of point-estimates; and second, how our proposed framework leads with various network architectures to performances comparable to convolutional neural networks with point-estimates weights. In the past, Bayes by Backprop has been successfully utilised in feedforward and recurrent neural networks, but not in convolutional ones. This work symbolises the extension of the group of Bayesian neural networks which encompasses all three aforementioned types of network architectures now.