The advantages of pre-trained large language models (LLMs) are apparent in a variety of language processing tasks. But can a language model's knowledge be further harnessed to effectively disambiguate objects and navigate decision-making challenges within the realm of robotics? Our study reveals the LLM's aptitude for solving complex decision-making challenges that have often been modeled as Partially Observable Markov Decision Processes (POMDPs). A pivotal focus of our research is the object disambiguation capability of LLMs. We detail the integration of an LLM into a tabletop environment disambiguation task, a decision-making problem where the robot's task is to discern and retrieve a user's desired object from an arbitrarily large and complex cluster of objects. Despite multiple query attempts with zero-shot prompt engineering (details can be found in the Appendix), the LLM struggled to inquire about features not explicitly provided in the scene description. In response, we have developed a few-shot prompt engineering system to improve the LLM's ability to pose disambiguating queries. The result is a model capable of both using given features when they are available and inferring new relevant features when necessary, to successfully generate and navigate down a precise decision tree to the correct object, even when faced with identical options.

Related content

Large language models

A large language model is a deep learning model trained on massive amounts of text data. It can not only generate natural language text but also understand the meaning of text in depth and handle a variety of natural language tasks, such as text summarization, question answering, and translation. In 2023, large language models and their applications in artificial intelligence became a focus of technology research worldwide. Their growth in scale has been especially striking, with parameter counts rising from just over a billion at the outset to one trillion today. The increase in parameters lets models capture the subtleties of human language more finely and understand its complexity more deeply. Over the past year, large language models improved markedly in absorbing new knowledge, decomposing complex tasks, and aligning images with text. As the technology matures, it will continue to expand its range of applications, providing people with more intelligent and personalized services and further improving how people live and work.

Knowledge · MoDELS · Large language models · Graph · Language modeling

February 21, 2024

Knowledge Graph Enhanced Large Language Model Editing

Mengqi Zhang, Xiaotian Ye, Qiang Liu, Pengjie Ren, Shu Wu, Zhumin Chen

Large language models (LLMs) are pivotal in advancing natural language processing (NLP) tasks, yet their efficacy is hampered by inaccuracies and outdated knowledge. Model editing emerges as a promising solution to address these challenges. However, existing editing methods struggle to track and incorporate changes in knowledge associated with edits, which limits the generalization ability of post-edit LLMs in processing edited knowledge. To tackle these problems, we propose a novel model editing method that leverages knowledge graphs for enhancing LLM editing, namely GLAME. Specifically, we first utilize a knowledge graph augmentation module to uncover associated knowledge that has changed due to editing, obtaining its internal representations within LLMs. This approach allows knowledge alterations within LLMs to be reflected through an external graph structure. Subsequently, we design a graph-based knowledge edit module to integrate structured knowledge into the model editing. This ensures that the updated parameters reflect not only the modifications of the edited knowledge but also the changes in other associated knowledge resulting from the editing process. Comprehensive experiments conducted on GPT-J and GPT-2 XL demonstrate that GLAME significantly improves the generalization capabilities of post-edit LLMs in employing edited knowledge.

MoDELS · Large language models · Training data · Hacking · Estimation/estimator

February 20, 2024

Bayesian Reward Models for LLM Alignment

Adam X. Yang, Maxime Robeyns, Thomas Coste, Jun Wang, Haitham Bou-Ammar, Laurence Aitchison

To ensure that large language model (LLM) responses are helpful and non-toxic, we usually fine-tune a reward model on human preference data. We then select policy responses with high rewards (best-of-n sampling) or further optimize the policy to produce responses with high rewards (reinforcement learning from human feedback). However, this process is vulnerable to reward overoptimization or hacking, in which the responses selected have high rewards due to errors in the reward model rather than a genuine preference. This is especially problematic as the prompt or response diverges from the training data. It should be possible to mitigate these issues by training a Bayesian reward model, which signals higher uncertainty further from the training data distribution. Therefore, we trained Bayesian reward models using Laplace-LoRA (Yang et al., 2024) and found that the resulting uncertainty estimates can successfully mitigate reward overoptimization in best-of-n sampling.
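As a rough illustration of how uncertainty estimates might enter best-of-n sampling, the sketch below picks the candidate with the highest uncertainty-penalized reward. The penalty form (mean minus `k` times the standard deviation), the function name `select_best_of_n`, and the example numbers are assumptions made for illustration; the abstract does not specify how the uncertainty is combined with the reward.

```python
def select_best_of_n(reward_means, reward_stds, k=1.0):
    """Return the index of the candidate with the highest penalized reward.

    reward_means, reward_stds: per-candidate mean and standard deviation of
    the reward, e.g. from some Bayesian reward model (hypothetical inputs).
    k: penalty strength (an assumed knob); k=0 recovers plain best-of-n.
    """
    if not reward_means or len(reward_means) != len(reward_stds):
        raise ValueError("need two non-empty lists of equal length")
    # Penalize candidates whose reward estimate is uncertain.
    penalized = [m - k * s for m, s in zip(reward_means, reward_stds)]
    return max(range(len(penalized)), key=penalized.__getitem__)


# Illustrative numbers only: candidate 1 has the highest mean reward but
# also a very uncertain estimate.
means = [1.0, 2.5, 2.0]
stds = [0.1, 2.0, 0.2]

plain_choice = select_best_of_n(means, stds, k=0.0)
cautious_choice = select_best_of_n(means, stds, k=1.0)
```

With `k=0` the highest-mean candidate is chosen regardless of uncertainty; with a positive `k` a candidate with a slightly lower but more certain reward can win instead.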

Normalized · Transcription · Natural language processing

February 20, 2024

Normalized Orthography for Tunisian Arabic

Houcemeddine Turki, Kawthar Ellouze, Hager Ben Ammar, Mohamed Ali Hadj Taieb, Imed Adel, Mohamed Ben Aouicha, Pier Luigi Farri, Abderrezak Bennour

from arXiv, Final Report for the Derja Association

Tunisian Arabic (ISO 639-3: aeb) is a distinct linguistic variety native to Tunisia that originally stemmed from the Arabic language and was enriched by a multitude of historical influences. This research introduces the "Normalized Orthography for Tunisian Arabic" (NOTA), an adaptation of the CODA* guidelines tailored for transcribing Tunisian Arabic in the Arabic script for language resource development, with an emphasis on user-friendliness and consistency. The updated standard seeks to address challenges in accurately representing the unique characteristics of Tunisian phonology and morphology. This will be achieved by rectifying problems arising from transcriptions based on resemblances to Modern Standard Arabic.

MoDELS · Manifold · Diversity · Controller · Subspace

February 19, 2024

Mixed Gaussian Flow for Diverse Trajectory Prediction

Jiahe Chen, Jinkun Cao, Dahua Lin, Kris Kitani, Jiangmiao Pang

Existing trajectory prediction studies intensively leverage generative models. Normalizing flows are one such family, with the advantage of being invertible, which allows the probability density of predicted trajectories to be derived. However, mapping from a standard Gaussian by a flow-based model hurts the capacity to capture complicated patterns of trajectories, ignoring the under-represented motion intentions in the training data. To solve the problem, we propose a flow-based model to transform a mixed Gaussian prior into the future trajectory manifold. The model shows a better capacity for generating diverse trajectory patterns. Also, by associating each sub-Gaussian with a certain subspace of trajectories, we can generate future trajectories with controllable motion intentions. In such a fashion, the flow-based model is no longer encouraged simply to seek the maximum likelihood of the intended manifold, but rather a family of controlled manifolds with explicit interpretability. Our proposed method is demonstrated to show state-of-the-art performance in the quantitative evaluation of sampling well-aligned trajectories in top-M generated candidates. We also demonstrate that it can generate diverse, controllable, and out-of-distribution trajectories. Code is available at https://github.com/mulplue/MGF.
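The idea of pushing a mixed Gaussian prior through an invertible map can be sketched in one dimension. Everything below is a toy under stated assumptions: the two mixture components, the affine map standing in for a learned flow, and all names (`COMPONENTS`, `flow_forward`, `log_density`) are hypothetical and are not taken from the MGF code.

```python
import math
import random

# Hypothetical mixture prior: each component stands for one motion intention.
COMPONENTS = [
    {"weight": 0.5, "mean": -2.0, "std": 0.5},
    {"weight": 0.5, "mean": 2.0, "std": 0.5},
]

# Toy invertible map standing in for a learned flow: x = SCALE * z + SHIFT.
SCALE, SHIFT = 3.0, 1.0


def sample_prior(rng, component=None):
    """Draw z from the mixture; fixing `component` selects the intention."""
    if component is None:
        weights = [c["weight"] for c in COMPONENTS]
        component = rng.choices(range(len(COMPONENTS)), weights=weights)[0]
    c = COMPONENTS[component]
    return rng.gauss(c["mean"], c["std"]), component


def prior_log_density(z):
    """Log density of the mixture, computed with log-sum-exp for stability."""
    logs = []
    for c in COMPONENTS:
        u = (z - c["mean"]) / c["std"]
        logs.append(math.log(c["weight"]) - math.log(c["std"])
                    - 0.5 * math.log(2.0 * math.pi) - 0.5 * u * u)
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))


def flow_forward(z):
    return SCALE * z + SHIFT


def flow_inverse(x):
    return (x - SHIFT) / SCALE


def log_density(x):
    """Change of variables: log p(x) = log p_z(f^-1(x)) - log|df/dz|."""
    return prior_log_density(flow_inverse(x)) - math.log(abs(SCALE))


rng = random.Random(0)
z, chosen = sample_prior(rng, component=1)  # controlled sampling
x = flow_forward(z)                         # mapped sample
```

Because the map is invertible, the density of any output can be evaluated exactly, and choosing the mixture component before sampling is what makes the generated sample controllable in this toy setting.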

Agent · Automator · Large language models · Cognition · CEP architecture

February 19, 2024

Comprehensive Cognitive LLM Agent for Smartphone GUI Automation

Xinbei Ma, Zhuosheng Zhang, Hai Zhao

Large language models (LLMs) have shown remarkable potential as human-like autonomous language agents to interact with real-world environments, especially for graphical user interface (GUI) automation. However, those GUI agents require comprehensive cognitive abilities, including exhaustive perception and reliable action response. We propose the Comprehensive Cognitive LLM Agent, CoCo-Agent, with two novel approaches, comprehensive environment perception (CEP) and conditional action prediction (CAP), to systematically improve the GUI automation performance. First, CEP facilitates the GUI perception through different aspects and granularity, including screenshots and complementary detailed layouts for the visual channel and historical actions for the textual channel. Second, CAP decomposes the action prediction into sub-problems: action type prediction and action target conditioned on the action type. With our technical design, our agent achieves new state-of-the-art performance on AITW and META-GUI benchmarks, showing promising abilities in realistic scenarios.

Discretization · Learning · Autoencoder · Loss · MoDELS

February 16, 2024

Symbolic Autoencoding for Self-Supervised Sequence Learning

Mohammad Hossein Amani, Nicolas Mario Baldwin, Amin Mansouri, Martin Josifoski, Maxime Peyrard, Robert West

Traditional language models, adept at next-token prediction in text sequences, often struggle with transduction tasks between distinct symbolic systems, particularly when parallel data is scarce. Addressing this issue, we introduce symbolic autoencoding (ΣAE), a self-supervised framework that harnesses the power of abundant non-parallel data alongside limited parallel data. ΣAE connects two generative models via a discrete bottleneck layer and is optimized end-to-end by minimizing reconstruction loss (simultaneously with supervised loss for the parallel data), such that the sequence generated by the discrete bottleneck can be read out as the transduced input sequence. We also develop gradient-based methods allowing for efficient self-supervised sequence learning despite the discreteness of the bottleneck. Our results demonstrate that ΣAE significantly enhances performance on transduction tasks, even with minimal parallel data, offering a promising solution for weakly supervised learning scenarios.

MoDELS · Performer · Better · Overestimation · Reducible

February 15, 2024

A StrongREJECT for Empty Jailbreaks

Alexandra Souly, Qingyuan Lu, Dillon Bowen, Tu Trinh, Elvis Hsieh, Sana Pandey, Pieter Abbeel, Justin Svegliato, Scott Emmons, Olivia Watkins, Sam Toyer

from arXiv, Code and data at https://github.com/alexandrasouly/strongreject

The rise of large language models (LLMs) has drawn attention to the existence of "jailbreaks" that allow the models to be used maliciously. However, there is no standard benchmark for measuring the severity of a jailbreak, leaving authors of jailbreak papers to create their own. We show that these benchmarks often include vague or unanswerable questions and use grading criteria that are biased towards overestimating the misuse potential of low-quality model responses. Some jailbreak techniques make the problem worse by decreasing the quality of model responses even on benign questions: we show that several jailbreaking techniques substantially reduce the zero-shot performance of GPT-4 on MMLU. Jailbreaks can also make it harder to elicit harmful responses from an "uncensored" open-source model. We present a new benchmark, StrongREJECT, which better discriminates between effective and ineffective jailbreaks by using a higher-quality question set and a more accurate response grading algorithm. We show that our new grading scheme better accords with human judgment of response quality and overall jailbreak effectiveness, especially on the sort of low-quality responses that contribute the most to over-estimation of jailbreak performance on existing benchmarks. We release our code and data at https://github.com/alexandrasouly/strongreject.

MoDELS · Taxonomy · Language modeling · Comprehensibility · Performance

September 2, 2023

Explainability for Large Language Models: A Survey

Haiyan Zhao, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Mengnan Du

Large language models (LLMs) have demonstrated impressive capabilities in natural language processing. However, their internal mechanisms are still unclear, and this lack of transparency poses unwanted risks for downstream applications. Therefore, understanding and explaining these models is crucial for elucidating their behaviors, limitations, and social impacts. In this paper, we introduce a taxonomy of explainability techniques and provide a structured overview of methods for explaining Transformer-based language models. We categorize techniques based on the training paradigms of LLMs: the traditional fine-tuning-based paradigm and the prompting-based paradigm. For each paradigm, we summarize the goals and dominant approaches for generating local explanations of individual predictions and global explanations of overall model knowledge. We also discuss metrics for evaluating generated explanations and how explanations can be leveraged to debug models and improve performance. Lastly, we examine key challenges and emerging opportunities for explanation techniques in the era of LLMs in comparison to conventional machine learning models.

MoDELS · Language modeling · Task-oriented dialogue systems · Topic · Vision

March 26, 2022

A Roadmap for Big Model

Sha Yuan, Hanyu Zhao, Shuai Zhao, Jiahong Leng, Yangxiao Liang, Xiaozhi Wang, Jifan Yu, Xin Lv, Zhou Shao, Jiaao He, Yankai Lin, Xu Han, Zhenghao Liu, Ning Ding, Yongming Rao, Yizhao Gao, Liang Zhang, Ming Ding, Cong Fang, Yisen Wang, Mingsheng Long, Jing Zhang, Yinpeng Dong, Tianyu Pang, Peng Cui, Lingxiao Huang, Zheng Liang, Huawei Shen, Hui Zhang, Quanshi Zhang, Qingxiu Dong, Zhixing Tan, Mingxuan Wang, Shuo Wang, Long Zhou, Haoran Li, Junwei Bao, Yingwei Pan, Weinan Zhang, Zhou Yu, Rui Yan, Chence Shi, Minghao Xu, Zuobai Zhang, Guoqiang Wang, Xiang Pan, Mengjie Li, Xiaoyu Chu, Zijun Yao, Fangwei Zhu, Shulin Cao, Weicheng Xue, Zixuan Ma, Zhengyan Zhang, Shengding Hu, Yujia Qin, Chaojun Xiao, Zheni Zeng, Ganqu Cui, Weize Chen, Weilin Zhao, Yuan Yao, Peng Li, Wenzhao Zheng, Wenliang Zhao, Ziyi Wang, Borui Zhang, Nanyi Fei, Anwen Hu, Zenan Ling, Haoyang Li, Boxi Cao, Xianpei Han, Weidong Zhan, Baobao Chang, Hao Sun, Jiawen Deng, Juanzi Li, Lei Hou, Xigang Cao, Jidong Zhai, Zhiyuan Liu, Maosong Sun, Jiwen Lu, Zhiwu Lu, Qin Jin, Ruihua Song, Ji-Rong Wen, Zhouchen Lin, Liwei Wang, Hang Su, Jun Zhu, Zhifang Sui, Jiajun Zhang, Yang Liu, Xiaodong He, Minlie Huang, Jian Tang, Jie Tang

With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks has become a popular paradigm. Researchers have achieved various outcomes in the construction of BMs and the application of BMs in many fields. At present, there is a lack of research work that sorts out the overall progress of BMs and guides the follow-up research. In this paper, we cover not only the BM technologies themselves but also the prerequisites for BM training and applications with BMs, dividing the BM review into four parts: Resource, Models, Key Technologies and Application. We introduce 16 specific BM-related topics in those four parts: Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory & Interpretability, Commonsense Reasoning, Reliability & Security, Governance, Evaluation, Machine Translation, Text Generation, Dialogue and Protein Research. For each topic, we clearly summarize the current studies and propose some future research directions. At the end of this paper, we conclude with a more general view of the further development of BMs.

Graph convolutional neural network/graph convolutional network · Text classification · Graph convolutional network · Graph convolution · Graph

November 13, 2018

Graph Convolutional Networks for Text Classification

Liang Yao, Chengsheng Mao, Yuan Luo

from arXiv, Accepted by 33rd AAAI Conference on Artificial Intelligence (AAAI 2019)

Text classification is an important and classical problem in natural language processing. There have been a number of studies that applied convolutional neural networks (convolution on regular grid, e.g., sequence) to classification. However, only a limited number of studies have explored the more flexible graph convolutional neural networks (convolution on non-grid, e.g., arbitrary graph) for the task. In this work, we propose to use graph convolutional networks for text classification. We build a single text graph for a corpus based on word co-occurrence and document-word relations, then learn a Text Graph Convolutional Network (Text GCN) for the corpus. Our Text GCN is initialized with one-hot representations for words and documents; it then jointly learns the embeddings for both words and documents, as supervised by the known class labels for documents. Our experimental results on multiple benchmark datasets demonstrate that a vanilla Text GCN without any external word embeddings or knowledge outperforms state-of-the-art methods for text classification. On the other hand, Text GCN also learns predictive word and document embeddings. In addition, experimental results show that the improvement of Text GCN over state-of-the-art comparison methods becomes more prominent as we lower the percentage of training data, suggesting the robustness of Text GCN to less training data in text classification.
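A minimal sketch of the graph construction and one graph-convolution step described above, under simplifying assumptions: edge weights are raw counts (the abstract does not state the weighting scheme), the toy corpus and fixed weight matrix are invented for illustration, and the layer uses the standard symmetric normalization of the adjacency matrix with self-loops. All function names are hypothetical.

```python
import math

# Toy corpus (illustrative only): each document is a list of tokens.
docs = [["cheap", "flights", "deal"], ["flights", "delayed"], ["cheap", "deal"]]


def build_text_graph(documents):
    """Build one graph whose nodes are documents followed by vocabulary words.

    Assumed weighting: document-word edges count occurrences, word-word
    edges count the documents in which both words appear.
    """
    vocab = sorted({w for d in documents for w in d})
    index = {w: len(documents) + i for i, w in enumerate(vocab)}
    n = len(documents) + len(vocab)
    adj = [[0.0] * n for _ in range(n)]
    for d, words in enumerate(documents):
        for w in words:
            adj[d][index[w]] += 1.0
            adj[index[w]][d] += 1.0
        unique = sorted(set(words))
        for i, a in enumerate(unique):
            for b in unique[i + 1:]:
                adj[index[a]][index[b]] += 1.0
                adj[index[b]][index[a]] += 1.0
    return adj, vocab


def normalize(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    return [[inv_sqrt[i] * a_hat[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]


def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(len(q)))
             for j in range(len(q[0]))] for i in range(len(p))]


def gcn_layer(a_norm, features, weights):
    """One layer: ReLU(A_norm X W), written with plain lists."""
    hidden = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


adj, vocab = build_text_graph(docs)
n = len(adj)
# One-hot input features for every document and word node (identity matrix).
one_hot = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
# Arbitrary fixed weights standing in for learned parameters.
weights = [[0.1 * ((i + j) % 3 + 1) for j in range(2)] for i in range(n)]
embeddings = gcn_layer(normalize(adj), one_hot, weights)
```

Each row of `embeddings` is a 2-dimensional vector for one document or word node; in a trained model the weights would be learned from the document class labels rather than fixed by hand.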