
INTERACT · Agent · Language modeling · Virtual reality (VR) · Round

September 29, 2023

Voice2Action: Language Models as Agent for Efficient Real-Time Interaction in Virtual Reality

Large Language Models (LLMs) are trained and aligned to follow natural language instructions with only a handful of examples, and they are prompted as task-driven autonomous agents to adapt to various sources of execution environments. However, deploying agent LLMs in virtual reality (VR) has been challenging due to the lack of efficiency in online interactions and the complex manipulation categories in 3D environments. In this work, we propose Voice2Action, a framework that hierarchically analyzes customized voice signals and textual commands through action and entity extraction and divides the execution tasks into canonical interaction subsets in real-time with error prevention from environment feedback. Experiment results in an urban engineering VR environment with synthetic instruction data show that Voice2Action can perform more efficiently and accurately than approaches without optimizations.

Related content

INTERACT

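The IFIP TC13 Conference on Human-Computer Interaction is an important venue where researchers and practitioners in human-computer interaction present their work. Over the years, these conferences have attracted researchers from several countries and cultures. Official website:

Probabilistic graphical models · Cognition · MoDELS · Seven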

November 16, 2023

MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration

Lin Xu,Zhiyuan Hu,Daquan Zhou,Hongyu Ren,Zhen Dong,Kurt Keutzer,See Kiong Ng,Jiashi Feng

from arxiv, work in progress

Large Language Models (LLMs) have marked a significant advancement in the field of natural language processing, demonstrating exceptional capabilities in reasoning, tool usage, and memory. As their applications extend into multi-agent environments, a need has arisen for a comprehensive evaluation framework that captures their abilities in reasoning, planning, collaboration, and more. This work introduces a novel benchmarking framework specifically tailored to assess LLMs within multi-agent settings, providing quantitative metrics to evaluate their judgment, reasoning, deception, self-awareness, cooperation, coordination, and rationality. We utilize games such as Chameleon and Undercover, alongside game theory scenarios like Cost Sharing, Multi-player Prisoner's Dilemma, and Public Good, to create diverse testing environments. Our framework is fortified with the Probabilistic Graphical Modeling (PGM) method, enhancing the LLMs' capabilities in navigating complex social and cognitive dimensions. The benchmark evaluates seven multi-agent systems powered by different LLMs, quantitatively highlighting a capability gap of more than threefold between the strongest, GPT-4, and the weakest, Llama-2-70B. It also confirms that our PGM enhancement boosts the inherent abilities of all selected models by 50% on average. Our code is released at //github.com/cathyxl/MAgIC.

Learning · Generalization theory · Performer · Branch · Understandability

November 16, 2023

ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning

Junguang Jiang,Baixu Chen,Junwei Pan,Ximei Wang,Liu Dapeng,Jie Jiang,Mingsheng Long

from arxiv, Accepted by NeurIPS 2023

Auxiliary-Task Learning (ATL) aims to improve the performance of the target task by leveraging the knowledge obtained from related tasks. Occasionally, learning multiple tasks simultaneously results in lower accuracy than learning only the target task, which is known as negative transfer. This problem is often attributed to the gradient conflicts among tasks, and is frequently tackled by coordinating the task gradients in previous works. However, these optimization-based methods largely overlook the auxiliary-target generalization capability. To better understand the root cause of negative transfer, we experimentally investigate it from both optimization and generalization perspectives. Based on our findings, we introduce ForkMerge, a novel approach that periodically forks the model into multiple branches, automatically searches the varying task weights by minimizing target validation errors, and dynamically merges all branches to filter out detrimental task-parameter updates. On a series of auxiliary-task learning benchmarks, ForkMerge outperforms existing methods and effectively mitigates negative transfer.
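
The fork-then-merge idea in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes only two branches (a hypothetical target-only branch and a jointly trained branch), a one-parameter "model", and a plain grid search over a single merge weight. The names `merge`, `search_merge_weight`, and `toy_val_error` are made up for this illustration.

```python
from typing import Callable, List, Sequence, Tuple


def merge(theta_a: Sequence[float], theta_b: Sequence[float], lam: float) -> List[float]:
    """Linearly interpolate two parameter vectors: (1 - lam) * a + lam * b."""
    return [(1.0 - lam) * a + lam * b for a, b in zip(theta_a, theta_b)]


def search_merge_weight(
    theta_a: Sequence[float],
    theta_b: Sequence[float],
    val_error: Callable[[List[float]], float],
    candidates: Sequence[float],
) -> Tuple[float, List[float]]:
    """Pick the merge weight with the lowest target validation error."""
    best_lam = candidates[0]
    best_err = val_error(merge(theta_a, theta_b, best_lam))
    for lam in candidates[1:]:
        err = val_error(merge(theta_a, theta_b, lam))
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam, merge(theta_a, theta_b, best_lam)


# Toy data (assumed): two branches that ended up on either side of the optimum.
target_only = [0.5]   # branch trained on the target task alone
joint = [1.5]         # branch trained on target + auxiliary tasks


def toy_val_error(theta: List[float]) -> float:
    """Stand-in for the target validation error; its minimum is at 1.0."""
    return (theta[0] - 1.0) ** 2


best_lam, merged = search_merge_weight(
    target_only, joint, toy_val_error, [0.0, 0.25, 0.5, 0.75, 1.0]
)
print(best_lam, merged)
```

Because the candidate list includes a weight of 0.0, the merged parameters in this sketch can never score worse on the validation set than the target-only branch, which is one simple way to picture how detrimental updates get filtered out.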

Integration · Diversity · Language modeling · Understandability · Correlation coefficient

November 15, 2023

Fusion-Eval: Integrating Evaluators with LLMs

Lei Shu,Nevan Wichers,Liangchen Luo,Yun Zhu,Yinxiao Liu,Jindong Chen,Lei Meng

Evaluating Large Language Models (LLMs) is a complex task, especially considering the intricacies of natural language understanding and the expectations for high-level reasoning. Traditional evaluations typically lean on human-based, model-based, or automatic-metrics-based paradigms, each with its own advantages and shortcomings. We introduce "Fusion-Eval", a system that employs LLMs not solely for direct evaluations, but to skillfully integrate insights from diverse evaluators. This gives Fusion-Eval flexibility, enabling it to work effectively across diverse tasks and make optimal use of multiple references. In testing on the SummEval dataset, Fusion-Eval achieved a Spearman correlation of 0.96, outperforming other evaluators. The success of Fusion-Eval underscores the potential of LLMs to produce evaluations that closely align with human perspectives, setting a new standard in the field of LLM evaluation.
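
For readers unfamiliar with the metric quoted above, the following sketch shows how a Spearman correlation between human scores and an automatic evaluator's scores can be computed: rank both lists, then take the Pearson correlation of the ranks. The score lists are invented for illustration and have nothing to do with SummEval or with the Fusion-Eval implementation.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented scores for five summaries (assumption, for illustration only).
human_scores = [1, 2, 3, 4, 5]
evaluator_scores = [0.1, 0.4, 0.35, 0.8, 0.9]
print(round(spearman(human_scores, evaluator_scores), 3))
```

A value near 1 means the evaluator orders the items almost exactly as the human raters do, regardless of the scale each side uses.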

MoDELS · Automator · Language modeling · Performer · HTTPS

November 15, 2023

Safer-Instruct: Aligning Language Models with Automated Preference Data

Taiwei Shi,Kai Chen,Jieyu Zhao

from arxiv, 11 pages

Reinforcement Learning from Human Feedback (RLHF) is a vital strategy for enhancing model safety in language models. However, annotating preference data for RLHF is a resource-intensive and creativity-demanding process, while automatic generation methods face limitations in data diversity and quality. In response, we present Safer-Instruct, a novel pipeline for semi-automatically constructing large-scale preference datasets. Our approach leverages reversed instruction tuning, instruction induction, and expert model evaluation to efficiently generate high-quality preference data without human annotators. We evaluate Safer-Instruct using LLaMA for instruction induction and GPT-4 as an expert model, generating approximately 10K preference samples. Finetuning an Alpaca model on this dataset demonstrates improved harmlessness while maintaining competitive performance on conversation and downstream tasks. Safer-Instruct addresses the challenges in preference data acquisition, advancing the development of safer and more responsible AI systems. Our code and data are available at //github.com/uscnlp-lime/safer-instruct

Understandability · Question answering · Dataset · Graph attention network · Processing (programming language)

November 15, 2023

XplainLLM: A QA Explanation Dataset for Understanding LLM Decision-Making

Zichen Chen,Jianda Chen,Mitali Gaidhani,Ambuj Singh,Misha Sra

from arxiv, 17 pages, 6 figures, 7 tables. Our dataset is available at: //github.com/chen-zichen/XplainLLM_dataset.git

Large Language Models (LLMs) have recently made impressive strides in natural language understanding tasks. Despite their remarkable performance, understanding their decision-making process remains a big challenge. In this paper, we look into bringing some transparency to this process by introducing a new explanation dataset for question answering (QA) tasks that integrates knowledge graphs (KGs) in a novel way. Our dataset includes 12,102 question-answer-explanation (QAE) triples. Each explanation in the dataset links the LLM's reasoning to entities and relations in the KGs. The explanation component includes a why-choose explanation, a why-not-choose explanation, and a set of reason-elements that underlie the LLM's decision. We leverage KGs and graph attention networks (GAT) to find the reason-elements and transform them into why-choose and why-not-choose explanations that are comprehensible to humans. Through quantitative and qualitative evaluations, we demonstrate the potential of our dataset to improve the in-context learning of LLMs, and enhance their interpretability and explainability. Our work contributes to the field of explainable AI by enabling a deeper understanding of the LLM's decision-making process, making LLMs more transparent and thereby potentially more reliable to researchers and practitioners alike. Our dataset is available at: //github.com/chen-zichen/XplainLLM_dataset.git

Language modeling · MoDELS · Extensibility · state-of-the-art · Performance

November 14, 2023

LLatrieval: LLM-Verified Retrieval for Verifiable Generation

Xiaonan Li,Changtai Zhu,Linyang Li,Zhangyue Yin,Tianxiang Sun,Xipeng Qiu

Verifiable generation aims to let the large language model (LLM) generate text with corresponding supporting documents, which enables the user to flexibly verify the answer and makes it more trustworthy. Its evaluation measures not only the correctness of the answer but also the answer's verifiability, i.e., how well the answer is supported by the corresponding documents. Typically, verifiable generation adopts the retrieval-read pipeline, which is divided into two stages: 1) retrieve documents relevant to the question; 2) generate the corresponding answer according to the documents. Since the retrieved documents can supplement knowledge for the LLM to generate the answer and serve as evidence, the retrieval stage is essential for the correctness and verifiability of the answer. However, the widely used retrievers become the bottleneck of the entire pipeline and limit the overall performance. They often have fewer parameters than the large language model and have not been proven to scale well to the size of LLMs. Since the LLM passively receives the retrieval result, if the retriever does not correctly find the supporting documents, the LLM cannot generate the correct and verifiable answer, which overshadows the LLM's remarkable abilities. In this paper, we propose LLatrieval (Large Language Model Verified Retrieval), where the LLM updates the retrieval result until it verifies that the retrieved documents can support answering the question. Thus, the LLM can iteratively provide feedback to retrieval and facilitate the retrieval result to sufficiently support verifiable generation. Experimental results show that our method significantly outperforms extensive baselines and achieves new state-of-the-art results.
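
The "update the retrieval result until it is verified" loop described in the abstract can be sketched as below. This is a minimal illustration, not the LLatrieval implementation: `retrieve`, `verify`, and `update_query` are hypothetical callables, and the toy versions use word overlap on a three-sentence corpus as stand-ins for a real retriever and for the LLM calls.

```python
from typing import Callable, List, Set, Tuple


def verified_retrieval(
    question: str,
    retrieve: Callable[[str], List[str]],
    verify: Callable[[str, List[str]], bool],
    update_query: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> Tuple[List[str], bool]:
    """Retrieve, check the documents, and rewrite the query until verified."""
    query = question
    docs: List[str] = []
    for _ in range(max_rounds):
        docs = retrieve(query)
        if verify(question, docs):
            return docs, True
        query = update_query(question, docs)
    return docs, False


# ---- Toy stand-ins (assumptions, for illustration only) ----
CORPUS = [
    "Paris is the capital of France.",
    "The Seine flows through Paris.",
    "France is in Western Europe.",
]


def tokens(text: str) -> Set[str]:
    return set(text.lower().replace(".", "").replace("?", "").split())


def toy_retrieve(query: str) -> List[str]:
    """Return the single document sharing the most words with the query."""
    return [max(CORPUS, key=lambda doc: len(tokens(query) & tokens(doc)))]


def toy_verify(question: str, docs: List[str]) -> bool:
    """Stand-in for an LLM judgement: do the documents mention what flows?"""
    return any("flows" in tokens(doc) for doc in docs)


def toy_update_query(question: str, docs: List[str]) -> str:
    """Keep question words not yet covered and add new words from the documents."""
    covered: Set[str] = set()
    for doc in docs:
        covered |= tokens(doc)
    missing = tokens(question) - covered
    novel = covered - tokens(question)
    return " ".join(sorted(missing | novel))


docs, ok = verified_retrieval(
    "Which river flows through the capital of France?",
    toy_retrieve, toy_verify, toy_update_query,
)
print(ok, docs)
```

In the toy run, the first round retrieves the sentence about the capital, verification fails, and the rewritten query then surfaces the sentence about the Seine. The `max_rounds` cap is a design choice of this sketch so the loop always terminates.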

Language modeling · Question answering · MoDELS · Reducible · entity

September 22, 2021

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Fu Sun,Feng-Lin Li,Ruize Wang,Qianglong Chen,Xingyi Cheng,Ji Zhang

from arxiv, CIKM 2021

Knowledge-enhanced pre-trained language models (K-PLMs) are shown to be effective for many public tasks in the literature, but few of them have been successfully applied in practice. To address this problem, we propose K-AID, a systematic approach that includes a low-cost knowledge acquisition process for acquiring domain knowledge, an effective knowledge infusion module for improving model performance, and a knowledge distillation component for reducing the model size and deploying K-PLMs on resource-restricted devices (e.g., CPU) for real-world application. Importantly, instead of capturing entity knowledge like the majority of existing K-PLMs, our approach captures relational knowledge, which helps improve the sentence-level text classification and text matching tasks that play a key role in question answering (QA). We conducted a set of experiments on five text classification tasks and three text matching tasks from three domains, namely E-commerce, Government, and Film&TV, and performed online A/B tests in E-commerce. Experimental results show that our approach is able to achieve substantial improvement on sentence-level question answering tasks and bring beneficial business value in industrial settings.

Loss function (machine learning) · Learning to learn · Learned · entity · Functional

September 9, 2019

Learning to Learn and Predict: A Meta-Learning Approach for Multi-Label Classification

Jiawei Wu,Wenhan Xiong,William Yang Wang

from arxiv, 11 pages, 5 figures, accepted to EMNLP 2019

Many tasks in natural language processing can be viewed as multi-label classification problems. However, most of the existing models are trained with the standard cross-entropy loss function and use a fixed prediction policy (e.g., a threshold of 0.5) for all the labels, which completely ignores the complexity and dependencies among different labels. In this paper, we propose a meta-learning method to capture these complex label dependencies. More specifically, our method utilizes a meta-learner to jointly learn the training policies and prediction policies for different labels. The training policies are then used to train the classifier with the cross-entropy loss function, and the prediction policies are further implemented for prediction. Experimental results on fine-grained entity typing and text classification demonstrate that our proposed method can obtain more accurate multi-label classification results.
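
To make the contrast between a fixed prediction policy and label-specific ones concrete, here is a small sketch. The per-label thresholds below are set by hand purely for illustration; in the paper they would come from the meta-learner, which is not modeled here. The function name `predict_labels` and the probabilities are assumptions.

```python
from typing import Dict, List


def predict_labels(probs: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Keep every label whose probability reaches that label's threshold.

    Labels without an entry in `thresholds` fall back to 0.5.
    """
    return [label for label, p in probs.items() if p >= thresholds.get(label, 0.5)]


# Invented classifier outputs for one entity mention (assumption).
probs = {"person": 0.92, "artist": 0.41, "politician": 0.07}

# Fixed policy: the same 0.5 threshold for every label.
fixed = predict_labels(probs, {})

# Label-specific policy: a lower bar for the rarer, fine-grained label.
per_label = predict_labels(probs, {"person": 0.5, "artist": 0.3, "politician": 0.5})

print(fixed)
print(per_label)
```

With the fixed threshold, the fine-grained label "artist" is dropped even though it is fairly likely given "person"; a label-specific threshold is one simple way such dependencies can be reflected at prediction time.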

SOFT · Hard attention · Attention mechanism · Performer · MoDELS

January 31, 2018

Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling

Tao Shen,Tianyi Zhou,Guodong Long,Jing Jiang,Sen Wang,Chengqi Zhang

from arxiv, 12 pages, 3 figures

Many natural language processing tasks solely rely on sparse dependencies between a few tokens in a sentence. Soft attention mechanisms show promising performance in modeling local/global dependencies by soft probabilities between every two tokens, but they are not effective and efficient when applied to long sentences. By contrast, hard attention mechanisms directly select a subset of tokens but are difficult and inefficient to train due to their combinatorial nature. In this paper, we integrate both soft and hard attention into one context fusion model, "reinforced self-attention (ReSA)", for the mutual benefit of each other. In ReSA, a hard attention trims a sequence for a soft self-attention to process, while the soft attention feeds reward signals back to facilitate the training of the hard one. For this purpose, we develop a novel hard attention called "reinforced sequence sampling (RSS)", selecting tokens in parallel and trained via policy gradient. Using two RSS modules, ReSA efficiently extracts the sparse dependencies between each pair of selected tokens. We finally propose an RNN/CNN-free sentence-encoding model, "reinforced self-attention network (ReSAN)", solely based on ReSA. It achieves state-of-the-art performance on both Stanford Natural Language Inference (SNLI) and Sentences Involving Compositional Knowledge (SICK) datasets.
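
The division of labor described above, where a hard selection trims the sequence and soft attention then distributes weight over what remains, can be sketched as a masked softmax. This is only an illustration: the keep/drop mask is hand-set here, whereas in the paper it is sampled by the RSS module and trained with policy gradient, and `masked_softmax` is a name chosen for this sketch.

```python
import math
from typing import List, Sequence


def masked_softmax(scores: Sequence[float], keep: Sequence[bool]) -> List[float]:
    """Softmax over the kept positions only; dropped positions get weight 0."""
    kept_scores = [s for s, k in zip(scores, keep) if k]
    if not kept_scores:
        raise ValueError("at least one token must be kept")
    # Subtract the maximum kept score for numerical stability.
    top = max(kept_scores)
    exps = [math.exp(s - top) if k else 0.0 for s, k in zip(scores, keep)]
    total = sum(exps)
    return [e / total for e in exps]


# Invented alignment scores of one query token against four tokens (assumption).
scores = [2.0, 1.0, 0.0, 1.0]

# Hard attention decision: keep tokens 0 and 3, drop tokens 1 and 2.
keep = [True, False, False, True]

weights = masked_softmax(scores, keep)
print([round(w, 3) for w in weights])
```

Only the kept tokens take part in the normalization, so the soft attention's cost and its probability mass are concentrated on the sparse subset chosen by the hard step.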

MoDELS · Annotation · Tokenizer · INFORMS · state-of-the-art

December 5, 2017

Deep Semantic Role Labeling with Self-Attention

Zhixing Tan,Mingxuan Wang,Jun Xie,Yidong Chen,Xiaodong Shi

from arxiv, Accepted by AAAI-2018

Semantic Role Labeling (SRL) is believed to be a crucial step towards natural language understanding and has been widely studied. In recent years, end-to-end SRL with recurrent neural networks (RNN) has gained increasing attention. However, it remains a major challenge for RNNs to handle structural information and long-range dependencies. In this paper, we present a simple and effective architecture for SRL which aims to address these problems. Our model is based on self-attention, which can directly capture the relationships between two tokens regardless of their distance. Our single model achieves $F_1 = 83.4$ on the CoNLL-2005 shared task dataset and $F_1 = 82.7$ on the CoNLL-2012 shared task dataset, which outperforms the previous state-of-the-art results by $1.8$ and $1.0$ $F_1$ points, respectively. Besides, our model is computationally efficient, and the parsing speed is 50K tokens per second on a single Titan X GPU.
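
As a reminder of what the $F_1$ figures measure, the sketch below computes precision, recall, and $F_1$ over labeled argument spans. It is a simplified illustration and not the official CoNLL scorer: spans are assumed to be `(start, end, role)` tuples that count as correct only on an exact match, and the example spans are invented.

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start token, end token, role label)


def f1_score(gold: Set[Span], predicted: Set[Span]) -> float:
    """Harmonic mean of precision and recall over exactly matching spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented annotations for one predicate (assumption).
gold = {(0, 1, "A0"), (3, 5, "A1"), (6, 7, "AM-TMP")}
predicted = {(0, 1, "A0"), (3, 5, "A1"), (6, 6, "AM-TMP")}

print(round(f1_score(gold, predicted), 3))
```

Here two of the three predicted spans match the gold annotation exactly, so precision and recall are both 2/3; the third span has the right role but the wrong boundary and therefore earns no credit under exact matching.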


