
Visual Question Answering · MoDELS · Question Answering · Reducible · Extensibility

March 16, 2023

Logical Implications for Visual Question Answering Consistency

Sergio Tascon-Morales, Pablo Márquez-Neila, Raphael Sznitman

Despite considerable recent progress in Visual Question Answering (VQA) models, inconsistent or contradictory answers continue to cast doubt on their true reasoning capabilities. However, most proposed methods use indirect strategies or strong assumptions on pairs of questions and answers to enforce model consistency. Instead, we propose a novel strategy intended to improve model performance by directly reducing logical inconsistencies. To do this, we introduce a new consistency loss term that can be used by a wide range of VQA models and which relies on knowing the logical relation between pairs of questions and answers. While such information is typically not available in VQA datasets, we propose to infer these logical relations using a dedicated language model and use these in our proposed consistency loss function. We conduct extensive experiments on the VQA Introspect and DME datasets and show that our method brings improvements to state-of-the-art VQA models, while being robust across different architectures and settings.
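To make the idea of penalizing logical inconsistency concrete, here is a minimal sketch. It is not the loss proposed in the paper: it assumes each question-answer pair is summarized by the probability that the model considers it true, and it uses a simple product penalty chosen for illustration. The function names `implication_penalty` and `consistency_loss` are hypothetical.

```python
def implication_penalty(p_premise: float, p_consequence: float) -> float:
    """Soft violation of 'premise implies consequence'.

    An implication is violated only when the premise holds and the
    consequence does not, so the product below is large exactly then.
    This product form is an illustrative assumption, not the paper's loss.
    """
    for p in (p_premise, p_consequence):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_premise * (1.0 - p_consequence)


def consistency_loss(pairs):
    """Mean penalty over (p_premise, p_consequence) pairs; 0.0 if empty."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(implication_penalty(a, b) for a, b in pairs) / len(pairs)
```

In a training setup such a term would typically be added to the usual answer-prediction loss with a weighting factor, so that the model is discouraged from affirming a premise while denying something it implies.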

Related content

Visual Question Answering

Visual Question Answering (VQA) is a learning task that involves both computer vision and natural language processing. The task is defined as follows: "A VQA system takes as input an image and a free-form, open-ended, natural-language question about the image and produces a natural-language answer as the output" [1]. Put simply, VQA is question answering about a given image.


MoDELS · Collaborative Filtering · Similarity · Identifiable · on the fly

May 8, 2023

WSFE: Wasserstein Sub-graph Feature Encoder for Effective User Segmentation in Collaborative Filtering

Yankai Chen, Yifei Zhang, Menglin Yang, Zixing Song, Chen Ma, Irwin King

Maximizing the user-item engagement based on vectorized embeddings is a standard procedure of recent recommender models. Despite their superior performance for item recommendation, these methods implicitly deprioritize the modeling of user-wise similarity in the embedding space; consequently, they underperform at identifying similar users, and additional processing schemes are usually required. To avoid thorough model re-training, we propose WSFE, a model-agnostic and training-free representation encoder, to be flexibly employed on the fly for effective user segmentation. Underpinned by optimal transport theory, the encoded representations from WSFE present a matched user-wise similarity/distance measurement between the realistic and embedding space. We incorporate WSFE into six state-of-the-art recommender models and conduct extensive experiments on six real-world datasets. The empirical analyses demonstrate the superiority and generality of WSFE to fuel multiple downstream tasks with diverse underlying targets in recommendation.

Learning · Language Modeling · Rank · MoDELS · Input-Output Pairs

May 7, 2023

Unified Demonstration Retriever for In-Context Learning

Xiaonan Li, Kai Lv, Hang Yan, Tianyang Lin, Wei Zhu, Yuan Ni, Guotong Xie, Xiaoling Wang, Xipeng Qiu

from arXiv, to appear at ACL 2023

In-context learning is a new learning paradigm where a language model conditions on a few input-output pairs (demonstrations) and a test input, and directly outputs the prediction. It has been shown to be highly dependent on the provided demonstrations, which motivates research on demonstration retrieval: given a test input, relevant examples are retrieved from the training set to serve as informative demonstrations for in-context learning. While previous works focus on training task-specific retrievers for several tasks separately, these methods are often hard to transfer and scale across tasks, and separately trained retrievers incur a lot of parameter storage and deployment cost. In this paper, we propose Unified Demonstration Retriever (UDR), a single model to retrieve demonstrations for a wide range of tasks. To train UDR, we cast various tasks' training signals into a unified list-wise ranking formulation using a language model's feedback. Then we propose a multi-task list-wise ranking training framework, with an iterative mining strategy to find high-quality candidates, which can help UDR fully incorporate various tasks' signals. Experiments on 30+ tasks across 13 task families and multiple data domains show that UDR significantly outperforms baselines. Further analyses show the effectiveness of each proposed component and UDR's strong ability in various scenarios including different LMs (1.3B - 175B), unseen datasets, varying demonstration quantities, etc.

Visual Question Answering · INFORMS · Question Answering · Dataset · MoDELS

May 7, 2023

OpenViVQA: Task, Dataset, and Multimodal Fusion Models for Visual Question Answering in Vietnamese

Nghia Hieu Nguyen, Duong T. D. Vo, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

from arXiv, submitted to Elsevier

In recent years, visual question answering (VQA) has attracted attention from the research community because of its promising applications (such as virtual assistance on intelligent cars, assistant devices for blind people, or information retrieval from document images using natural language as queries) and the challenges it poses. The VQA task requires methods that have the ability to fuse the information from questions and images to produce appropriate answers. Neural visual question answering models have achieved tremendous growth on large-scale datasets which are mostly for resource-rich languages such as English. However, available datasets narrow the VQA task to answer selection or answer classification. We argue that this form of VQA is far from human ability and eliminates the challenge of the answering aspect in the VQA task by just selecting answers rather than generating them. In this paper, we introduce the OpenViVQA (Open-domain Vietnamese Visual Question Answering) dataset, the first large-scale dataset for VQA with open-ended answers in Vietnamese, which consists of 11,000+ images associated with 37,000+ question-answer pairs (QAs). Moreover, we propose FST, QuMLAG, and MLPAG, which fuse information from images and answers, then use these fused features to construct answers as humans iteratively. Our proposed methods achieve results that are competitive with SOTA models such as SAAA, MCAN, LORA, and M4C. The dataset is available to encourage the research community to develop more generalized algorithms including transformers for low-resource languages such as Vietnamese.

Prompt · Language Modeling · Performer · CoT · MoDELS

May 6, 2023

Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models

Lei Wang, Wanyu Xu, Yihuai Lan, Zhiqiang Hu, Yunshi Lan, Roy Ka-Wei Lee, Ee-Peng Lim

from arXiv, ACL 2023

Large language models (LLMs) have recently been shown to deliver impressive performance in various NLP tasks. To tackle multi-step reasoning tasks, few-shot chain-of-thought (CoT) prompting includes a few manually crafted step-by-step reasoning demonstrations which enable LLMs to explicitly generate reasoning steps and improve their reasoning task accuracy. To eliminate the manual effort, Zero-shot-CoT concatenates the target problem statement with "Let's think step by step" as an input prompt to LLMs. Despite the success of Zero-shot-CoT, it still suffers from three pitfalls: calculation errors, missing-step errors, and semantic misunderstanding errors. To address the missing-step errors, we propose Plan-and-Solve (PS) Prompting. It consists of two components: first, devising a plan to divide the entire task into smaller subtasks, and then carrying out the subtasks according to the plan. To address the calculation errors and improve the quality of generated reasoning steps, we extend PS prompting with more detailed instructions and derive PS+ prompting. We evaluate our proposed prompting strategy on ten datasets across three reasoning problems. The experimental results over GPT-3 show that our proposed zero-shot prompting consistently outperforms Zero-shot-CoT across all datasets by a large margin, is comparable to or exceeds Zero-shot-Program-of-Thought Prompting, and has comparable performance with 8-shot CoT prompting on the math reasoning problem. The code can be found at //github.com/AGI-Edgerunners/Plan-and-Solve-Prompting.
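The difference between the two zero-shot prompting styles in this abstract comes down to the trigger sentence appended to the problem. The sketch below illustrates that under stated assumptions: the `Q:`/`A:` layout and the helper name `make_prompt` are hypothetical, and the Plan-and-Solve wording is a paraphrase of the two components described above, not the exact prompt from the paper.

```python
# Trigger quoted in the abstract for Zero-shot-CoT.
ZERO_SHOT_COT = "Let's think step by step"

# Assumed wording: paraphrases the "devise a plan, then carry it out"
# components described in the abstract; not the paper's exact prompt.
PLAN_AND_SOLVE = (
    "Let's first devise a plan that divides the problem into smaller subtasks. "
    "Then let's carry out the subtasks according to the plan"
)


def make_prompt(problem: str, trigger: str) -> str:
    """Concatenate a problem statement with a zero-shot trigger sentence."""
    return f"Q: {problem.strip()}\nA: {trigger}"
```

Swapping `ZERO_SHOT_COT` for `PLAN_AND_SOLVE` in the call to `make_prompt` is the only change needed to move from one prompting style to the other; the model call itself is left out because it depends on the API in use.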

Graph · MoDELS · Question Answering · Learning · Representation Learning

May 5, 2023

Multi-View Graph Representation Learning for Answering Hybrid Numerical Reasoning Question

Yifan Wei, Fangyu Lei, Yuanzhe Zhang, Jun Zhao, Kang Liu

Hybrid question answering (HybridQA) over financial reports involves both textual and tabular data, and requires the model to select the appropriate evidence for the numerical reasoning task. Existing methods based on the encoder-decoder framework employ an expression tree-based decoder to solve numerical reasoning problems. However, encoders rely more on Machine Reading Comprehension (MRC) methods, which take table serialization and text splicing as input, damaging the granularity relationship between table and text as well as the spatial structure information of the table itself. To solve these problems, the paper proposes a Multi-View Graph (MVG) Encoder to take the relations among the granularity into account and capture the relations from multiple views. By utilizing MVGE as a module, we construct Tabular View, Relation View and Numerical View, which aim to retain the original characteristics of the hybrid data. We validate our model on the publicly available table-text hybrid QA benchmark (TAT-QA) and outperform the state-of-the-art model.

Visual Question Answering · Question Answering · Extensibility · DATE · Dataset

November 19, 2021

Medical Visual Question Answering: A Survey

Zhihong Lin, Donghao Zhang, Qingyi Tao, Danli Shi, Gholamreza Haffari, Qi Wu, Mingguang He, Zongyuan Ge

Medical Visual Question Answering (VQA) is a combination of medical artificial intelligence and popular VQA challenges. Given a medical image and a clinically relevant question in natural language, the medical VQA system is expected to predict a plausible and convincing answer. Although general-domain VQA has been extensively studied, medical VQA still needs specific investigation and exploration due to its task features. In the first part of this survey, we cover and discuss the medical VQA datasets publicly available to date in terms of data source, data quantity, and task features. In the second part, we review the approaches used in medical VQA tasks. In the last part, we analyze some medical-specific challenges for the field and discuss future research directions.

Estimator · Graph · Learned · Continuous Optimization · Directed Acyclic Graph

November 3, 2021

Multi-task Learning of Order-Consistent Causal Graphs

Xinshi Chen, Haoran Sun, Caleb Ellington, Eric Xing, Le Song

from arXiv, 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

We consider the problem of discovering $K$ related Gaussian directed acyclic graphs (DAGs), where the involved graph structures share a consistent causal order and sparse unions of supports. Under the multi-task learning setting, we propose an $l_1/l_2$-regularized maximum likelihood estimator (MLE) for learning $K$ linear structural equation models. We theoretically show that the joint estimator, by leveraging data across related tasks, can achieve a better sample complexity for recovering the causal order (or topological order) than separate estimations. Moreover, the joint estimator is able to recover non-identifiable DAGs, by estimating them together with some identifiable DAGs. Lastly, our analysis also shows the consistency of union support recovery of the structures. To allow practical implementation, we design a continuous optimization problem whose optimizer is the same as the joint estimator and can be approximated efficiently by an iterative algorithm. We validate the theoretical analysis and the effectiveness of the joint estimator in experiments.
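For readers unfamiliar with the $l_1/l_2$ penalty named in this abstract, a generic group-sparse objective of that kind is sketched below. The notation is an assumption made for illustration and is not taken from the paper: $B^{(k)}$ denotes the weight matrix of the $k$-th linear structural equation model, $\ell_k$ its negative log-likelihood, and $\lambda$ a regularization weight. The shared causal-order constraint described in the abstract is omitted here.

```latex
\min_{B^{(1)},\dots,B^{(K)}} \;
\sum_{k=1}^{K} \ell_k\bigl(B^{(k)}\bigr)
\;+\; \lambda \sum_{i \neq j}
\Bigl( \sum_{k=1}^{K} \bigl(B^{(k)}_{ij}\bigr)^{2} \Bigr)^{1/2}
```

The inner $l_2$ norm groups the same edge $(i, j)$ across all $K$ tasks, and the outer $l_1$ sum pushes whole groups to zero, which is the usual way such a penalty encourages a sparse union of supports.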

Language Modeling · Question Answering · MoDELS · Reducible · entity

September 22, 2021

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Fu Sun, Feng-Lin Li, Ruize Wang, Qianglong Chen, Xingyi Cheng, Ji Zhang

from arXiv, CIKM 2021

Knowledge-enhanced pre-trained language models (K-PLMs) have been shown to be effective for many public tasks in the literature, but few of them have been successfully applied in practice. To address this problem, we propose K-AID, a systematic approach that includes a low-cost knowledge acquisition process for acquiring domain knowledge, an effective knowledge infusion module for improving model performance, and a knowledge distillation component for reducing the model size and deploying K-PLMs on resource-restricted devices (e.g., CPU) for real-world application. Importantly, instead of capturing entity knowledge like the majority of existing K-PLMs, our approach captures relational knowledge, which contributes to improving sentence-level text classification and text matching tasks that play a key role in question answering (QA). We conducted a set of experiments on five text classification tasks and three text matching tasks from three domains, namely E-commerce, Government, and Film&TV, and performed online A/B tests in E-commerce. Experimental results show that our approach is able to achieve substantial improvement on sentence-level question answering tasks and bring beneficial business value in industrial settings.

Visual Question Answering · Dataset · Performer · state-of-the-art · MoDELS

March 20, 2018

VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions

Qing Li, Qingyi Tao, Shafiq Joty, Jianfei Cai, Jiebo Luo

Most existing works in visual question answering (VQA) are dedicated to improving the accuracy of predicted answers, while disregarding the explanations. We argue that the explanation for an answer is of the same or even more importance compared with the answer itself, since it makes the question and answering process more understandable and traceable. To this end, we propose a new task of VQA-E (VQA with Explanation), where the computational models are required to generate an explanation with the predicted answer. We first construct a new dataset, and then frame the VQA-E problem in a multi-task learning architecture. Our VQA-E dataset is automatically derived from the VQA v2 dataset by intelligently exploiting the available captions. We have conducted a user study to validate the quality of explanations synthesized by our method. We quantitatively show that the additional supervision from explanations can not only produce insightful textual sentences to justify the answers, but also improve the performance of answer prediction. Our model outperforms the state-of-the-art methods by a clear margin on the VQA v2 dataset.

Visual Question Answering · Question Answering · MoDELS · Identifiable · Attention Mechanism

February 15, 2018

Learning to Count Objects in Natural Images for Visual Question Answering

Yan Zhang, Jonathon Hare, Adam Prügel-Bennett

from arXiv, published in ICLR 2018

Visual Question Answering (VQA) models have struggled with counting objects in natural images so far. We identify a fundamental problem due to soft attention in these models as a cause. To circumvent this problem, we propose a neural network component that allows robust counting from object proposals. Experiments on a toy task show the effectiveness of this component and we obtain state-of-the-art accuracy on the number category of the VQA v2 dataset without negatively affecting other categories, even outperforming ensemble models with our single model. On a difficult balanced pair metric, the component gives a substantial improvement in counting over a strong baseline by 6.6%.


