"FlowQA: Grasping Flow in History for Conversational Machine Comprehension." Hsin-Yuan Huang, Eunsol Choi, Wen-tau Yih [ICLR] (2019)
Conversational machine comprehension requires a deep understanding of the conversation history. To let a traditional single-turn model encode that history comprehensively, the authors introduce the Flow mechanism, which incorporates the intermediate representations generated while answering previous questions via an alternating parallel processing structure. Compared with prior approaches that simply feed previous questions/answers in as input, Flow integrates the latent semantics of the dialogue history more deeply. It also outperforms the best models on all three domains of SCONE, improving accuracy by 2.6%.
GitHub repository: //github.com/momohuang/FlowQA
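The Flow idea above can be illustrated with a minimal numpy sketch: for each passage position, a recurrent unit runs along the *dialog-turn* axis, so the representation at turn t absorbs hidden states from earlier turns. All function names, the GRU parameterization, and the weight initialization here are assumptions for illustration, not the paper's exact architecture.

```python
import numpy as np

def gru_step(h, x, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step: update hidden state h given input x."""
    z = 1 / (1 + np.exp(-(x @ Wz + h @ Uz)))   # update gate
    r = 1 / (1 + np.exp(-(x @ Wr + h @ Ur)))   # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)   # candidate state
    return (1 - z) * h + z * h_tilde

def flow_layer(reps):
    """Flow sketch: recur over turns, in parallel over context positions.

    reps: (turns, ctx_len, d) -- per-turn intermediate representations
    of the same passage. Turn t's output absorbs hidden state from
    turns < t, so history flows forward without re-reading past Q/A.
    """
    turns, ctx_len, d = reps.shape
    rng = np.random.default_rng(0)                    # toy weights
    params = [rng.standard_normal((d, d)) * 0.1 for _ in range(6)]
    h = np.zeros((ctx_len, d))
    out = np.empty_like(reps)
    for t in range(turns):          # sequential along the turn axis
        h = gru_step(h, reps[t], *params)  # parallel over positions
        out[t] = h
    return out
```

Even with identical per-turn inputs, later turns produce different outputs because the hidden state accumulates history.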
Title: Neural Machine Reading Comprehension: Methods and Trends
Abstract: In recent years, with the emergence of deep learning, machine reading comprehension (MRC), which requires a machine to answer questions based on a given context, has attracted increasing attention. Although research on deep-learning-based MRC is flourishing, the field has lacked a comprehensive survey summarizing the proposed methods and recent trends. We therefore conduct a thorough review of recent work in this promising area. Specifically, we compare MRC tasks along different dimensions and present their general architecture. We further provide a taxonomy of the state-of-the-art methods used in popular models. Finally, we discuss several emerging trends and conclude by describing some open problems in the field.
Title: Introduction to Neural Network based Approaches for Question Answering over Knowledge Graphs
Abstract: Question answering has emerged as an intuitive way of querying structured data sources and has made significant progress over the past few years. In this article, we provide an overview of these recent advances, focusing on neural-network-based knowledge graph question answering systems. We introduce the reader to the challenges of the task and the current paradigms of approaches, discuss notable advancements, and outline emerging trends in the field. Through this article, we aim to provide newcomers with a suitable entry point into the field and to simplify the process of making informed decisions while building their own QA systems.
Title: Neural Reading Comprehension and Beyond
Abstract: Teaching machines to understand human language documents is one of the most elusive and long-standing challenges in artificial intelligence. This thesis tackles the problem of reading comprehension: how to build computer systems that read a passage of text and answer comprehension questions. On the one hand, we argue that reading comprehension is an important task for evaluating how well computer systems understand human language. On the other hand, if we can build high-performing reading comprehension systems, they would be key technologies for applications such as question answering and dialogue. This thesis focuses on neural reading comprehension: a family of reading comprehension models built on deep neural networks. Compared with traditional sparse, hand-designed feature-based models, these end-to-end neural models have proven far more effective at learning rich linguistic phenomena and have greatly improved performance on modern reading comprehension benchmarks. In the first part, we discuss the essence of neural reading comprehension and our efforts at building effective neural reading comprehension models, and, more importantly, at understanding what these models have actually learned and what depth of language understanding is needed to solve current tasks. We also summarize recent advances in the field and discuss future directions and open problems. In the second part, we explore how to build practical applications on top of the recent success of neural reading comprehension. In particular, we pioneer two new research directions: 1) how to combine information retrieval techniques with neural reading comprehension to tackle large-scale open-domain question answering; and 2) how to build conversational question answering systems from existing single-turn, span-based reading comprehension models. We implemented these ideas in the DrQA and CoQA projects and demonstrated the effectiveness of these approaches. We believe they hold great promise for future language technologies.
Download link: //stacks.stanford.edu/file/druid:gd576xb1833/thesis-augmented.pdf
This paper focuses on how to take advantage of external relational knowledge to improve machine reading comprehension (MRC) with multi-task learning. Most traditional methods in MRC assume that the knowledge needed to get the correct answer exists in the given documents. However, in real-world tasks, part of that knowledge may not be mentioned, and machines should be equipped with the ability to leverage external knowledge. In this paper, we integrate relational knowledge into an MRC model for commonsense reasoning. Specifically, building on a pre-trained language model (LM), we design two auxiliary relation-aware tasks that predict whether any commonsense relation exists between two words and, if so, what its type is, in order to better model the interactions between the document and the candidate answer options. We conduct experiments on two multi-choice benchmark datasets: the SemEval-2018 Task 11 and the Cloze Story Test. The experimental results demonstrate the effectiveness of the proposed method, which achieves superior performance compared with comparable baselines on both datasets.
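The multi-task setup described above amounts to summing a main answer-selection loss with two weighted relation-aware auxiliary losses. The sketch below makes that concrete; the weight values `alpha`/`beta` and the function names are assumptions, not the paper's reported hyperparameters.

```python
import numpy as np

def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for one example."""
    logits = logits - logits.max()
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[label]

def multitask_loss(answer_logits, answer_label,
                   rel_exist_logits, rel_exist_label,
                   rel_type_logits, rel_type_label,
                   alpha=0.5, beta=0.5):
    """Main answer loss plus two auxiliary relation-aware losses:
    (1) does a commonsense relation exist between a word pair, and
    (2) which relation type it is. alpha/beta are illustrative weights."""
    main = cross_entropy(answer_logits, answer_label)
    aux_exist = cross_entropy(rel_exist_logits, rel_exist_label)
    aux_type = cross_entropy(rel_type_logits, rel_type_label)
    return main + alpha * aux_exist + beta * aux_type
```

All three heads would share the same LM encoder in practice, so the auxiliary gradients shape the shared representation.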
Conversational question answering (CQA) is a novel QA task that requires understanding of dialogue context. Different from traditional single-turn machine reading comprehension (MRC) tasks, CQA includes passage comprehension, coreference resolution, and contextual understanding. In this paper, we propose an innovative contextualized attention-based deep neural network, SDNet, to fuse context into traditional MRC models. Our model leverages both inter-attention and self-attention to comprehend the conversation context and extract relevant information from the passage. Furthermore, we demonstrate a novel method to integrate the latest BERT contextual model. Empirical results show the effectiveness of our model, which sets a new state-of-the-art result on the CoQA leaderboard, outperforming the previous best model by 1.6% F1. Our ensemble model further improves the result by 2.7% F1.
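The two attention flavors SDNet combines can be sketched in a few lines of numpy: inter-attention lets each passage word summarize the question, and self-attention lets passage positions exchange information with each other. This is a generic dot-product-attention sketch under assumed shapes, not SDNet's exact layers.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def inter_attention(passage, question):
    """Attend each passage word over the question words, then fuse the
    attended question summary into the passage representation."""
    scores = passage @ question.T                  # (p_len, q_len)
    attended = softmax(scores, axis=-1) @ question # (p_len, d)
    return np.concatenate([passage, attended], axis=-1)

def self_attention(x):
    """Each position attends over all *other* positions of the same
    sequence (diagonal masked so a word ignores itself)."""
    scores = x @ x.T
    np.fill_diagonal(scores, -1e9)
    return softmax(scores, axis=-1) @ x
```

Stacking inter-attention (passage vs. question-with-history) and self-attention is what lets context from earlier turns reach each passage position.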
This paper describes a novel hierarchical attention network for reading comprehension style question answering, which aims to answer questions about a given narrative paragraph. In the proposed method, attention and fusion are conducted horizontally and vertically across layers at different levels of granularity between the question and the paragraph. Specifically, it first encodes the question and paragraph with fine-grained language embeddings, to better capture their respective representations at the semantic level. Then it proposes a multi-granularity fusion approach to fully fuse information from both global and attended representations. Finally, it introduces a hierarchical attention network to focus on the answer span progressively with multi-level soft alignment. Extensive experiments on the large-scale SQuAD and TriviaQA datasets validate the effectiveness of the proposed method. At the time of writing the paper (Jan. 12th, 2018), our model held the first position on the SQuAD leaderboard for both single and ensemble models. We also achieve state-of-the-art results on the TriviaQA, AddSent, and AddOneSent datasets.
Machine reading comprehension with unanswerable questions aims to abstain from answering when no answer can be inferred. In addition to extracting answers, previous works usually predict an additional "no-answer" probability to detect unanswerable cases. However, they fail to validate the answerability of the question by verifying the legitimacy of the predicted answer. To address this problem, we propose a novel read-then-verify system, which not only utilizes a neural reader to extract candidate answers and produce no-answer probabilities, but also leverages an answer verifier to decide whether the predicted answer is entailed by the input snippets. Moreover, we introduce two auxiliary losses to help the reader better handle answer extraction as well as no-answer detection, and investigate three different architectures for the answer verifier. Our experiments on the SQuAD 2.0 dataset show that our system achieves a score of 74.2 F1 on the test set, achieving state-of-the-art results at the time of submission (Aug. 28th, 2018).
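The read-then-verify decision can be sketched as combining two signals: the reader's answerability probability and the verifier's entailment probability for the candidate answer. The mean combination and the 0.5 threshold below are illustrative assumptions, not the paper's exact scoring formula.

```python
import math

def read_then_verify(answer_score, no_answer_score,
                     verifier_entail_prob, threshold=0.5):
    """Combine the reader's no-answer signal with the verifier's
    judgment of whether the candidate answer is entailed by the
    passage. Returns the decision and the combined score."""
    # reader's answerability: softmax over (answer, no-answer) scores
    p_ans = math.exp(answer_score) / (
        math.exp(answer_score) + math.exp(no_answer_score))
    # equal-weight mixture of reader and verifier -- an assumption
    final = 0.5 * p_ans + 0.5 * verifier_entail_prob
    decision = "answer" if final >= threshold else "no-answer"
    return decision, final
```

The point of the second stage is that a confident reader can still be overruled when the verifier finds the extracted span is not actually supported by the passage.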
Humans gather information by engaging in conversations involving a series of interconnected questions and answers. For machines to assist in information gathering, it is therefore essential to enable them to answer conversational questions. We introduce CoQA, a novel dataset for building Conversational Question Answering systems. Our dataset contains 127k questions with answers, obtained from 8k conversations about text passages from seven diverse domains. The questions are conversational, and the answers are free-form text with their corresponding evidence highlighted in the passage. We analyze CoQA in depth and show that conversational questions have challenging phenomena not present in existing reading comprehension datasets, e.g., coreference and pragmatic reasoning. We evaluate strong conversational and reading comprehension models on CoQA. The best system obtains an F1 score of 65.1%, which is 23.7 points behind human performance (88.8%), indicating there is ample room for improvement. We launch CoQA as a challenge to the community at //stanfordnlp.github.io/coqa/
We present QuAC, a dataset for Question Answering in Context that contains 14K information-seeking QA dialogs (100K questions in total). The interactions involve two crowd workers: (1) a student who poses a sequence of freeform questions to learn as much as possible about a hidden Wikipedia text, and (2) a teacher who answers the questions by providing short excerpts from the text. QuAC introduces challenges not found in existing machine comprehension datasets: its questions are often more open-ended, unanswerable, or only meaningful within the dialog context, as we show in a detailed qualitative evaluation. We also report results for a number of reference models, including a recent state-of-the-art reading comprehension architecture extended to model dialog context. Our best model underperforms humans by 20 F1, suggesting that there is significant room for future work on this data. Dataset, baseline, and leaderboard are available at quac.ai.
In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects. First, a reattention mechanism is proposed to refine current attentions by directly accessing past attentions that are temporally memorized in a multi-round alignment architecture, so as to avoid the problems of attention redundancy and attention deficiency. Second, a new optimization approach, called dynamic-critical reinforcement learning, is introduced to extend the standard supervised method. It always encourages the model to predict a more acceptable answer, so as to address the convergence suppression problem that occurs in traditional reinforcement learning algorithms. Extensive experiments on the Stanford Question Answering Dataset (SQuAD) show that our model achieves state-of-the-art results. Meanwhile, our model outperforms previous systems by over 6% in terms of both Exact Match and F1 metrics on two adversarial SQuAD datasets.
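The reattention idea above can be sketched minimally: the current round's raw alignment scores are refined using the attention map memorized from the previous round before normalization. The additive combination with weight `gamma` is an illustrative assumption; the paper's exact refinement differs.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def reattend(q, p, past_attention, gamma=0.5):
    """Refine the current question-passage attention using the
    temporally memorized attention from the previous alignment round.

    q: (q_len, d) question vectors; p: (p_len, d) passage vectors;
    past_attention: (q_len, p_len) attention map from the last round.
    """
    current = q @ p.T                                 # raw scores
    # fold in the past round's attention -- assumed additive form
    refined = softmax(current + gamma * past_attention, axis=-1)
    return refined
```

Feeding each round's attention into the next is what lets a multi-round aligner avoid re-attending redundantly to already-covered positions.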