Getting relevant information from search engines has been the heart of research works in information retrieval. Query expansion is a retrieval technique that has been studied and proved to yield positive results in relevance. Users are required to express their queries as a shortlist of words, sentences, or questions. With this short format, a huge amount of information is lost in the process of translating the information need from the actual query size since the user cannot convey all his thoughts in a few words. This mostly leads to poor query representation which contributes to undesired retrieval effectiveness. This loss of information has made the study of query expansion technique a strong area of study. This research work focuses on two methods of retrieval for both tweet-length queries and sentence-length queries. Two algorithms have been proposed and the implementation is expected to produce a better relevance retrieval model than most state-the-art relevance models.
Evaluating large summarization corpora using humans has proven to be expensive from both the organizational and the financial perspective. Therefore, many automatic evaluation metrics have been developed to measure the summarization quality in a fast and reproducible way. However, most of the metrics still rely on humans and need gold standard summaries generated by linguistic experts. Since BLANC does not require golden summaries and supposedly can use any underlying language model, we consider its application to the evaluation of summarization in German. This work demonstrates how to adjust the BLANC metric to a language other than English. We compare BLANC scores with the crowd and expert ratings, as well as with commonly used automatic metrics on a German summarization data set. Our results show that BLANC in German is especially good in evaluating informativeness.
Dense retrieval (DR) has the potential to resolve the query understanding challenge in conversational search by matching in the learned embedding space. However, this adaptation is challenging due to DR models' extra needs for supervision signals and the long-tail nature of conversational search. In this paper, we present a Conversational Dense Retrieval system, ConvDR, that learns contextualized embeddings for multi-turn conversational queries and retrieves documents solely using embedding dot products. In addition, we grant ConvDR few-shot ability using a teacher-student framework, where we employ an ad hoc dense retriever as the teacher, inherit its document encodings, and learn a student query encoder to mimic the teacher embeddings on oracle reformulated queries. Our experiments on TREC CAsT and OR-QuAC demonstrate ConvDR's effectiveness in both few-shot and fully-supervised settings. It outperforms previous systems that operate in the sparse word space, matches the retrieval accuracy of oracle query reformulations, and is also more efficient thanks to its simplicity. Our analyses reveal that the advantages of ConvDR come from its ability to capture informative context while ignoring the unrelated context in previous conversation rounds. This makes ConvDR more effective as conversations evolve while previous systems may get confused by the increased noise from previous turns. Our code is publicly available at //github.com/thunlp/ConvDR.
In any ranking system, the retrieval model outputs a single score for a document based on its belief on how relevant it is to a given search query. While retrieval models have continued to improve with the introduction of increasingly complex architectures, few works have investigated a retrieval model's belief in the score beyond the scope of a single value. We argue that capturing the model's uncertainty with respect to its own scoring of a document is a critical aspect of retrieval that allows for greater use of current models across new document distributions, collections, or even improving effectiveness for down-stream tasks. In this paper, we address this problem via an efficient Bayesian framework for retrieval models which captures the model's belief in the relevance score through a stochastic process while adding only negligible computational overhead. We evaluate this belief via a ranking based calibration metric showing that our approximate Bayesian framework significantly improves a retrieval model's ranking effectiveness through a risk aware reranking as well as its confidence calibration. Lastly, we demonstrate that this additional uncertainty information is actionable and reliable on down-stream tasks represented via cutoff prediction.
Recommender systems, which merely leverage user-item interactions for user preference prediction (such as the collaborative filtering-based ones), often face dramatic performance degradation when the interactions of users or items are insufficient. In recent years, various types of side information have been explored to alleviate this problem. Among them, knowledge graph (KG) has attracted extensive research interests as it can encode users/items and their associated attributes in the graph structure to preserve the relation information. In contrast, less attention has been paid to the item-item co-occurrence information (i.e., \textit{co-view}), which contains rich item-item similarity information. It provides information from a perspective different from the user/item-attribute graph and is also valuable for the CF recommendation models. In this work, we make an effort to study the potential of integrating both types of side information (i.e., KG and item-item co-occurrence data) for recommendation. To achieve the goal, we propose a unified graph-based recommendation model (UGRec), which integrates the traditional directed relations in KG and the undirected item-item co-occurrence relations simultaneously. In particular, for a directed relation, we transform the head and tail entities into the corresponding relation space to model their relation; and for an undirected co-occurrence relation, we project head and tail entities into a unique hyperplane in the entity space to minimize their distance. In addition, a head-tail relation-aware attentive mechanism is designed for fine-grained relation modeling. Extensive experiments have been conducted on several publicly accessible datasets to evaluate the proposed model. Results show that our model outperforms several previous state-of-the-art methods and demonstrate the effectiveness of our UGRec model.
Recently, the retrieval models based on dense representations have been gradually applied in the first stage of the document retrieval tasks, showing better performance than traditional sparse vector space models. To obtain high efficiency, the basic structure of these models is Bi-encoder in most cases. However, this simple structure may cause serious information loss during the encoding of documents since the queries are agnostic. To address this problem, we design a method to mimic the queries on each of the documents by an iterative clustering process and represent the documents by multiple pseudo queries (i.e., the cluster centroids). To boost the retrieval process using approximate nearest neighbor search library, we also optimize the matching function with a two-step score calculation procedure. Experimental results on several popular ranking and QA datasets show that our model can achieve state-of-the-art results.
Ranking has always been one of the top concerns in information retrieval researches. For decades, the lexical matching signal has dominated the ad-hoc retrieval process, but solely using this signal in retrieval may cause the vocabulary mismatch problem. In recent years, with the development of representation learning techniques, many researchers turn to Dense Retrieval (DR) models for better ranking performance. Although several existing DR models have already obtained promising results, their performance improvement heavily relies on the sampling of training examples. Many effective sampling strategies are not efficient enough for practical usage, and for most of them, there still lacks theoretical analysis in how and why performance improvement happens. To shed light on these research questions, we theoretically investigate different training strategies for DR models and try to explain why hard negative sampling performs better than random sampling. Through the analysis, we also find that there are many potential risks in static hard negative sampling, which is employed by many existing training methods. Therefore, we propose two training strategies named a Stable Training Algorithm for dense Retrieval (STAR) and a query-side training Algorithm for Directly Optimizing Ranking pErformance (ADORE), respectively. STAR improves the stability of DR training process by introducing random negatives. ADORE replaces the widely-adopted static hard negative sampling method with a dynamic one to directly optimize the ranking performance. Experimental results on two publicly available retrieval benchmark datasets show that either strategy gains significant improvements over existing competitive baselines and a combination of them leads to the best performance.
To retrieve more relevant, appropriate and useful documents given a query, finding clues about that query through the text is crucial. Recent deep learning models regard the task as a term-level matching problem, which seeks exact or similar query patterns in the document. However, we argue that they are inherently based on local interactions and do not generalise to ubiquitous, non-consecutive contextual relationships.In this work, we propose a novel relevance matching model based on graph neural networks to leverage the document-level word relationships for ad-hoc retrieval. In addition to the local interactions, we explicitly incorporate all contexts of a term through the graph-of-word text format. Matching patterns can be revealed accordingly to provide a more accurate relevance score. Our approach significantly outperforms strong baselines on two ad-hoc benchmarks. We also experimentally compare our model with BERT and show our ad-vantages on long documents.
In recent years a vast amount of visual content has been generated and shared from various fields, such as social media platforms, medical images, and robotics. This abundance of content creation and sharing has introduced new challenges. In particular, searching databases for similar content, i.e. content based image retrieval (CBIR), is a long-established research area, and more efficient and accurate methods are needed for real time retrieval. Artificial intelligence has made progress in CBIR and has significantly facilitated the process of intelligent search. In this survey we organize and review recent CBIR works that are developed based on deep learning algorithms and techniques, including insights and techniques from recent papers. We identify and present the commonly-used databases, benchmarks, and evaluation methods used in the field. We collect common challenges and propose promising future directions. More specifically, we focus on image retrieval with deep learning and organize the state of the art methods according to the types of deep network structure, deep features, feature enhancement methods, and network fine-tuning strategies. Our survey considers a wide variety of recent methods, aiming to promote a global view of the field of category-based CBIR.
Although the Transformer translation model (Vaswani et al., 2017) has achieved state-of-the-art performance in a variety of translation tasks, how to use document-level context to deal with discourse phenomena problematic for Transformer still remains a challenge. In this work, we extend the Transformer model with a new context encoder to represent document-level context, which is then incorporated into the original encoder and decoder. As large-scale document-level parallel corpora are usually not available, we introduce a two-step training method to take full advantage of abundant sentence-level parallel corpora and limited document-level parallel corpora. Experiments on the NIST Chinese-English datasets and the IWSLT French-English datasets show that our approach improves over Transformer significantly.
Knowledge graphs contain rich relational structures of the world, and thus complement data-driven machine learning in heterogeneous data. One of the most effective methods in representing knowledge graphs is to embed symbolic relations and entities into continuous spaces, where relations are approximately linear translation between projected images of entities in the relation space. However, state-of-the-art relation projection methods such as TransR, TransD or TransSparse do not model the correlation between relations, and thus are not scalable to complex knowledge graphs with thousands of relations, both in computational demand and in statistical robustness. To this end we introduce TransF, a novel translation-based method which mitigates the burden of relation projection by explicitly modeling the basis subspaces of projection matrices. As a result, TransF is far more light weight than the existing projection methods, and is robust when facing a high number of relations. Experimental results on the canonical link prediction task show that our proposed model outperforms competing rivals by a large margin and achieves state-of-the-art performance. Especially, TransF improves by 9%/5% in the head/tail entity prediction task for N-to-1/1-to-N relations over the best performing translation-based method.