Financial forecasting has been an important and active area of machine learning research, as even the most modest advantage in predictive accuracy can be parlayed into significant financial gains. Recent advances in natural language processing (NLP) bring the opportunity to leverage textual data, such as earnings reports of publicly traded companies, to predict the return rate for an asset. However, when dealing with such a sensitive task, the consistency of models -- their invariance under meaning-preserving alternations in input -- is a crucial property for building user trust. Despite this, current financial forecasting methods do not consider consistency. To address this problem, we propose FinTrust, an evaluation tool that assesses logical consistency in financial text. Using FinTrust, we show that the consistency of state-of-the-art NLP models for financial forecasting is poor. Our analysis of the performance degradation caused by meaning-preserving alternations suggests that current text-based methods are not suitable for robustly predicting market information. All resources are available on GitHub.
Social world knowledge is a key ingredient in effective communication and information processing by humans and machines alike. As of today, there exist many knowledge bases that represent factual world knowledge. Yet, there is no resource that is designed to capture social aspects of world knowledge. We believe that this work makes an important step towards the formulation and construction of such a resource. We introduce SocialVec, a general framework for eliciting low-dimensional entity embeddings from the social contexts in which they occur in social networks. In this framework, entities correspond to highly popular accounts which invoke general interest. We assume that entities that individual users tend to co-follow are socially related, and use this definition of social context to learn the entity embeddings. Similar to word embeddings which facilitate tasks that involve text semantics, we expect the learned social entity embeddings to benefit multiple tasks of social flavor. In this work, we elicited the social embeddings of roughly 200K entities from a sample of 1.3M Twitter users and the accounts that they follow. We employ and gauge the resulting embeddings on two tasks of social importance. First, we assess the political bias of news sources in terms of entity similarity in the social embedding space. Second, we predict the personal traits of individual Twitter users based on the social embeddings of entities that they follow. In both cases, we show advantageous or competitive performance using our approach compared with task-specific baselines. We further show that existing entity embedding schemes, which are fact-based, fail to capture social aspects of knowledge. We make the learned social entity embeddings available to the research community to support further exploration of social world knowledge and its applications.
Food instance segmentation is essential to estimate the serving size of dishes in a food image. The recent cutting-edge techniques for instance segmentation are deep learning networks with impressive segmentation quality and fast computation. Nonetheless, they are hungry for data and expensive for annotation. This paper proposes an incremental learning framework to optimize the model performance given a limited data labelling budget. The power of the framework is a novel difficulty assessment model, which forecasts how challenging an unlabelled sample is to the latest trained instance segmentation model. The data collection procedure is divided into several stages, each in which a new sample package is collected. The framework allocates the labelling budget to the most difficult samples. The unlabelled samples that meet a certain qualification from the assessment model are used to generate pseudo-labels. Eventually, the manual labels and pseudo-labels are sent to the training data to improve the instance segmentation model. On four large-scale food datasets, our proposed framework outperforms current incremental learning benchmarks and achieves competitive performance with the model trained on fully annotated samples.
Causal inference in spatial settings is met with unique challenges and opportunities. In spatial settings, a unit's outcome might be affected by the exposure at many locations and the confounders might be spatially structured. Using causal diagrams, we investigate the complications that arise when investigating causal relationships from spatial data. We illustrate that spatial confounding and interference can manifest as each other, meaning that investigating the presence of one can lead to wrongful conclusions in the presence of the other. We also show that statistical dependencies in the exposure can render standard analyses invalid, which can have crucial implications for understanding the effect of interventions on dependent units. Based on the conclusions from this investigation, we propose a parametric approach that simultaneously accounts for interference and mitigates bias from local and neighborhood unmeasured spatial confounding. We show that incorporating an exposure model is necessary from a Bayesian perspective. Therefore, the proposed approach is based on modeling the exposure and the outcome simultaneously while accounting for the presence of common spatially-structured unmeasured predictors. We illustrate our approach with a simulation study and with an analysis of the local and interference effects of sulfur dioxide emissions from power plants on cardiovascular mortality.
The rapid changes in the finance industry due to the increasing amount of data have revolutionized the techniques on data processing and data analysis and brought new theoretical and computational challenges. In contrast to classical stochastic control theory and other analytical approaches for solving financial decision-making problems that heavily reply on model assumptions, new developments from reinforcement learning (RL) are able to make full use of the large amount of financial data with fewer model assumptions and to improve decisions in complex financial environments. This survey paper aims to review the recent developments and use of RL approaches in finance. We give an introduction to Markov decision processes, which is the setting for many of the commonly used RL approaches. Various algorithms are then introduced with a focus on value and policy based methods that do not require any model assumptions. Connections are made with neural networks to extend the framework to encompass deep RL algorithms. Our survey concludes by discussing the application of these RL algorithms in a variety of decision-making problems in finance, including optimal execution, portfolio optimization, option pricing and hedging, market making, smart order routing, and robo-advising.
Traffic forecasting is an important factor for the success of intelligent transportation systems. Deep learning models including convolution neural networks and recurrent neural networks have been applied in traffic forecasting problems to model the spatial and temporal dependencies. In recent years, to model the graph structures in the transportation systems as well as the contextual information, graph neural networks (GNNs) are introduced as new tools and have achieved the state-of-the-art performance in a series of traffic forecasting problems. In this survey, we review the rapidly growing body of recent research using different GNNs, e.g., graph convolutional and graph attention networks, in various traffic forecasting problems, e.g., road traffic flow and speed forecasting, passenger flow forecasting in urban rail transit systems, demand forecasting in ride-hailing platforms, etc. We also present a collection of open data and source resources for each problem, as well as future research directions. To the best of our knowledge, this paper is the first comprehensive survey that explores the application of graph neural networks for traffic forecasting problems. We have also created a public Github repository to update the latest papers, open data and source resources.
Many real-world applications require the prediction of long sequence time-series, such as electricity consumption planning. Long sequence time-series forecasting (LSTF) demands a high prediction capacity of the model, which is the ability to capture precise long-range dependency coupling between output and input efficiently. Recent studies have shown the potential of Transformer to increase the prediction capacity. However, there are several severe issues with Transformer that prevent it from being directly applicable to LSTF, such as quadratic time complexity, high memory usage, and inherent limitation of the encoder-decoder architecture. To address these issues, we design an efficient transformer-based model for LSTF, named Informer, with three distinctive characteristics: (i) a $ProbSparse$ Self-attention mechanism, which achieves $O(L \log L)$ in time complexity and memory usage, and has comparable performance on sequences' dependency alignment. (ii) the self-attention distilling highlights dominating attention by halving cascading layer input, and efficiently handles extreme long input sequences. (iii) the generative style decoder, while conceptually simple, predicts the long time-series sequences at one forward operation rather than a step-by-step way, which drastically improves the inference speed of long-sequence predictions. Extensive experiments on four large-scale datasets demonstrate that Informer significantly outperforms existing methods and provides a new solution to the LSTF problem.
Modeling multivariate time series has long been a subject that has attracted researchers from a diverse range of fields including economics, finance, and traffic. A basic assumption behind multivariate time series forecasting is that its variables depend on one another but, upon looking closely, it is fair to say that existing methods fail to fully exploit latent spatial dependencies between pairs of variables. In recent years, meanwhile, graph neural networks (GNNs) have shown high capability in handling relational dependencies. GNNs require well-defined graph structures for information propagation which means they cannot be applied directly for multivariate time series where the dependencies are not known in advance. In this paper, we propose a general graph neural network framework designed specifically for multivariate time series data. Our approach automatically extracts the uni-directed relations among variables through a graph learning module, into which external knowledge like variable attributes can be easily integrated. A novel mix-hop propagation layer and a dilated inception layer are further proposed to capture the spatial and temporal dependencies within the time series. The graph learning, graph convolution, and temporal convolution modules are jointly learned in an end-to-end framework. Experimental results show that our proposed model outperforms the state-of-the-art baseline methods on 3 of 4 benchmark datasets and achieves on-par performance with other approaches on two traffic datasets which provide extra structural information.
Knowledge graph (KG) embedding encodes the entities and relations from a KG into low-dimensional vector spaces to support various applications such as KG completion, question answering, and recommender systems. In real world, knowledge graphs (KGs) are dynamic and evolve over time with addition or deletion of triples. However, most existing models focus on embedding static KGs while neglecting dynamics. To adapt to the changes in a KG, these models need to be re-trained on the whole KG with a high time cost. In this paper, to tackle the aforementioned problem, we propose a new context-aware Dynamic Knowledge Graph Embedding (DKGE) method which supports the embedding learning in an online fashion. DKGE introduces two different representations (i.e., knowledge embedding and contextual element embedding) for each entity and each relation, in the joint modeling of entities and relations as well as their contexts, by employing two attentive graph convolutional networks, a gate strategy, and translation operations. This effectively helps limit the impacts of a KG update in certain regions, not in the entire graph, so that DKGE can rapidly acquire the updated KG embedding by a proposed online learning algorithm. Furthermore, DKGE can also learn KG embedding from scratch. Experiments on the tasks of link prediction and question answering in a dynamic environment demonstrate the effectiveness and efficiency of DKGE.
In recent years, DBpedia, Freebase, OpenCyc, Wikidata, and YAGO have been published as noteworthy large, cross-domain, and freely available knowledge graphs. Although extensively in use, these knowledge graphs are hard to compare against each other in a given setting. Thus, it is a challenge for researchers and developers to pick the best knowledge graph for their individual needs. In our recent survey, we devised and applied data quality criteria to the above-mentioned knowledge graphs. Furthermore, we proposed a framework for finding the most suitable knowledge graph for a given setting. With this paper we intend to ease the access to our in-depth survey by presenting simplified rules that map individual data quality requirements to specific knowledge graphs. However, this paper does not intend to replace our previously introduced decision-support framework. For an informed decision on which KG is best for you we still refer to our in-depth survey.
Multivariate time series forecasting is extensively studied throughout the years with ubiquitous applications in areas such as finance, traffic, environment, etc. Still, concerns have been raised on traditional methods for incapable of modeling complex patterns or dependencies lying in real word data. To address such concerns, various deep learning models, mainly Recurrent Neural Network (RNN) based methods, are proposed. Nevertheless, capturing extremely long-term patterns while effectively incorporating information from other variables remains a challenge for time-series forecasting. Furthermore, lack-of-explainability remains one serious drawback for deep neural network models. Inspired by Memory Network proposed for solving the question-answering task, we propose a deep learning based model named Memory Time-series network (MTNet) for time series forecasting. MTNet consists of a large memory component, three separate encoders, and an autoregressive component to train jointly. Additionally, the attention mechanism designed enable MTNet to be highly interpretable. We can easily tell which part of the historic data is referenced the most.