We address a three-tier numerical framework based on manifold learning for the forecasting of high-dimensional time series. At the first step, we embed the time series into a reduced low-dimensional space using a nonlinear manifold learning algorithm such as Locally Linear Embedding and Diffusion Maps. At the second step, we construct reduced-order regression models on the manifold, in particular Multivariate Autoregressive (MVAR) and Gaussian Process Regression (GPR) models, to forecast the embedded dynamics. At the final step, we lift the embedded time series back to the original high-dimensional space using Radial Basis Functions interpolation and Geometric Harmonics. For our illustrations, we test the forecasting performance of the proposed numerical scheme with four sets of time series: three synthetic stochastic ones resembling EEG signals produced from linear and nonlinear stochastic models with different model orders, and one real-world data set containing daily time series of 10 key foreign exchange rates (FOREX) spanning the time period 19/09/2001-29/10/2020. The forecasting performance of the proposed numerical scheme is assessed using the combinations of manifold learning, modelling and lifting approaches. We also provide a comparison with the Principal Component Analysis algorithm as well as with the naive random walk model and the MVAR and GPR models trained and implemented directly in the high-dimensional space.
A novel framework of intelligent reflecting surface (IRS)-aided multiple-input single-output (MISO) non-orthogonal multiple access (NOMA) network is proposed, where a base station (BS) serves multiple clusters with unfixed number of users in each cluster. The goal is to maximize the sum rate of all users by jointly optimizing the passive beamforming vector at the IRS, decoding order, power allocation coefficient vector and number of clusters, subject to the rate requirements of users. In order to tackle the formulated problem, a three-step approach is proposed. More particularly, a long short-term memory (LSTM) based algorithm is first adopted for predicting the mobility of users. Secondly, a K-means based Gaussian mixture model (K-GMM) algorithm is proposed for user clustering. Thirdly, a deep Q-network (DQN) based algorithm is invoked for jointly determining the phase shift matrix and power allocation policy. Simulation results are provided for demonstrating that the proposed algorithm outperforms the benchmarks, while the throughput gain of 35% can be achieved by invoking NOMA technique instead of orthogonal multiple access (OMA).
This paper addresses a problem called "conditional manifold learning", which aims to learn a low-dimensional manifold embedding of high-dimensional data, conditioning on auxiliary manifold information. This auxiliary manifold information is from controllable or measurable conditions, which are ubiquitous in many science and engineering applications. A broad class of solutions for this problem, conditional multidimensional scaling (including a conditional ISOMAP variant), is proposed. A conditional version of the SMACOF algorithm is introduced to optimize the objective function of conditional multidimensional scaling.
With the fast development of modern deep learning techniques, the study of dynamic systems and neural networks is increasingly benefiting each other in a lot of different ways. Since uncertainties often arise in real world observations, SDEs (stochastic differential equations) come to play an important role. To be more specific, in this paper, we use a collection of SDEs equipped with neural networks to predict long-term trend of noisy time series which has big jump properties and high probability distribution shift. Our contributions are, first, we use the phase space reconstruction method to extract intrinsic dimension of the time series data so as to determine the input structure for our forecasting model. Second, we explore SDEs driven by $\alpha$-stable L\'evy motion to model the time series data and solve the problem through neural network approximation. Third, we construct the attention mechanism to achieve multi-time step prediction. Finally, we illustrate our method by applying it to stock marketing time series prediction and show the results outperform several baseline deep learning models.
Time series forecasting is widely used in business intelligence, e.g., forecast stock market price, sales, and help the analysis of data trend. Most time series of interest are macroscopic time series that are aggregated from microscopic data. However, instead of directly modeling the macroscopic time series, rare literature studied the forecasting of macroscopic time series by leveraging data on the microscopic level. In this paper, we assume that the microscopic time series follow some unknown mixture probabilistic distributions. We theoretically show that as we identify the ground truth latent mixture components, the estimation of time series from each component could be improved because of lower variance, thus benefitting the estimation of macroscopic time series as well. Inspired by the power of Seq2seq and its variants on the modeling of time series data, we propose Mixture of Seq2seq (MixSeq), an end2end mixture model to cluster microscopic time series, where all the components come from a family of Seq2seq models parameterized by different parameters. Extensive experiments on both synthetic and real-world data show the superiority of our approach.
Spatio-temporal forecasting has numerous applications in analyzing wireless, traffic, and financial networks. Many classical statistical models often fall short in handling the complexity and high non-linearity present in time-series data. Recent advances in deep learning allow for better modelling of spatial and temporal dependencies. While most of these models focus on obtaining accurate point forecasts, they do not characterize the prediction uncertainty. In this work, we consider the time-series data as a random realization from a nonlinear state-space model and target Bayesian inference of the hidden states for probabilistic forecasting. We use particle flow as the tool for approximating the posterior distribution of the states, as it is shown to be highly effective in complex, high-dimensional settings. Thorough experimentation on several real world time-series datasets demonstrates that our approach provides better characterization of uncertainty while maintaining comparable accuracy to the state-of-the art point forecasting methods.
Minimizing cross-entropy over the softmax scores of a linear map composed with a high-capacity encoder is arguably the most popular choice for training neural networks on supervised learning tasks. However, recent works show that one can directly optimize the encoder instead, to obtain equally (or even more) discriminative representations via a supervised variant of a contrastive objective. In this work, we address the question whether there are fundamental differences in the sought-for representation geometry in the output space of the encoder at minimal loss. Specifically, we prove, under mild assumptions, that both losses attain their minimum once the representations of each class collapse to the vertices of a regular simplex, inscribed in a hypersphere. We provide empirical evidence that this configuration is attained in practice and that reaching a close-to-optimal state typically indicates good generalization performance. Yet, the two losses show remarkably different optimization behavior. The number of iterations required to perfectly fit to data scales superlinearly with the amount of randomly flipped labels for the supervised contrastive loss. This is in contrast to the approximately linear scaling previously reported for networks trained with cross-entropy.
Modeling multivariate time series has long been a subject that has attracted researchers from a diverse range of fields including economics, finance, and traffic. A basic assumption behind multivariate time series forecasting is that its variables depend on one another but, upon looking closely, it is fair to say that existing methods fail to fully exploit latent spatial dependencies between pairs of variables. In recent years, meanwhile, graph neural networks (GNNs) have shown high capability in handling relational dependencies. GNNs require well-defined graph structures for information propagation which means they cannot be applied directly for multivariate time series where the dependencies are not known in advance. In this paper, we propose a general graph neural network framework designed specifically for multivariate time series data. Our approach automatically extracts the uni-directed relations among variables through a graph learning module, into which external knowledge like variable attributes can be easily integrated. A novel mix-hop propagation layer and a dilated inception layer are further proposed to capture the spatial and temporal dependencies within the time series. The graph learning, graph convolution, and temporal convolution modules are jointly learned in an end-to-end framework. Experimental results show that our proposed model outperforms the state-of-the-art baseline methods on 3 of 4 benchmark datasets and achieves on-par performance with other approaches on two traffic datasets which provide extra structural information.
This paper addresses the difficulty of forecasting multiple financial time series (TS) conjointly using deep neural networks (DNN). We investigate whether DNN-based models could forecast these TS more efficiently by learning their representation directly. To this end, we make use of the dynamic factor graph (DFG) from that we enhance by proposing a novel variable-length attention-based mechanism to render it memory-augmented. Using this mechanism, we propose an unsupervised DNN architecture for multivariate TS forecasting that allows to learn and take advantage of the relationships between these TS. We test our model on two datasets covering 19 years of investment funds activities. Our experimental results show that our proposed approach outperforms significantly typical DNN-based and statistical models at forecasting their 21-day price trajectory.
Multivariate time series forecasting is extensively studied throughout the years with ubiquitous applications in areas such as finance, traffic, environment, etc. Still, concerns have been raised on traditional methods for incapable of modeling complex patterns or dependencies lying in real word data. To address such concerns, various deep learning models, mainly Recurrent Neural Network (RNN) based methods, are proposed. Nevertheless, capturing extremely long-term patterns while effectively incorporating information from other variables remains a challenge for time-series forecasting. Furthermore, lack-of-explainability remains one serious drawback for deep neural network models. Inspired by Memory Network proposed for solving the question-answering task, we propose a deep learning based model named Memory Time-series network (MTNet) for time series forecasting. MTNet consists of a large memory component, three separate encoders, and an autoregressive component to train jointly. Additionally, the attention mechanism designed enable MTNet to be highly interpretable. We can easily tell which part of the historic data is referenced the most.
Similarity/Distance measures play a key role in many machine learning, pattern recognition, and data mining algorithms, which leads to the emergence of metric learning field. Many metric learning algorithms learn a global distance function from data that satisfy the constraints of the problem. However, in many real-world datasets that the discrimination power of features varies in the different regions of input space, a global metric is often unable to capture the complexity of the task. To address this challenge, local metric learning methods are proposed that learn multiple metrics across the different regions of input space. Some advantages of these methods are high flexibility and the ability to learn a nonlinear mapping but typically achieves at the expense of higher time requirement and overfitting problem. To overcome these challenges, this research presents an online multiple metric learning framework. Each metric in the proposed framework is composed of a global and a local component learned simultaneously. Adding a global component to a local metric efficiently reduce the problem of overfitting. The proposed framework is also scalable with both sample size and the dimension of input data. To the best of our knowledge, this is the first local online similarity/distance learning framework based on PA (Passive/Aggressive). In addition, for scalability with the dimension of input data, DRP (Dual Random Projection) is extended for local online learning in the present work. It enables our methods to be run efficiently on high-dimensional datasets, while maintains their predictive performance. The proposed framework provides a straightforward local extension to any global online similarity/distance learning algorithm based on PA.