In this article, we study algorithms for nonnegative matrix factorization (NMF) in various applications involving streaming data. Utilizing the continual nature of the data, we develop a fast two-stage algorithm for highly efficient and accurate NMF. In the first stage, an alternating non-negative least squares (ANLS) framework is used, in combination with active set method with warm-start strategy for the solution of subproblems. In the second stage, an interior point method is adopted to accelerate the local convergence. The convergence of the proposed algorithm is proved. The new algorithm is compared with some existing algorithms in benchmark tests using both real-world data and synthetic data. The results demonstrate the advantage of our algorithm in finding high-precision solutions.
In this paper we present improved bounds for approximating maximum matchings in bipartite graphs in the streaming model. First, we consider the question of how well maximum matching can be approximated in a single pass over the input using $\tilde O(n)$ space, where $n$ is the number of vertices in the input graph. Two natural variants of this problem have been considered in the literature: (1) the edge arrival setting, where edges arrive in the stream and (2) the vertex arrival setting, where vertices on one side of the graph arrive in the stream together with all their incident edges. The latter setting has also been studied extensively in the context of online algorithms, where each arriving vertex has to either be matched irrevocably or discarded upon arrival. In the online setting, the celebrated algorithm of Karp-Vazirani-Vazirani achieves a $1-1/e$ approximation. Despite the fact that the streaming model is less restrictive in that the algorithm is not constrained to match vertices irrevocably upon arrival, the best known approximation in the streaming model with vertex arrivals and $\tilde O(n)$ space is the same factor of $1-1/e$. We show that no single pass streaming algorithm that uses $\tilde O(n)$ space can achieve a better than $1-1/e$ approximation to maximum matching, even in the vertex arrival setting. This leads to the striking conclusion that no single pass streaming algorithm can do better than online algorithms unless it uses significantly more than $\tilde O(n)$ space. Additionally, our bound yields the best known impossibility result for approximating matchings in the edge arrival model. We also give a simple algorithm that achieves approximation ratio $1-e^{-k}k^{k-1}/(k-1)!=1-\frac1{\sqrt{2\pi k}}+o(1/k)$ in $k$ passes in the vertex arrival model using linear space, improving upon previously best known convergence.
We propose a contemporaneous bilinear transformation for matrix time series to alleviate the difficulties in modelling and forecasting large number of time series together. More precisely the transformed matrix splits into several small matrices, and those small matrix series are uncorrelated across all times. Hence an effective dimension-reduction is achieved by modelling each of those small matrix series separately without the loss of information on the overall linear dynamics. We adopt the bilinear transformation such that the rows and the columns of the matrix do not mix together, as they typically represent radically different features. As the targeted transformation is not unique, we identify an ideal version through a new normalization, which facilitates the no-cancellation accumulation of the information from different time lags. The non-asymptotic error bounds of the estimated transformation are derived, leading to the uniform convergence rates of the estimation. The proposed method is illustrated numerically via both simulated and real data examples.
In this article, we introduce a two-way factor model for a high-dimensional data matrix and study the properties of the maximum likelihood estimation (MLE). The proposed model assumes separable effects of row and column attributes and captures the correlation across rows and columns with low-dimensional hidden factors. The model inherits the dimension-reduction feature of classical factor models but introduces a new framework with separable row and column factors, representing the covariance or correlation structure in the data matrix. We propose a block alternating, maximizing strategy to compute the MLE of factor loadings as well as other model parameters. We discuss model identifiability, obtain consistency and the asymptotic distribution for the MLE as the numbers of rows and columns in the data matrix increase. One interesting phenomenon that we learned from our analysis is that the variance of the estimates in the two-way factor model depends on the distance of variances of row factors and column factors in a way that is not expected in classical factor analysis. We further demonstrate the performance of the proposed method through simulation and real data analysis.
Matrices with low numerical rank are omnipresent in many signal processing and data analysis applications. The pivoted QLP (p-QLP) algorithm constructs a highly accurate approximation to an input low-rank matrix. However, it is computationally prohibitive for large matrices. In this paper, we introduce a new algorithm termed Projection-based Partial QLP (PbP-QLP) that efficiently approximates the p-QLP with high accuracy. Fundamental in our work is the exploitation of randomization and in contrast to the p-QLP, PbP-QLP does not use the pivoting strategy. As such, PbP-QLP can harness modern computer architectures, even better than competing randomized algorithms. The efficiency and effectiveness of our proposed PbP-QLP algorithm are investigated through various classes of synthetic and real-world data matrices.
Industrial Internet of Things (IIoT) revolutionizes the future manufacturing facilities by integrating the Internet of Things technologies into industrial settings. With the deployment of massive IIoT devices, it is difficult for the wireless network to support the ubiquitous connections with diverse quality-of-service (QoS) requirements. Although machine learning is regarded as a powerful data-driven tool to optimize wireless network, how to apply machine learning to deal with the massive IIoT problems with unique characteristics remains unsolved. In this paper, we first summarize the QoS requirements of the typical massive non-critical and critical IIoT use cases. We then identify unique characteristics in the massive IIoT scenario, and the corresponding machine learning solutions with its limitations and potential research directions. We further present the existing machine learning solutions for individual layer and cross-layer problems in massive IIoT. Last but not the least, we present a case study of massive access problem based on deep neural network and deep reinforcement learning techniques, respectively, to validate the effectiveness of machine learning in massive IIoT scenario.
Click-through rate (CTR) prediction plays a critical role in recommender systems and online advertising. The data used in these applications are multi-field categorical data, where each feature belongs to one field. Field information is proved to be important and there are several works considering fields in their models. In this paper, we proposed a novel approach to model the field information effectively and efficiently. The proposed approach is a direct improvement of FwFM, and is named as Field-matrixed Factorization Machines (FmFM, or $FM^2$). We also proposed a new explanation of FM and FwFM within the FmFM framework, and compared it with the FFM. Besides pruning the cross terms, our model supports field-specific variable dimensions of embedding vectors, which acts as soft pruning. We also proposed an efficient way to minimize the dimension while keeping the model performance. The FmFM model can also be optimized further by caching the intermediate vectors, and it only takes thousands of floating-point operations (FLOPs) to make a prediction. Our experiment results show that it can out-perform the FFM, which is more complex. The FmFM model's performance is also comparable to DNN models which require much more FLOPs in runtime.
Although Recommender Systems have been comprehensively studied in the past decade both in industry and academia, most of current recommender systems suffer from the fol- lowing issues: 1) The data sparsity of the user-item matrix seriously affect the recommender system quality. As a result, most of traditional recommender system approaches are not able to deal with the users who have rated few items, which is known as cold start problem in recommender system. 2) Traditional recommender systems assume that users are in- dependently and identically distributed and ignore the social relation between users. However, in real life scenario, due to the exponential growth of social networking service, such as facebook and Twitter, social connections between different users play an significant role for recommender system task. In this work, aiming at providing a better recommender sys- tems by incorporating user social network information, we propose a matrix factorization framework with user social connection constraints. Experimental results on the real-life dataset shows that the proposed method performs signifi- cantly better than the state-of-the-art approaches in terms of MAE and RMSE, especially for the cold start users.
Review-based recommender systems have gained noticeable ground in recent years. In addition to the rating scores, those systems are enriched with textual evaluations of items by the users. Neural language processing models, on the other hand, have already found application in recommender systems, mainly as a means of encoding user preference data, with the actual textual description of items serving only as side information. In this paper, a novel approach to incorporating the aforementioned models into the recommendation process is presented. Initially, a neural language processing model and more specifically the paragraph vector model is used to encode textual user reviews of variable length into feature vectors of fixed length. Subsequently this information is fused along with the rating scores in a probabilistic matrix factorization algorithm, based on maximum a-posteriori estimation. The resulting system, ParVecMF, is compared to a ratings' matrix factorization approach on a reference dataset. The obtained preliminary results on a set of two metrics are encouraging and may stimulate further research in this area.
We introduce negative binomial matrix factorization (NBMF), a matrix factorization technique specially designed for analyzing over-dispersed count data. It can be viewed as an extension of Poisson matrix factorization (PF) perturbed by a multiplicative term which models exposure. This term brings a degree of freedom for controlling the dispersion, making NBMF more robust to outliers. We show that NBMF allows to skip traditional pre-processing stages, such as binarization, which lead to loss of information. Two estimation approaches are presented: maximum likelihood and variational Bayes inference. We test our model with a recommendation task and show its ability to predict user tastes with better precision than PF.
Since the invention of word2vec, the skip-gram model has significantly advanced the research of network embedding, such as the recent emergence of the DeepWalk, LINE, PTE, and node2vec approaches. In this work, we show that all of the aforementioned models with negative sampling can be unified into the matrix factorization framework with closed forms. Our analysis and proofs reveal that: (1) DeepWalk empirically produces a low-rank transformation of a network's normalized Laplacian matrix; (2) LINE, in theory, is a special case of DeepWalk when the size of vertices' context is set to one; (3) As an extension of LINE, PTE can be viewed as the joint factorization of multiple networks' Laplacians; (4) node2vec is factorizing a matrix related to the stationary distribution and transition probability tensor of a 2nd-order random walk. We further provide the theoretical connections between skip-gram based network embedding algorithms and the theory of graph Laplacian. Finally, we present the NetMF method as well as its approximation algorithm for computing network embedding. Our method offers significant improvements over DeepWalk and LINE for conventional network mining tasks. This work lays the theoretical foundation for skip-gram based network embedding methods, leading to a better understanding of latent network representation learning.