亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Since the invention of word2vec, the skip-gram model has significantly advanced the research of network embedding, such as the recent emergence of the DeepWalk, LINE, PTE, and node2vec approaches. In this work, we show that all of the aforementioned models with negative sampling can be unified into the matrix factorization framework with closed forms. Our analysis and proofs reveal that: (1) DeepWalk empirically produces a low-rank transformation of a network's normalized Laplacian matrix; (2) LINE, in theory, is a special case of DeepWalk when the size of vertices' context is set to one; (3) As an extension of LINE, PTE can be viewed as the joint factorization of multiple networks' Laplacians; (4) node2vec is factorizing a matrix related to the stationary distribution and transition probability tensor of a 2nd-order random walk. We further provide the theoretical connections between skip-gram based network embedding algorithms and the theory of graph Laplacian. Finally, we present the NetMF method as well as its approximation algorithm for computing network embedding. Our method offers significant improvements over DeepWalk and LINE for conventional network mining tasks. This work lays the theoretical foundation for skip-gram based network embedding methods, leading to a better understanding of latent network representation learning.

相關內容

DeepWalk是最早提出的基于 Word2vec 的節點向量化模型。其主要思路,就是利用構造節點在網絡上的隨機游走路徑,來模仿文本生成的過程,提供一個節點序列,然后用Skip-gram和Hierarchical Softmax模型對隨機游走序列中每個局部窗口內的節點對進行概率建模,最大化隨機游走序列的似然概率,并使用最終隨機梯度下降學習參數。

Network representation learning in low dimensional vector space has attracted considerable attention in both academic and industrial domains. Most real-world networks are dynamic with addition/deletion of nodes and edges. The existing graph embedding methods are designed for static networks and they cannot capture evolving patterns in a large dynamic network. In this paper, we propose a dynamic embedding method, dynnode2vec, based on the well-known graph embedding method node2vec. Node2vec is a random walk based embedding method for static networks. Applying static network embedding in dynamic settings has two crucial problems: 1) Generating random walks for every time step is time consuming 2) Embedding vector spaces in each timestamp are different. In order to tackle these challenges, dynnode2vec uses evolving random walks and initializes the current graph embedding with previous embedding vectors. We demonstrate the advantages of the proposed dynamic network embedding by conducting empirical evaluations on several large dynamic network datasets.

Network embedding has attracted considerable research attention recently. However, the existing methods are incapable of handling billion-scale networks, because they are computationally expensive and, at the same time, difficult to be accelerated by distributed computing schemes. To address these problems, we propose RandNE, a novel and simple billion-scale network embedding method. Specifically, we propose a Gaussian random projection approach to map the network into a low-dimensional embedding space while preserving the high-order proximities between nodes. To reduce the time complexity, we design an iterative projection procedure to avoid the explicit calculation of the high-order proximities. Theoretical analysis shows that our method is extremely efficient, and friendly to distributed computing schemes without any communication cost in the calculation. We demonstrate the efficacy of RandNE over state-of-the-art methods in network reconstruction and link prediction tasks on multiple datasets with different scales, ranging from thousands to billions of nodes and edges.

Adding attributes for nodes to network embedding helps to improve the ability of the learned joint representation to depict features from topology and attributes simultaneously. Recent research on the joint embedding has exhibited a promising performance on a variety of tasks by jointly embedding the two spaces. However, due to the indispensable requirement of globality based information, present approaches contain a flaw of in-scalability. Here we propose \emph{SANE}, a scalable attribute-aware network embedding algorithm with locality, to learn the joint representation from topology and attributes. By enforcing the alignment of a local linear relationship between each node and its K-nearest neighbors in topology and attribute space, the joint embedding representations are more informative comparing with a single representation from topology or attributes alone. And we argue that the locality in \emph{SANE} is the key to learning the joint representation at scale. By using several real-world networks from diverse domains, We demonstrate the efficacy of \emph{SANE} in performance and scalability aspect. Overall, for performance on label classification, SANE successfully reaches up to the highest F1-score on most datasets, and even closer to the baseline method that needs label information as extra inputs, compared with other state-of-the-art joint representation algorithms. What's more, \emph{SANE} has an up to 71.4\% performance gain compared with the single topology-based algorithm. For scalability, we have demonstrated the linearly time complexity of \emph{SANE}. In addition, we intuitively observe that when the network size scales to 100,000 nodes, the "learning joint embedding" step of \emph{SANE} only takes $\approx10$ seconds.

Network embedding methodologies, which learn a distributed vector representation for each vertex in a network, have attracted considerable interest in recent years. Existing works have demonstrated that vertex representation learned through an embedding method provides superior performance in many real-world applications, such as node classification, link prediction, and community detection. However, most of the existing methods for network embedding only utilize topological information of a vertex, ignoring a rich set of nodal attributes (such as, user profiles of an online social network, or textual contents of a citation network), which is abundant in all real-life networks. A joint network embedding that takes into account both attributional and relational information entails a complete network information and could further enrich the learned vector representations. In this work, we present Neural-Brane, a novel Neural Bayesian Personalized Ranking based Attributed Network Embedding. For a given network, Neural-Brane extracts latent feature representation of its vertices using a designed neural network model that unifies network topological information and nodal attributes; Besides, it utilizes Bayesian personalized ranking objective, which exploits the proximity ordering between a similar node-pair and a dissimilar node-pair. We evaluate the quality of vertex embedding produced by Neural-Brane by solving the node classification and clustering tasks on four real-world datasets. Experimental results demonstrate the superiority of our proposed method over the state-of-the-art existing methods.

Adding attributes for nodes to network embedding helps to improve the ability of the learned joint representation to depict features from topology and attributes simultaneously. Recent research on the joint embedding has exhibited a promising performance on a variety of tasks by jointly embedding the two spaces. However, due to the indispensable requirement of globality based information, present approaches contain a flaw of in-scalability. Here we propose \emph{SANE}, a scalable attribute-aware network embedding algorithm with locality, to learn the joint representation from topology and attributes. By enforcing the alignment of a local linear relationship between each node and its K-nearest neighbors in topology and attribute space, the joint embedding representations are more informative comparing with a single representation from topology or attributes alone. And we argue that the locality in \emph{SANE} is the key to learning the joint representation at scale. By using several real-world networks from diverse domains, We demonstrate the efficacy of \emph{SANE} in performance and scalability aspect. Overall, for performance on label classification, SANE successfully reaches up to the highest F1-score on most datasets, and even closer to the baseline method that needs label information as extra inputs, compared with other state-of-the-art joint representation algorithms. What's more, \emph{SANE} has an up to 71.4\% performance gain compared with the single topology-based algorithm. For scalability, we have demonstrated the linearly time complexity of \emph{SANE}. In addition, we intuitively observe that when the network size scales to 100,000 nodes, the "learning joint embedding" step of \emph{SANE} only takes $\approx10$ seconds.

Recent advances in the field of network embedding have shown the low-dimensional network representation is playing a critical role in network analysis. However, most of the existing principles of network embedding do not incorporate auxiliary information such as content and labels of nodes flexibly. In this paper, we take a matrix factorization perspective of network embedding, and incorporate structure, content and label information of the network simultaneously. For structure, we validate that the matrix we construct preserves high-order proximities of the network. Label information can be further integrated into the matrix via the process of random walk sampling to enhance the quality of embedding in an unsupervised manner, i.e., without leveraging downstream classifiers. In addition, we generalize the Skip-Gram Negative Sampling model to integrate the content of the network in a matrix factorization framework. As a consequence, network embedding can be learned in a unified framework integrating network structure and node content as well as label information simultaneously. We demonstrate the efficacy of the proposed model with the tasks of semi-supervised node classification and link prediction on a variety of real-world benchmark network datasets.

Social network analysis provides meaningful information about behavior of network members that can be used for diverse applications such as classification, link prediction. However, network analysis is computationally expensive because of feature learning for different applications. In recent years, many researches have focused on feature learning methods in social networks. Network embedding represents the network in a lower dimensional representation space with the same properties which presents a compressed representation of the network. In this paper, we introduce a novel algorithm named "CARE" for network embedding that can be used for different types of networks including weighted, directed and complex. Current methods try to preserve local neighborhood information of nodes, whereas the proposed method utilizes local neighborhood and community information of network nodes to cover both local and global structure of social networks. CARE builds customized paths, which are consisted of local and global structure of network nodes, as a basis for network embedding and uses the Skip-gram model to learn representation vector of nodes. Subsequently, stochastic gradient descent is applied to optimize our objective function and learn the final representation of nodes. Our method can be scalable when new nodes are appended to network without information loss. Parallelize generation of customized random walks is also used for speeding up CARE. We evaluate the performance of CARE on multi label classification and link prediction tasks. Experimental results on various networks indicate that the proposed method outperforms others in both Micro and Macro-f1 measures for different size of training data.

Review-based recommender systems have gained noticeable ground in recent years. In addition to the rating scores, those systems are enriched with textual evaluations of items by the users. Neural language processing models, on the other hand, have already found application in recommender systems, mainly as a means of encoding user preference data, with the actual textual description of items serving only as side information. In this paper, a novel approach to incorporating the aforementioned models into the recommendation process is presented. Initially, a neural language processing model and more specifically the paragraph vector model is used to encode textual user reviews of variable length into feature vectors of fixed length. Subsequently this information is fused along with the rating scores in a probabilistic matrix factorization algorithm, based on maximum a-posteriori estimation. The resulting system, ParVecMF, is compared to a ratings' matrix factorization approach on a reference dataset. The obtained preliminary results on a set of two metrics are encouraging and may stimulate further research in this area.

We introduce negative binomial matrix factorization (NBMF), a matrix factorization technique specially designed for analyzing over-dispersed count data. It can be viewed as an extension of Poisson matrix factorization (PF) perturbed by a multiplicative term which models exposure. This term brings a degree of freedom for controlling the dispersion, making NBMF more robust to outliers. We show that NBMF allows to skip traditional pre-processing stages, such as binarization, which lead to loss of information. Two estimation approaches are presented: maximum likelihood and variational Bayes inference. We test our model with a recommendation task and show its ability to predict user tastes with better precision than PF.

We present DeepWalk, a novel approach for learning latent representations of vertices in a network. These latent representations encode social relations in a continuous vector space, which is easily exploited by statistical models. DeepWalk generalizes recent advancements in language modeling and unsupervised feature learning (or deep learning) from sequences of words to graphs. DeepWalk uses local information obtained from truncated random walks to learn latent representations by treating walks as the equivalent of sentences. We demonstrate DeepWalk's latent representations on several multi-label network classification tasks for social networks such as BlogCatalog, Flickr, and YouTube. Our results show that DeepWalk outperforms challenging baselines which are allowed a global view of the network, especially in the presence of missing information. DeepWalk's representations can provide $F_1$ scores up to 10% higher than competing methods when labeled data is sparse. In some experiments, DeepWalk's representations are able to outperform all baseline methods while using 60% less training data. DeepWalk is also scalable. It is an online learning algorithm which builds useful incremental results, and is trivially parallelizable. These qualities make it suitable for a broad class of real world applications such as network classification, and anomaly detection.

北京阿比特科技有限公司