Undirected graphs are frequently used to model networks. The topology of an undirected graph G can be captured by an adjacency matrix; this matrix in turn can be visualized directly to give insight into the graph structure. Which visual patterns appear in such a matrix visualization depends on the ordering of its rows and columns. Formally defining the quality of an ordering and then automatically computing a high-quality ordering are both challenging problems; however, effective heuristics exist and are used in practice. Often, graphs exist as part of a collection of graphs on the same set of vertices. To visualize such graph collections, we need a single ordering that works well for all matrices simultaneously. The current state-of-the-art solves this problem by taking a (weighted) union over all graphs and applying existing heuristics. However, this union leads to a loss of information, specifically in those parts of the graphs which are different. We propose a collection-aware approach to avoid this loss of information and apply it to two popular heuristic methods: leaf order and barycenter. The de-facto standard computational quality metrics for matrix ordering capture only block-diagonal patterns (cliques). Instead, we propose to use Moran's I, a spatial auto-correlation metric, which captures the full range of established patterns. The popular leaf order method heuristically optimizes a similar measure which supports the use of Moran's I in this context. We evaluated our methods for simultaneous orderings on real-world datasets using Moran's I as the quality metric. Our results show that our collection-aware approach matches or improves performance compared to the union approach, depending on the similarity of the graphs in the collection. Specifically, our Moran's I-based collection-aware leaf order implementation consistently outperforms other implementations.
The metric dimension dim(G) of a graph $G$ is the minimum cardinality of a subset $S$ of vertices of $G$ such that each vertex of $G$ is uniquely determined by its distances to $S$. It is well-known that the metric dimension of a graph can be drastically increased by the modification of a single edge. Our main result consists in proving that the increase of the metric dimension of an edge addition can be amortized in the sense that if the graph consists of a spanning tree $T$ plus $c$ edges, then the metric dimension of $G$ is at most the metric dimension of $T$ plus $6c$. We then use this result to prove a weakening of a conjecture of Eroh et al. The zero forcing number $Z(G)$ of $G$ is the minimum cardinality of a subset $S$ of black vertices (whereas the other vertices are colored white) of $G$ such that all the vertices will turned black after applying finitely many times the following rule: a white vertex is turned black if it is the only white neighbor of a black vertex. Eroh et al. conjectured that, for any graph $G$, $dim(G)\leq Z(G) + c(G)$, where $c(G)$ is the number of edges that have to be removed from $G$ to get a forest. They proved the conjecture is true for trees and unicyclic graphs. We prove a weaker version of the conjecture: $dim(G)\leq Z(G)+6c(G)$ holds for any graph. We also prove that the conjecture is true for graphs with edge disjoint cycles, widely generalizing the unicyclic result of Eroh et al.
Accurate prediction of short-term OD Matrix (i.e. the distribution of passenger flows from various origins to destinations) is a crucial task in metro systems. It is highly challenging due to the constantly changing nature of many impacting factors and the real-time delayed data collection problem. Recently, some deep learning-based models have been proposed for OD Matrix forecasting in ride-hailing and high way traffic scenarios. However, these models can not sufficiently capture the complex spatiotemporal correlation between stations in metro networks due to their different prior knowledge and contextual settings. In this paper we propose a hybrid framework Multi-view TRGRU to address OD metro matrix prediction. In particular, it uses three modules to model three flow change patterns: recent trend, daily trend, weekly trend. In each module, a multi-view representation based on embedding for each station is constructed and fed into a transformer based gated recurrent structure so as to capture the dynamic spatial dependency in OD flows of different stations by a global self-attention mechanism. Extensive experiments on three large-scale, real-world metro datasets demonstrate the superiority of our Multi-view TRGRU over other competitors.
This paper investigates the problem of online statistical inference of model parameters in stochastic optimization problems via the Kiefer-Wolfowitz algorithm with random search directions. We first present the asymptotic distribution for the Polyak-Ruppert-averaging type Kiefer-Wolfowitz (AKW) estimators, whose asymptotic covariance matrices depend on the function-value query complexity and the distribution of search directions. The distributional result reflects the trade-off between statistical efficiency and function query complexity. We further analyze the choices of random search directions to minimize the asymptotic covariance matrix, and conclude that the optimal search direction depends on the optimality criteria with respect to different summary statistics of the Fisher information matrix. Based on the asymptotic distribution result, we conduct online statistical inference by providing two construction procedures of valid confidence intervals. We provide numerical experiments verifying our theoretical results with the practical effectiveness of the procedures.
Multiplying matrices is among the most fundamental and compute-intensive operations in machine learning. Consequently, there has been significant work on efficiently approximating matrix multiplies. We introduce a learning-based algorithm for this task that greatly outperforms existing methods. Experiments using hundreds of matrices from diverse domains show that it often runs $100\times$ faster than exact matrix products and $10\times$ faster than current approximate methods. In the common case that one matrix is known ahead of time, our method also has the interesting property that it requires zero multiply-adds. These results suggest that a mixture of hashing, averaging, and byte shuffling$-$the core operations of our method$-$could be a more promising building block for machine learning than the sparsified, factorized, and/or scalar quantized matrix products that have recently been the focus of substantial research and hardware investment.
Hierarchical abstractions are a methodology for solving large-scale graph problems in various disciplines. Coarsening is one such approach: it generates a pyramid of graphs whereby the one in the next level is a structural summary of the prior one. With a long history in scientific computing, many coarsening strategies were developed based on mathematically driven heuristics. Recently, resurgent interests exist in deep learning to design hierarchical methods learnable through differentiable parameterization. These approaches are paired with downstream tasks for supervised learning. In practice, however, supervised signals (e.g., labels) are scarce and are often laborious to obtain. In this work, we propose an unsupervised approach, coined OTCoarsening, with the use of optimal transport. Both the coarsening matrix and the transport cost matrix are parameterized, so that an optimal coarsening strategy can be learned and tailored for a given set of graphs. We demonstrate that the proposed approach produces meaningful coarse graphs and yields competitive performance compared with supervised methods for graph classification and regression.
The area of Data Analytics on graphs promises a paradigm shift as we approach information processing of classes of data, which are typically acquired on irregular but structured domains (social networks, various ad-hoc sensor networks). Yet, despite its long history, current approaches mostly focus on the optimization of graphs themselves, rather than on directly inferring learning strategies, such as detection, estimation, statistical and probabilistic inference, clustering and separation from signals and data acquired on graphs. To fill this void, we first revisit graph topologies from a Data Analytics point of view, and establish a taxonomy of graph networks through a linear algebraic formalism of graph topology (vertices, connections, directivity). This serves as a basis for spectral analysis of graphs, whereby the eigenvalues and eigenvectors of graph Laplacian and adjacency matrices are shown to convey physical meaning related to both graph topology and higher-order graph properties, such as cuts, walks, paths, and neighborhoods. Next, to illustrate estimation strategies performed on graph signals, spectral analysis of graphs is introduced through eigenanalysis of mathematical descriptors of graphs and in a generic way. Finally, a framework for vertex clustering and graph segmentation is established based on graph spectral representation (eigenanalysis) which illustrates the power of graphs in various data association tasks. The supporting examples demonstrate the promise of Graph Data Analytics in modeling structural and functional/semantic inferences. At the same time, Part I serves as a basis for Part II and Part III which deal with theory, methods and applications of processing Data on Graphs and Graph Topology Learning from data.
Attributed graph clustering is challenging as it requires joint modelling of graph structures and node attributes. Recent progress on graph convolutional networks has proved that graph convolution is effective in combining structural and content information, and several recent methods based on it have achieved promising clustering performance on some real attributed networks. However, there is limited understanding of how graph convolution affects clustering performance and how to properly use it to optimize performance for different graphs. Existing methods essentially use graph convolution of a fixed and low order that only takes into account neighbours within a few hops of each node, which underutilizes node relations and ignores the diversity of graphs. In this paper, we propose an adaptive graph convolution method for attributed graph clustering that exploits high-order graph convolution to capture global cluster structure and adaptively selects the appropriate order for different graphs. We establish the validity of our method by theoretical analysis and extensive experiments on benchmark datasets. Empirical results show that our method compares favourably with state-of-the-art methods.
We show that for the problem of testing if a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model for which query access comes in the form of $\langle X_i, A\rangle:=tr(X_i^\top A)$; perhaps surprisingly these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.
With the rapid growth of knowledge bases (KBs), question answering over knowledge base, a.k.a. KBQA has drawn huge attention in recent years. Most of the existing KBQA methods follow so called encoder-compare framework. They map the question and the KB facts to a common embedding space, in which the similarity between the question vector and the fact vectors can be conveniently computed. This, however, inevitably loses original words interaction information. To preserve more original information, we propose an attentive recurrent neural network with similarity matrix based convolutional neural network (AR-SMCNN) model, which is able to capture comprehensive hierarchical information utilizing the advantages of both RNN and CNN. We use RNN to capture semantic-level correlation by its sequential modeling nature, and use an attention mechanism to keep track of the entities and relations simultaneously. Meanwhile, we use a similarity matrix based CNN with two-directions pooling to extract literal-level words interaction matching utilizing CNNs strength of modeling spatial correlation among data. Moreover, we have developed a new heuristic extension method for entity detection, which significantly decreases the effect of noise. Our method has outperformed the state-of-the-arts on SimpleQuestion benchmark in both accuracy and efficiency.
While advances in computing resources have made processing enormous amounts of data possible, human ability to identify patterns in such data has not scaled accordingly. Thus, efficient computational methods for condensing and simplifying data are becoming vital for extracting actionable insights. In particular, while data summarization techniques have been studied extensively, only recently has summarizing interconnected data, or graphs, become popular. This survey is a structured, comprehensive overview of the state-of-the-art methods for summarizing graph data. We first broach the motivation behind and the challenges of graph summarization. We then categorize summarization approaches by the type of graphs taken as input and further organize each category by core methodology. Finally, we discuss applications of summarization on real-world graphs and conclude by describing some open problems in the field.