The problem of finding paths in temporal graphs has recently received considerable attention due to its many applications. In this paper we consider a variant of the problem that, given a vertex-colored temporal graph, asks for a path whose vertices have distinct colors and include the maximum number of colors. We study the approximation complexity of the problem and provide an inapproximability lower bound. We then present a heuristic for the problem and an experimental evaluation of it on both synthetic and real-world graphs.
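The heuristic itself is not described in this abstract, so the following is only an illustrative sketch under simplifying assumptions: it greedily grows a time-respecting path, moving only to vertices whose colors have not been used yet. It is not the heuristic evaluated in the paper.

```python
from collections import defaultdict

def greedy_colorful_temporal_path(edges, colors, start):
    """Greedily grow a time-respecting path from `start`, visiting only
    vertices whose color has not been used yet.

    edges  : iterable of (u, v, t) time-stamped undirected edges
    colors : dict vertex -> color
    start  : starting vertex
    Returns the list of visited vertices (all colors distinct).
    """
    # adjacency: u -> list of (t, v), sorted by time
    adj = defaultdict(list)
    for u, v, t in edges:
        adj[u].append((t, v))
        adj[v].append((t, u))
    for u in adj:
        adj[u].sort()

    path = [start]
    used_colors = {colors[start]}
    current, current_time = start, float("-inf")

    while True:
        # candidate extensions: later-or-equal time, unseen color
        candidates = [(t, v) for t, v in adj[current]
                      if t >= current_time and colors[v] not in used_colors]
        if not candidates:
            break
        t, v = candidates[0]          # take the earliest feasible extension
        path.append(v)
        used_colors.add(colors[v])
        current, current_time = v, t
    return path

# toy usage on a tiny temporal graph with colors 'r', 'g', 'b'
edges = [(1, 2, 1), (2, 3, 2), (2, 4, 3)]
colors = {1: "r", 2: "g", 3: "b", 4: "r"}
print(greedy_colorful_temporal_path(edges, colors, start=1))   # [1, 2, 3]
```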
Logical reasoning over Knowledge Graphs (KGs) is a fundamental technique that can provide an efficient querying mechanism over large and incomplete databases. Current approaches employ spatial geometries such as boxes to learn query representations that encompass the answer entities and model the logical operations of projection and intersection. However, their geometry is restrictive and leads to non-smooth, strict boundaries, which in turn yield ambiguous answer entities. Furthermore, previous works handle unions via transformation tricks that break closure, so union operations cannot be chained. In this paper, we propose a Probabilistic Entity Representation Model (PERM) that encodes each entity as a multivariate Gaussian density, whose mean and covariance parameters capture its semantic position and smooth decision boundary, respectively. Additionally, we define the closed logical operations of projection, intersection, and union, which can be aggregated using an end-to-end objective function. On the logical query reasoning problem, we demonstrate that the proposed PERM significantly outperforms the state-of-the-art methods on various public benchmark KG datasets under standard evaluation metrics. We also evaluate PERM on a COVID-19 drug-repurposing case study and show that it recommends drugs with substantially better F1 scores than current methods. Finally, we illustrate PERM's query-answering process through a low-dimensional visualization of the Gaussian representations.
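The abstract does not spell out PERM's operators, but one standard closed-form way to intersect two Gaussian representations is the renormalized product of their densities, which is again a Gaussian. The sketch below illustrates that computation; it is an assumption for intuition, not PERM's actual intersection operator.

```python
import numpy as np

def gaussian_intersection(mu1, cov1, mu2, cov2):
    """Closed-form 'intersection' of two Gaussian entity representations,
    taken here as the renormalized product of their densities: the product
    of N(mu1, cov1) and N(mu2, cov2) is proportional to a Gaussian whose
    precision is the sum of the two precisions."""
    prec1 = np.linalg.inv(cov1)
    prec2 = np.linalg.inv(cov2)
    cov = np.linalg.inv(prec1 + prec2)          # fused covariance
    mu = cov @ (prec1 @ mu1 + prec2 @ mu2)      # fused mean
    return mu, cov

# toy usage: intersect two 2-D query representations
mu_a, cov_a = np.zeros(2), np.eye(2)
mu_b, cov_b = np.ones(2), 0.5 * np.eye(2)
mu, cov = gaussian_intersection(mu_a, cov_a, mu_b, cov_b)
```

Because the result is itself a Gaussian with a mean and covariance, such an operation can be chained with further projections, intersections, and unions, which is the closure property the abstract emphasizes.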
One of the most common approaches to the analysis of dynamic networks is time-window aggregation. The resulting representation is a sequence of static networks, i.e., the snapshot graph. Despite this representation being widely used in the literature, a general framework to evaluate the soundness of snapshot graphs is still missing. In this article, we propose two scores that quantify conflicting objectives: Stability measures how stable the sequence of snapshots is, while Fidelity measures the loss of information compared to the original data. We also develop a technique for targeted filtering of links to simplify the original temporal network. Our framework is tested on datasets of proximity and face-to-face interactions.
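The abstract does not define the two scores, so the snippet below is only an illustrative guess: it takes stability to be the average Jaccard similarity between the edge sets of consecutive snapshots. The paper's actual definitions of Stability and Fidelity may differ.

```python
def jaccard(a, b):
    """Jaccard similarity between two edge sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def stability(snapshots):
    """Average edge-set Jaccard similarity between consecutive snapshots.
    `snapshots` is a list of edge lists, one per time window."""
    sims = [jaccard(e1, e2) for e1, e2 in zip(snapshots, snapshots[1:])]
    return sum(sims) / len(sims) if sims else 1.0

# toy usage on three aggregation windows
windows = [[(1, 2), (2, 3)], [(1, 2), (3, 4)], [(1, 2), (3, 4)]]
print(stability(windows))   # 0.666...
```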
We present exact and heuristic algorithms that find, for a given family of graphs, a graph that contains each member of the family as an induced subgraph. For $0 \leq k \leq 6$, we give the minimum number of vertices $f(k)$ in a graph containing all $k$-vertex graphs as induced subgraphs, and show that $16 \leq f(7) \leq 18$. For $0 \leq k \leq 5$, we also give the counts of such graphs, as generated by brute-force computer search. We give additional results for small graphs containing all trees on $k$ vertices.
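As a hedged illustration of the brute-force verification step, the sketch below (using NetworkX, which is an assumption; the authors' code is not shown here) checks whether a candidate graph contains every graph on $k \leq 7$ vertices as an induced subgraph.

```python
import networkx as nx
from networkx.algorithms.isomorphism import GraphMatcher

def contains_all_k_vertex_graphs(G, k):
    """Return True iff G contains every graph on k vertices (k <= 7) as an
    induced subgraph. Uses the NetworkX graph atlas, which lists all graphs
    with at most 7 nodes up to isomorphism."""
    targets = [H for H in nx.graph_atlas_g() if H.number_of_nodes() == k]
    for H in targets:
        # GraphMatcher's subgraph test is node-induced, which is what we need
        if not GraphMatcher(G, H).subgraph_is_isomorphic():
            return False
    return True

# toy usage: check a small candidate (the result depends on the candidate graph,
# and the check is exponential, so keep candidates small)
G = nx.wheel_graph(6)   # hub plus a 5-cycle, 6 vertices
print(contains_all_k_vertex_graphs(G, 3))
```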
We extract a core principle underlying seemingly different fundamental distributed settings, showing that sparsity awareness may induce faster algorithms for problems in these settings. To leverage this, we establish a new framework by developing an intermediate auxiliary model weak enough to be simulated in the CONGEST model given low mixing time, as well as in the recently introduced HYBRID model. We prove that despite imposing harsh restrictions, this artificial model allows balancing massive data transfers with high bandwidth utilization. We exemplify the power of our methods by deriving shortest-paths algorithms improving upon the state of the art. Specifically, we show the following for graphs of $n$ nodes: A $(3+\epsilon)$-approximation for weighted APSP in $(n/\delta)\tau_{mix}\cdot 2^{O(\sqrt{\log n})}$ rounds in the CONGEST model, where $\delta$ is the minimum degree of the graph and $\tau_{mix}$ is its mixing time. For graphs with $\delta=\tau_{mix}\cdot 2^{\omega(\sqrt{\log n})}$, this takes $o(n)$ rounds, despite the $\Omega(n)$ lower bound for general graphs [Nanongkai, STOC'14]. An $(n^{7/6}/m^{1/2}+n^2/m)\cdot\tau_{mix}\cdot 2^{O(\sqrt{\log n})}$-round exact SSSP algorithm in the CONGEST model, for graphs with $m$ edges and a mixing time of $\tau_{mix}$. This improves upon the algorithm of [Chechik and Mukhtar, PODC'20] for significant ranges of values of $m$ and $\tau_{mix}$. A CONGESTED CLIQUE simulation in the CONGEST model improving upon the state-of-the-art simulation of [Ghaffari, Kuhn, and Su, PODC'17] by a factor proportional to the average degree in the graph. An $\tilde O(n^{5/17}/\epsilon^9)$-round algorithm for a $(1+\epsilon)$-approximation for SSSP in the HYBRID model. The only previous $o(n^{1/3})$-round algorithm for distance approximations in this model is for a much larger factor [Augustine, Hinnenthal, Kuhn, Scheideler, Schneider, SODA'20].
The caterpillar associahedron $\mathcal{A}(G)$ is a polytope arising from the rotation graph of search trees on a caterpillar tree $G$, generalizing the rotation graph of binary search trees (BSTs) and thus the conventional associahedron. We show that the diameter of $\mathcal{A}(G)$ is $\Theta(n + m \cdot (H+1))$, where $n$ is the number of vertices, $m$ is the number of leaves, and $H$ is the entropy of the leaf distribution of $G$. Our proofs reveal a strong connection between caterpillar associahedra and searching in BSTs. We prove the lower bound using Wilber's first lower bound for dynamic BSTs, and the upper bound by reducing the problem to searching in static BSTs.
A hole is an induced cycle of length at least 4, and an odd hole is a hole of odd length. A full house is the graph obtained from $K_4$ by adding a vertex adjacent to both ends of one of its edges. Let $H$ be the complement of a cycle on 7 vertices. Chudnovsky et al. [6] proved that every (odd hole, $K_4$)-free graph is 4-colorable, and is 3-colorable if it does not have $H$ as an induced subgraph. In this paper, we use the proof technique of Chudnovsky et al. to generalize this conclusion to (odd hole, full house)-free graphs, and prove that every (odd hole, full house)-free graph $G$ satisfies $\chi(G)\le \omega(G)+1$, with equality if and only if $\omega(G)=3$ and $G$ has $H$ as an induced subgraph.
Since counting subgraphs in general graphs is, by and large, a computationally demanding problem, it is natural to try to design fast algorithms for restricted families of graphs. One such family that has been extensively studied is that of graphs of bounded degeneracy (e.g., planar graphs). This line of work, which started in the early 1980s, culminated in a recent work of Gishboliner et al., which highlighted the importance of the task of counting homomorphic copies of cycles (i.e., cyclic walks) in graphs of bounded degeneracy. Our main result in this paper is a surprisingly tight relation between the above task and the well-studied problem of {\em detecting (standard) copies} of directed cycles in {\em general directed} graphs. More precisely, we prove the following: 1. One can compute the number of homomorphic copies of $C_{2k}$ and $C_{2k+1}$ in $n$-vertex graphs of bounded degeneracy in time $\tilde{O}(n^{d_{k}})$, where the fastest {\em known} algorithm for detecting directed copies of $C_k$ in general $m$-edge digraphs runs in time $\tilde{O}(m^{d_{k}})$. 2. Conversely, one can transform any $O(n^{b_{k}})$ algorithm for computing the number of homomorphic copies of $C_{2k}$ or of $C_{2k+1}$ in $n$-vertex graphs of bounded degeneracy, into an $\tilde{O}(m^{b_{k}})$ time algorithm for detecting directed copies of $C_k$ in general $m$-edge digraphs. We emphasize that our first result does not use a black-box reduction (as opposed to the second result which does). Instead, we design an algorithm for computing the number of $C_k$-homomorphisms in degenerate graphs and show that one part of its {\em analysis} can be reduced to the analysis of the fastest known algorithm for detecting directed cycles in general digraphs, which was carried out in a recent breakthrough of Dalirrooyfard, Vuong and Vassilevska Williams.
Inferring missing facts in temporal knowledge graphs (TKGs) is a fundamental and challenging task. Previous works have approached this problem by augmenting methods for static knowledge graphs to leverage time-dependent representations. However, these methods do not explicitly leverage multi-hop structural information and temporal facts from recent time steps to enhance their predictions. Additionally, prior work does not explicitly address the temporal sparsity and variability of entity distributions in TKGs. We propose the Temporal Message Passing (TeMP) framework to address these challenges by combining graph neural networks, temporal dynamics models, data imputation and frequency-based gating techniques. Experiments on standard TKG tasks show that our approach provides substantial gains compared to the previous state of the art, achieving a 10.7% average relative improvement in Hits@10 across three standard benchmarks. Our analysis also reveals important sources of variability both within and across TKG datasets, and we introduce several simple but strong baselines that outperform the prior state of the art in certain settings.
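TeMP's exact architecture is not specified in this abstract, so the following is only a structural sketch of the idea of combining per-snapshot neighborhood aggregation, a temporal dynamics model, and frequency-based gating; the layer choices, the gating form, and all names are assumptions rather than the actual TeMP design.

```python
import torch
import torch.nn as nn

class TemporalEntityEncoder(nn.Module):
    """Illustrative sketch: per-snapshot message passing, a GRU over recent
    time steps, and a gate driven by how often an entity appears."""

    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(2 * dim, dim)   # message from (neighbor, relation)
        self.gru = nn.GRU(dim, dim, batch_first=True)
        self.gate = nn.Linear(1, 1)          # frequency-based gate

    def encode_snapshot(self, ent_emb, rel_emb, edges):
        # edges: list of (head, relation, tail); mean-aggregate messages into tails
        out = torch.zeros_like(ent_emb)
        count = torch.zeros(ent_emb.size(0), 1, device=ent_emb.device)
        for h, r, t in edges:
            out[t] += self.msg(torch.cat([ent_emb[h], rel_emb[r]]))
            count[t] += 1
        return out / count.clamp(min=1)

    def forward(self, ent_emb, rel_emb, snapshots, freq):
        # snapshots: recent edge lists, oldest first; freq: (num_entities, 1)
        states = [self.encode_snapshot(ent_emb, rel_emb, e) for e in snapshots]
        seq = torch.stack(states, dim=1)     # (entities, time, dim)
        temporal, _ = self.gru(seq)
        g = torch.sigmoid(self.gate(freq))   # frequent entities rely more on history
        return g * temporal[:, -1] + (1 - g) * ent_emb
```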
Graph Neural Networks (GNNs) for representation learning of graphs broadly follow a neighborhood aggregation framework, where the representation vector of a node is computed by recursively aggregating and transforming feature vectors of its neighboring nodes. Many GNN variants have been proposed and have achieved state-of-the-art results on both node and graph classification tasks. However, despite GNNs revolutionizing graph representation learning, there is limited understanding of their representational properties and limitations. Here, we present a theoretical framework for analyzing the expressive power of GNNs in capturing different graph structures. Our results characterize the discriminative power of popular GNN variants, such as Graph Convolutional Networks and GraphSAGE, and show that they cannot learn to distinguish certain simple graph structures. We then develop a simple architecture that is provably the most expressive among the class of GNNs and is as powerful as the Weisfeiler-Lehman graph isomorphism test. We empirically validate our theoretical findings on a number of graph classification benchmarks, and demonstrate that our model achieves state-of-the-art performance.
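As an illustration of the neighborhood-aggregation framework with an injective, sum-based aggregator of the kind the paper argues is maximally expressive, here is a minimal sketch of one layer; the MLP sizes and the learnable epsilon term are illustrative choices rather than the exact architecture proposed in the paper.

```python
import torch
import torch.nn as nn

class SumAggregationLayer(nn.Module):
    """One neighborhood-aggregation layer: sum neighbor features, combine with
    the node's own feature, and transform with an MLP."""

    def __init__(self, dim):
        super().__init__()
        self.eps = nn.Parameter(torch.zeros(1))
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, h, adj):
        # h: (num_nodes, dim) node features; adj: (num_nodes, num_nodes) 0/1 matrix
        neighbor_sum = adj @ h                      # sum of neighbor features
        return self.mlp((1 + self.eps) * h + neighbor_sum)

# toy usage: a triangle graph with 4-dimensional features
adj = torch.tensor([[0., 1., 1.], [1., 0., 1.], [1., 1., 0.]])
h = torch.randn(3, 4)
out = SumAggregationLayer(4)(h, adj)                # (3, 4) updated features
# a graph-level readout could then sum the node features of each layer
```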
We introduce a new neural architecture to learn the conditional probability of an output sequence with elements that are discrete tokens corresponding to positions in an input sequence. Such problems cannot be trivially addressed by existing approaches such as sequence-to-sequence models and Neural Turing Machines, because the number of target classes at each step of the output depends on the length of the input, which is variable. Problems such as sorting variable-sized sequences, and various combinatorial optimization problems, belong to this class. Our model solves the problem of variable-size output dictionaries using a recently proposed mechanism of neural attention. It differs from previous attention attempts in that, instead of using attention to blend hidden units of an encoder into a context vector at each decoder step, it uses attention as a pointer to select a member of the input sequence as the output. We call this architecture a Pointer Net (Ptr-Net). We show Ptr-Nets can be used to learn approximate solutions to three challenging geometric problems -- finding planar convex hulls, computing Delaunay triangulations, and the planar Travelling Salesman Problem -- using training examples alone. Ptr-Nets not only improve over sequence-to-sequence models with input attention, but also allow us to generalize to variable-size output dictionaries. We show that the learnt models generalize beyond the maximum lengths they were trained on. We hope our results on these tasks will encourage a broader exploration of neural learning for discrete problems.
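A minimal sketch of the pointing mechanism described above: additive attention scores over the encoder states are normalized into a distribution over input positions and used directly as the output, rather than blended into a context vector. Dimensions and parameter names are illustrative.

```python
import torch
import torch.nn as nn

class PointerAttention(nn.Module):
    """Pointing mechanism: attention scores over the encoder states serve
    directly as the output distribution over input positions."""

    def __init__(self, dim):
        super().__init__()
        self.W_enc = nn.Linear(dim, dim, bias=False)
        self.W_dec = nn.Linear(dim, dim, bias=False)
        self.v = nn.Linear(dim, 1, bias=False)

    def forward(self, enc_states, dec_state):
        # enc_states: (seq_len, dim); dec_state: (dim,)
        scores = self.v(torch.tanh(self.W_enc(enc_states) + self.W_dec(dec_state)))
        return torch.softmax(scores.squeeze(-1), dim=0)   # distribution over positions

# toy usage: point into a 5-element input sequence
enc = torch.randn(5, 8)
dec = torch.randn(8)
probs = PointerAttention(8)(enc, dec)   # shape (5,), sums to 1
```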