We consider the problem of partitioning a graph into a non-fixed number of non-overlapping subgraphs of maximum density. The density of a partition is the sum of the densities of the subgraphs, where the density of the subgraph is the ratio of its number of edges and its number of vertices. This problem, called Dense Graph Partition, is known to be NP-hard on general graphs and polynomial-time solvable on trees, and polynomial-time 2-approximable. In this paper we study the restriction of Dense Graph Partition to particular sparse and dense graph classes. In particular, we prove that it is NP-hard on dense bipartite graphs as well as on cubic graphs. On dense bipartite graphs, we further show that it is W[2]-hard parameterized by the number of sets in the optimum solution. On dense graphs on $n$ vertices, it is polynomial-time solvable on graphs with minimum degree $n-3$ and NP-hard on $(n-4)$-regular graphs. We prove that it is polynomial-time $4/3$-approximable on cubic graphs and admits an efficient polynomial-time approximation scheme on $(n-4)$-regular graphs.
This paper considers the problem of Byzantine dispersion and extends previous work along several parameters. The problem of Byzantine dispersion asks: given $n$ robots, up to $f$ of which are Byzantine, initially placed arbitrarily on an $n$ node anonymous graph, design a terminating algorithm to be run by the robots such that they eventually reach a configuration where each node has at most one non-Byzantine robot on it. Previous work solved this problem for rings and tolerated up to $n-1$ Byzantine robots. In this paper, we investigate the problem on more general graphs. We first develop an algorithm that tolerates up to $n-1$ Byzantine robots and works for a more general class of graphs. We then develop an algorithm that works for any graph but tolerates a lesser number of Byzantine robots. We subsequently turn our focus to the strength of the Byzantine robots. Previous work considers only ``weak" Byzantine robots that cannot fake their IDs. We develop an algorithm that solves the problem when Byzantine robots are not weak and can fake IDs. Finally, we study the situation where the number of the robots is not $n$ but some $k$. We show that in such a scenario, the number of Byzantine robots that can be tolerated is severely restricted. Specifically, we show that it is impossible to deterministically solve Byzantine dispersion when $\lceil k/n \rceil > \lceil (k-f)/n \rceil$.
A vertebrate interval graph is an interval graph in which the maximum size of a set of independent vertices equals the number of maximal cliques. For any fixed $v \ge 1$, there is a polynomial-time algorithm for deciding whether a vertebrate interval graph admits a vertex partition into two induced subgraphs with claw number at most $v$. In particular, when $v = 2$, whether a vertebrate interval graph can be partitioned into two proper interval graphs can be decided in polynomial time.
A $2$-distance $k$-coloring of a graph is a proper $k$-coloring of the vertices where vertices at distance at most 2 cannot share the same color. We prove the existence of a $2$-distance ($\Delta+2$)-coloring for graphs with maximum average degree less than $\frac{8}{3}$ (resp. $\frac{14}{5}$) and maximum degree $\Delta\geq 6$ (resp. $\Delta\geq 10$). As a corollary, every planar graph with girth at least $8$ (resp. $7$) and maximum degree $\Delta\geq 6$ (resp. $\Delta\geq 10$) admits a $2$-distance $(\Delta+2)$-coloring.
Approximate-message passing (AMP) algorithms have become an important element of high-dimensional statistical inference, mostly due to their adaptability and concentration properties, the state evolution (SE) equations. This is demonstrated by the growing number of new iterations proposed for increasingly complex problems, ranging from multi-layer inference to low-rank matrix estimation with elaborate priors. In this paper, we address the following questions: is there a structure underlying all AMP iterations that unifies them in a common framework? Can we use such a structure to give a modular proof of state evolution equations, adaptable to new AMP iterations without reproducing each time the full argument ? We propose an answer to both questions, showing that AMP instances can be generically indexed by an oriented graph. This enables to give a unified interpretation of these iterations, independent from the problem they solve, and a way of composing them arbitrarily. We then show that all AMP iterations indexed by such a graph admit rigorous SE equations, extending the reach of previous proofs, and proving a number of recent heuristic derivations of those equations. Our proof naturally includes non-separable functions and we show how existing refinements, such as spatial coupling or matrix-valued variables, can be combined with our framework.
In the \textsc{Waypoint Routing Problem} one is given an undirected capacitated and weighted graph $G$, a source-destination pair $s,t\in V(G)$ and a set $W\subseteq V(G)$, of \emph{waypoints}. The task is to find a walk which starts at the source vertex $s$, visits, in any order, all waypoints, ends at the destination vertex $t$, respects edge capacities, that is, traverses each edge at most as many times as is its capacity, and minimizes the cost computed as the sum of costs of traversed edges with multiplicities. We study the problem for graphs of bounded treewidth and present a new algorithm for the problem working in $2^{O(\mathrm{tw})}\cdot n$ time, significantly improving upon the previously known algorithms. We also show that this running time is optimal for the problem under Exponential Time Hypothesis.
Sequential recommendation aims to leverage users' historical behaviors to predict their next interaction. Existing works have not yet addressed two main challenges in sequential recommendation. First, user behaviors in their rich historical sequences are often implicit and noisy preference signals, they cannot sufficiently reflect users' actual preferences. In addition, users' dynamic preferences often change rapidly over time, and hence it is difficult to capture user patterns in their historical sequences. In this work, we propose a graph neural network model called SURGE (short for SeqUential Recommendation with Graph neural nEtworks) to address these two issues. Specifically, SURGE integrates different types of preferences in long-term user behaviors into clusters in the graph by re-constructing loose item sequences into tight item-item interest graphs based on metric learning. This helps explicitly distinguish users' core interests, by forming dense clusters in the interest graph. Then, we perform cluster-aware and query-aware graph convolutional propagation and graph pooling on the constructed graph. It dynamically fuses and extracts users' current activated core interests from noisy user behavior sequences. We conduct extensive experiments on both public and proprietary industrial datasets. Experimental results demonstrate significant performance gains of our proposed method compared to state-of-the-art methods. Further studies on sequence length confirm that our method can model long behavioral sequences effectively and efficiently.
Graph neural networks (GNNs) are typically applied to static graphs that are assumed to be known upfront. This static input structure is often informed purely by insight of the machine learning practitioner, and might not be optimal for the actual task the GNN is solving. In absence of reliable domain expertise, one might resort to inferring the latent graph structure, which is often difficult due to the vast search space of possible graphs. Here we introduce Pointer Graph Networks (PGNs) which augment sets or graphs with additional inferred edges for improved model expressivity. PGNs allow each node to dynamically point to another node, followed by message passing over these pointers. The sparsity of this adaptable graph structure makes learning tractable while still being sufficiently expressive to simulate complex algorithms. Critically, the pointing mechanism is directly supervised to model long-term sequences of operations on classical data structures, incorporating useful structural inductive biases from theoretical computer science. Qualitatively, we demonstrate that PGNs can learn parallelisable variants of pointer-based data structures, namely disjoint set unions and link/cut trees. PGNs generalise out-of-distribution to 5x larger test inputs on dynamic graph connectivity tasks, outperforming unrestricted GNNs and Deep Sets.
We propose a scalable Gromov-Wasserstein learning (S-GWL) method and establish a novel and theoretically-supported paradigm for large-scale graph analysis. The proposed method is based on the fact that Gromov-Wasserstein discrepancy is a pseudometric on graphs. Given two graphs, the optimal transport associated with their Gromov-Wasserstein discrepancy provides the correspondence between their nodes and achieves graph matching. When one of the graphs has isolated but self-connected nodes ($i.e.$, a disconnected graph), the optimal transport indicates the clustering structure of the other graph and achieves graph partitioning. Using this concept, we extend our method to multi-graph partitioning and matching by learning a Gromov-Wasserstein barycenter graph for multiple observed graphs; the barycenter graph plays the role of the disconnected graph, and since it is learned, so is the clustering. Our method combines a recursive $K$-partition mechanism with a regularized proximal gradient algorithm, whose time complexity is $\mathcal{O}(K(E+V)\log_K V)$ for graphs with $V$ nodes and $E$ edges. To our knowledge, our method is the first attempt to make Gromov-Wasserstein discrepancy applicable to large-scale graph analysis and unify graph partitioning and matching into the same framework. It outperforms state-of-the-art graph partitioning and matching methods, achieving a trade-off between accuracy and efficiency.
Many question answering systems over knowledge graphs rely on entity and relation linking components in order to connect the natural language input to the underlying knowledge graph. Traditionally, entity linking and relation linking have been performed either as dependent sequential tasks or as independent parallel tasks. In this paper, we propose a framework called EARL, which performs entity linking and relation linking as a joint task. EARL implements two different solution strategies for which we provide a comparative analysis in this paper: The first strategy is a formalisation of the joint entity and relation linking tasks as an instance of the Generalised Travelling Salesman Problem (GTSP). In order to be computationally feasible, we employ approximate GTSP solvers. The second strategy uses machine learning in order to exploit the connection density between nodes in the knowledge graph. It relies on three base features and re-ranking steps in order to predict entities and relations. We compare the strategies and evaluate them on a dataset with 5000 questions. Both strategies significantly outperform the current state-of-the-art approaches for entity and relation linking.
Knowledge Graph Embedding methods aim at representing entities and relations in a knowledge base as points or vectors in a continuous vector space. Several approaches using embeddings have shown promising results on tasks such as link prediction, entity recommendation, question answering, and triplet classification. However, only a few methods can compute low-dimensional embeddings of very large knowledge bases. In this paper, we propose KG2Vec, a novel approach to Knowledge Graph Embedding based on the skip-gram model. Instead of using a predefined scoring function, we learn it relying on Long Short-Term Memories. We evaluated the goodness of our embeddings on knowledge graph completion and show that KG2Vec is comparable to the quality of the scalable state-of-the-art approaches and can process large graphs by parsing more than a hundred million triples in less than 6 hours on common hardware.