Hypergraphs capture multi-way relationships in data, and they have consequently seen a number of applications in higher-order network analysis, computer vision, geometry processing, and machine learning. In this paper, we develop the theoretical foundations in studying the space of hypergraphs using ingredients from optimal transport. By enriching a hypergraph with probability measures on its nodes and hyperedges, as well as relational information capturing local and global structure, we obtain a general and robust framework for studying the collection of all hypergraphs. First, we introduce a hypergraph distance based on the co-optimal transport framework of Redko et al. and study its theoretical properties. Second, we formalize common methods for transforming a hypergraph into a graph as maps from the space of hypergraphs to the space of graphs and study their functorial properties and Lipschitz bounds. Finally, we demonstrate the versatility of our Hypergraph Co-Optimal Transport (HyperCOT) framework through various examples.
We consider optimization problems in which the goal is find a $k$-dimensional subspace of $\reals^n$, $k<<n$, which minimizes a convex and smooth loss. Such problemsgeneralize the fundamental task of principal component analysis (PCA) to include robust and sparse counterparts, and logistic PCA for binary data, among others. While this problem is not convex it admits natural algorithms with very efficient iterations and memory requirements, which is highly desired in high-dimensional regimes however, arguing about their fast convergence to a global optimal solution is difficult. On the other hand, there exists a simple convex relaxation for which convergence to the global optimum is straightforward, however corresponding algorithms are not efficient when the dimension is very large. In this work we present a natural deterministic sufficient condition so that the optimal solution to the convex relaxation is unique and is also the optimal solution to the original nonconvex problem. Mainly, we prove that under this condition, a natural highly-efficient nonconvex gradient method, which we refer to as \textit{gradient orthogonal iteration}, when initialized with a "warm-start", converges linearly for the nonconvex problem. We also establish similar results for the nonconvex projected gradient method, and the Frank-Wolfe method when applied to the convex relaxation. We conclude with empirical evidence on synthetic data which demonstrate the appeal of our approach.
Real-world optimization problems are generally not just black-box problems, but also involve mixed types of inputs in which discrete and continuous variables coexist. Such mixed-space optimization possesses the primary challenge of modeling complex interactions between the inputs. In this work, we propose a novel yet simple approach that entails exploiting the graph data structure to model the underlying relationship between variables, i.e., variables as nodes and interactions defined by edges. Then, a variational graph autoencoder is used to naturally take the interactions into account. We first provide empirical evidence of the existence of such graph structures and then suggest a joint framework of graph structure learning and latent space optimization to adaptively search for optimal graph connectivity. Experimental results demonstrate that our method shows remarkable performance, exceeding the existing approaches with significant computational efficiency for a number of synthetic and real-world tasks.
Existing works on motion deblurring either ignore the effects of depth-dependent blur or work with the assumption of a multi-layered scene wherein each layer is modeled in the form of fronto-parallel plane. In this work, we consider the case of 3D scenes with piecewise planar structure i.e., a scene that can be modeled as a combination of multiple planes with arbitrary orientations. We first propose an approach for estimation of normal of a planar scene from a single motion blurred observation. We then develop an algorithm for automatic recovery of number of planes, the parameters corresponding to each plane, and camera motion from a single motion blurred image of a multiplanar 3D scene. Finally, we propose a first-of-its-kind approach to recover the planar geometry and latent image of the scene by adopting an alternating minimization framework built on our findings. Experiments on synthetic and real data reveal that our proposed method achieves state-of-the-art results.
The aim of this work is to develop a fully-distributed algorithmic framework for training graph convolutional networks (GCNs). The proposed method is able to exploit the meaningful relational structure of the input data, which are collected by a set of agents that communicate over a sparse network topology. After formulating the centralized GCN training problem, we first show how to make inference in a distributed scenario where the underlying data graph is split among different agents. Then, we propose a distributed gradient descent procedure to solve the GCN training problem. The resulting model distributes computation along three lines: during inference, during back-propagation, and during optimization. Convergence to stationary solutions of the GCN training problem is also established under mild conditions. Finally, we propose an optimization criterion to design the communication topology between agents in order to match with the graph describing data relationships. A wide set of numerical results validate our proposal. To the best of our knowledge, this is the first work combining graph convolutional neural networks with distributed optimization.
The problem of Approximate Nearest Neighbor (ANN) search is fundamental in computer science and has benefited from significant progress in the past couple of decades. However, most work has been devoted to pointsets whereas complex shapes have not been sufficiently treated. Here, we focus on distance functions between discretized curves in Euclidean space: they appear in a wide range of applications, from road segments to time-series in general dimension. For $\ell_p$-products of Euclidean metrics, for any $p$, we design simple and efficient data structures for ANN, based on randomized projections, which are of independent interest. They serve to solve proximity problems under a notion of distance between discretized curves, which generalizes both discrete Fr\'echet and Dynamic Time Warping distances. These are the most popular and practical approaches to comparing such curves. We offer the first data structures and query algorithms for ANN with arbitrarily good approximation factor, at the expense of increasing space usage and preprocessing time over existing methods. Query time complexity is comparable or significantly improved by our algorithms, our algorithm is especially efficient when the length of the curves is bounded.
Graph Convolutional Networks (GCNs) have recently become the primary choice for learning from graph-structured data, superseding hash fingerprints in representing chemical compounds. However, GCNs lack the ability to take into account the ordering of node neighbors, even when there is a geometric interpretation of the graph vertices that provides an order based on their spatial positions. To remedy this issue, we propose Geometric Graph Convolutional Network (geo-GCN) which uses spatial features to efficiently learn from graphs that can be naturally located in space. Our contribution is threefold: we propose a GCN-inspired architecture which (i) leverages node positions, (ii) is a proper generalisation of both GCNs and Convolutional Neural Networks (CNNs), (iii) benefits from augmentation which further improves the performance and assures invariance with respect to the desired properties. Empirically, geo-GCN outperforms state-of-the-art graph-based methods on image classification and chemical tasks.
Lots of learning tasks require dealing with graph data which contains rich relation information among elements. Modeling physics system, learning molecular fingerprints, predicting protein interface, and classifying diseases require a model to learn from graph inputs. In other domains such as learning from non-structural data like texts and images, reasoning on extracted structures, like the dependency tree of sentences and the scene graph of images, is an important research topic which also needs graph reasoning models. Graph neural networks (GNNs) are connectionist models that capture the dependence of graphs via message passing between the nodes of graphs. Unlike standard neural networks, graph neural networks retain a state that can represent information from its neighborhood with arbitrary depth. Although the primitive GNNs have been found difficult to train for a fixed point, recent advances in network architectures, optimization techniques, and parallel computation have enabled successful learning with them. In recent years, systems based on variants of graph neural networks such as graph convolutional network (GCN), graph attention network (GAT), gated graph neural network (GGNN) have demonstrated ground-breaking performance on many tasks mentioned above. In this survey, we provide a detailed review over existing graph neural network models, systematically categorize the applications, and propose four open problems for future research.
In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
We introduce hyperbolic attention networks to endow neural networks with enough capacity to match the complexity of data with hierarchical and power-law structure. A few recent approaches have successfully demonstrated the benefits of imposing hyperbolic geometry on the parameters of shallow networks. We extend this line of work by imposing hyperbolic geometry on the activations of neural networks. This allows us to exploit hyperbolic geometry to reason about embeddings produced by deep networks. We achieve this by re-expressing the ubiquitous mechanism of soft attention in terms of operations defined for hyperboloid and Klein models. Our method shows improvements in terms of generalization on neural machine translation, learning on graphs and visual question answering tasks while keeping the neural representations compact.
Many resource allocation problems in the cloud can be described as a basic Virtual Network Embedding Problem (VNEP): finding mappings of request graphs (describing the workloads) onto a substrate graph (describing the physical infrastructure). In the offline setting, the two natural objectives are profit maximization, i.e., embedding a maximal number of request graphs subject to the resource constraints, and cost minimization, i.e., embedding all requests at minimal overall cost. The VNEP can be seen as a generalization of classic routing and call admission problems, in which requests are arbitrary graphs whose communication endpoints are not fixed. Due to its applications, the problem has been studied intensively in the networking community. However, the underlying algorithmic problem is hardly understood. This paper presents the first fixed-parameter tractable approximation algorithms for the VNEP. Our algorithms are based on randomized rounding. Due to the flexible mapping options and the arbitrary request graph topologies, we show that a novel linear program formulation is required. Only using this novel formulation the computation of convex combinations of valid mappings is enabled, as the formulation needs to account for the structure of the request graphs. Accordingly, to capture the structure of request graphs, we introduce the graph-theoretic notion of extraction orders and extraction width and show that our algorithms have exponential runtime in the request graphs' maximal width. Hence, for request graphs of fixed extraction width, we obtain the first polynomial-time approximations. Studying the new notion of extraction orders we show that (i) computing extraction orders of minimal width is NP-hard and (ii) that computing decomposable LP solutions is in general NP-hard, even when restricting request graphs to planar ones.