Centrality measures for simple graphs are well-defined, and several main-memory algorithms exist for each. However, simple graphs are not adequate for modeling complex data sets with multiple entities and relationships. Multilayer networks (MLNs) have been shown to be better suited, but there are very few algorithms for centrality computation directly on MLNs. Instead, MLNs are typically converted (aggregated or collapsed) to simple graphs using Boolean AND or OR operators to compute centrality, which is not only inefficient but also incurs a loss of structure and semantics. In this paper, we propose algorithms that compute closeness centrality on an MLN directly using a novel decoupling-based approach. Individual results of layers (or simple graphs) of an MLN are used, and a composition function is developed to compute the centrality for the MLN. The challenge is to do this accurately and efficiently: since these algorithms do not have complete information about the MLN, computing a global measure such as closeness centrality is difficult. Hence, these algorithms rely on heuristics derived from intuition. The advantage is that this approach lends itself to parallelism and is more efficient than the traditional approach. We present two heuristics for composition and experimentally validate accuracy and efficiency on a large number of synthetic and real-world graphs with diverse characteristics.
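As a point of reference for the per-layer computation mentioned above, the following is a minimal sketch of closeness centrality on a single unweighted simple graph, computed with breadth-first search. It is not the decoupling-based composition itself, which the abstract does not specify. The function name `closeness_centrality`, the adjacency-dict input, and the normalization (reachable nodes divided by total distance) are assumptions made for illustration.

```python
from collections import deque


def closeness_centrality(adj, source):
    """Closeness of `source` in one unweighted layer (hypothetical helper).

    `adj` maps each node to a list of its neighbors. The score is the
    number of reachable nodes divided by the sum of shortest-path
    distances to them (an assumed normalization).
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:  # first visit gives the shortest distance
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    reachable = len(dist) - 1
    return reachable / total if total > 0 else 0.0


# A single layer shaped as the path a - b - c.
layer = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = {node: closeness_centrality(layer, node) for node in layer}
```

In this toy layer the middle node `b` gets the highest score, since it is at distance 1 from both other nodes.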
Although overparameterized models have shown their success on many machine learning tasks, their accuracy can drop when the testing distribution differs from the training one. This accuracy drop still limits the application of machine learning in the wild. At the same time, importance weighting, a traditional technique to handle distribution shifts, has been demonstrated to have little or even no effect on overparameterized models, both empirically and theoretically. In this paper, we propose importance tempering to improve the decision boundary and achieve consistently better results for overparameterized models. Theoretically, we justify that the selection of group temperature can be different under the label shift and spurious correlation settings. We also prove that properly selected temperatures can overcome minority collapse in imbalanced classification. Empirically, we achieve state-of-the-art results on worst-group classification tasks using importance tempering.
We give algorithms for approximating the partition function of the ferromagnetic Potts model on $d$-regular expanding graphs. We require much weaker expansion than in previous works; for example, the expansion exhibited by the hypercube suffices. The main improvements come from a significantly sharper analysis of standard polymer models, using extremal graph theory and applications of Karger's algorithm to counting cuts that may be of independent interest. It is #BIS-hard to approximate the partition function at low temperatures on bounded-degree graphs, so our algorithm can be seen as evidence that hard instances of #BIS are rare. We believe that these methods can shed more light on other important problems such as sub-exponential algorithms for approximate counting problems.
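To make the quantity being approximated concrete, here is a brute-force sketch that evaluates a Potts partition function exactly on a tiny graph. It enumerates all colorings, so it takes exponential time and is not the polymer-model algorithm described above. The convention used, $Z = \sum_{\sigma} e^{\beta\, m(\sigma)}$ with $m(\sigma)$ the number of monochromatic edges and $\beta > 0$ for the ferromagnetic case, is an assumption, as is the function name.

```python
import itertools
import math


def potts_partition_function(n, edges, q, beta):
    """Exact Potts partition function by enumeration (illustrative only).

    Assumed convention: each coloring sigma of the n vertices with q colors
    has weight exp(beta * m(sigma)), where m counts monochromatic edges.
    """
    total = 0.0
    for sigma in itertools.product(range(q), repeat=n):
        mono = sum(1 for u, v in edges if sigma[u] == sigma[v])
        total += math.exp(beta * mono)
    return total


# Single edge with q = 3: q monochromatic colorings and q * (q - 1) others.
z_edge = potts_partition_function(2, [(0, 1)], q=3, beta=1.0)
```

For a single edge the sum has the closed form $q e^{\beta} + q(q-1)$, which gives a quick sanity check on the enumeration.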
In this paper, we consider decentralized optimization problems where agents have individual cost functions to minimize subject to subspace constraints that require the minimizers across the network to lie in low-dimensional subspaces. This constrained formulation includes consensus or single-task optimization as special cases, and allows for more general task relatedness models such as multitask smoothness and coupled optimization. In order to cope with communication constraints, we propose and study an adaptive decentralized strategy where the agents employ differential randomized quantizers to compress their estimates before communicating with their neighbors. The analysis shows that, under some general conditions on the quantization noise, and for sufficiently small step-sizes $\mu$, the strategy is stable both in terms of mean-square error and average bit rate: by reducing $\mu$, it is possible to keep the estimation errors small (on the order of $\mu$) without the bit rate increasing indefinitely as $\mu\rightarrow 0$. Simulations illustrate the theoretical findings and the effectiveness of the proposed approach, revealing that decentralized learning is achievable at the expense of only a few bits.
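The idea of a differential randomized quantizer can be illustrated with a scalar sketch: the agent quantizes the difference between its current estimate and the value its neighbors can already reconstruct, using unbiased stochastic rounding. The abstract does not give the quantizer design, so the uniform grid, the rounding rule, and the names `randomized_quantize` and `DifferentialQuantizer` are assumptions for illustration only.

```python
import math
import random


def randomized_quantize(x, step, rng):
    """Unbiased stochastic rounding of x to a multiple of `step` (assumed rule)."""
    scaled = x / step
    lower = math.floor(scaled)
    prob_up = scaled - lower  # round up with probability equal to the fraction
    return step * (lower + (1 if rng.random() < prob_up else 0))


class DifferentialQuantizer:
    """Quantizes the change relative to the last reconstruction (hypothetical)."""

    def __init__(self, step, seed=0):
        self.step = step
        self.rng = random.Random(seed)
        self.reconstruction = 0.0  # value a receiver can rebuild from the deltas

    def encode(self, estimate):
        delta = randomized_quantize(estimate - self.reconstruction, self.step, self.rng)
        self.reconstruction += delta  # a receiver applies the same update
        return delta


quantizer = DifferentialQuantizer(step=0.1, seed=0)
deltas = [quantizer.encode(est) for est in [0.93, 0.95, 0.96]]
```

Once the estimate changes slowly, the transmitted differences stay small, which is the intuition behind needing only a few bits per iteration.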
Safe navigation is a fundamental challenge in multi-robot systems due to the uncertainty surrounding the future trajectory of the robots that act as obstacles for each other. In this work, we propose a principled data-driven approach where each robot repeatedly solves a finite horizon optimization problem subject to collision avoidance constraints, with the latter being formulated as distributionally robust conditional value-at-risk (CVaR) of the distance between the agent and a polyhedral obstacle geometry. Specifically, the CVaR constraints are required to hold for all distributions that are close to the empirical distribution constructed from observed samples of prediction error collected during execution. The generality of the approach allows us to robustify against prediction errors that arise under commonly imposed assumptions in both distributed and decentralized settings. We derive tractable finite-dimensional approximations of this class of constraints by leveraging convex and minmax duality results for Wasserstein distributionally robust optimization problems. The effectiveness of the proposed approach is illustrated in a multi-drone navigation setting implemented in the Gazebo platform.
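For readers unfamiliar with CVaR, the sketch below evaluates it for an empirical distribution of scalar loss samples using the Rockafellar-Uryasev form $\mathrm{CVaR}_\alpha(X) = \min_t \{ t + \mathbb{E}[(X - t)^+] / (1 - \alpha) \}$. It covers only the nominal empirical case, not the Wasserstein distributionally robust constraint or its tractable reformulation described above. The function name, the loss-oriented sign convention, and the sample values are assumptions.

```python
def empirical_cvar(losses, alpha):
    """CVaR at level alpha of equally weighted loss samples (illustrative).

    For an empirical distribution the objective is piecewise linear and
    convex in t, so the minimum is attained at one of the sample values.
    """
    n = len(losses)
    best = float("inf")
    for t in losses:
        excess = sum(max(x - t, 0.0) for x in losses)
        best = min(best, t + excess / ((1.0 - alpha) * n))
    return best


# Hypothetical loss samples, e.g. negated safety margins.
samples = [1.0, 2.0, 3.0, 4.0]
cvar_half = empirical_cvar(samples, alpha=0.5)
```

With four equally weighted samples and `alpha=0.5`, the result equals the mean of the two largest losses.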
The prevalence of attention mechanisms has raised concerns about the interpretability of attention distributions. Although attention provides insights into how a model is operating, using it as the explanation of model predictions is still highly dubious. The community is still seeking more interpretable strategies for better identifying local active regions that contribute the most to the final decision. To improve the interpretability of existing attention models, we propose a novel Bilinear Representative Non-Parametric Attention (BR-NPA) strategy that captures the task-relevant human-interpretable information. The target model is first distilled to have higher-resolution intermediate feature maps. From these maps, representative features are then grouped based on local pairwise feature similarity to produce finer-grained, more precise attention maps highlighting task-relevant parts of the input. The obtained attention maps are ranked according to the activity level of the compound feature, which provides information regarding the importance level of the highlighted regions. The proposed model can be easily adapted to a wide variety of modern deep models where classification is involved. Extensive quantitative and qualitative experiments showcase more comprehensive and accurate visual explanations compared to state-of-the-art attention models and visualization methods across multiple tasks including fine-grained image classification, few-shot classification, and person re-identification, without compromising the classification accuracy. The proposed visualization model sheds imperative light on how neural networks `pay their attention' differently in different tasks.
Knowledge graph embedding, which aims to represent entities and relations as low dimensional vectors (or matrices, tensors, etc.), has been shown to be a powerful technique for predicting missing links in knowledge graphs. Existing knowledge graph embedding models mainly focus on modeling relation patterns such as symmetry/antisymmetry, inversion, and composition. However, many existing approaches fail to model semantic hierarchies, which are common in real-world applications. To address this challenge, we propose a novel knowledge graph embedding model---namely, Hierarchy-Aware Knowledge Graph Embedding (HAKE)---which maps entities into the polar coordinate system. HAKE is inspired by the fact that concentric circles in the polar coordinate system can naturally reflect the hierarchy. Specifically, the radial coordinate aims to model entities at different levels of the hierarchy, and entities with smaller radii are expected to be at higher levels; the angular coordinate aims to distinguish entities at the same level of the hierarchy, and these entities are expected to have roughly the same radii but different angles. Experiments demonstrate that HAKE can effectively model the semantic hierarchies in knowledge graphs, and significantly outperforms existing state-of-the-art methods on benchmark datasets for the link prediction task.
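The split into a radial part and an angular part can be sketched as a distance with two terms. The abstract does not state the scoring function, so the specific forms below (relation-scaled modulus difference, sine of half the phase difference, and the weight `phase_weight`) are assumptions chosen to illustrate the idea, and `polar_distance` is a hypothetical name.

```python
import math


def polar_distance(head, rel, tail, phase_weight=1.0):
    """Distance between (head, rel) and tail in assumed polar coordinates.

    Each argument is a pair (moduli, phases) of equal-length lists.
    Smaller values mean the triple is considered more plausible.
    """
    h_m, h_p = head
    r_m, r_p = rel
    t_m, t_p = tail
    # Radial part: the relation rescales the head's radius toward the tail's level.
    modulus_part = math.sqrt(
        sum((hm * rm - tm) ** 2 for hm, rm, tm in zip(h_m, r_m, t_m))
    )
    # Angular part: periodic in the phase, so angles 2*pi apart coincide.
    phase_part = sum(
        abs(math.sin((hp + rp - tp) / 2.0)) for hp, rp, tp in zip(h_p, r_p, t_p)
    )
    return modulus_part + phase_weight * phase_part


head = ([1.0], [0.0])
relation = ([2.0], [math.pi])
tail_match = ([2.0], [math.pi])      # same radius and angle as head composed with relation
tail_same_level = ([2.0], [0.0])     # same radius (level), different angle
d_match = polar_distance(head, relation, tail_match)
d_same_level = polar_distance(head, relation, tail_same_level)
```

Here `tail_same_level` sits on the same circle as `tail_match`, so only the angular term separates the two candidates.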
We study the problem of embedding-based entity alignment between knowledge graphs (KGs). Previous works mainly focus on the relational structure of entities. Some further incorporate another type of feature, such as attributes, for refinement. However, a vast number of entity features are still unexplored or not treated equally together, which impairs the accuracy and robustness of embedding-based entity alignment. In this paper, we propose a novel framework that unifies multiple views of entities to learn embeddings for entity alignment. Specifically, we embed entities based on the views of entity names, relations and attributes, with several combination strategies. Furthermore, we design some cross-KG inference methods to enhance the alignment between two KGs. Our experiments on real-world datasets show that the proposed framework significantly outperforms the state-of-the-art embedding-based entity alignment methods. The selected views, cross-KG inference and combination strategies all contribute to the performance improvement.
The recent proliferation of knowledge graphs (KGs) coupled with incomplete or partial information, in the form of missing relations (links) between entities, has fueled a lot of research on knowledge base completion (also known as relation prediction). Several recent works suggest that convolutional neural network (CNN) based models generate richer and more expressive feature embeddings and hence also perform well on relation prediction. However, we observe that these KG embeddings treat triples independently and thus fail to cover the complex and hidden information that is inherently implicit in the local neighborhood surrounding a triple. To this end, our paper proposes a novel attention-based feature embedding that captures both entity and relation features in any given entity's neighborhood. Additionally, we encapsulate relation clusters and multihop relations in our model. Our empirical study offers insights into the efficacy of our attention-based model, and we show marked performance gains in comparison to state-of-the-art methods on all datasets.
How can we estimate the importance of nodes in a knowledge graph (KG)? A KG is a multi-relational graph that has proven valuable for many tasks including question answering and semantic search. In this paper, we present GENI, a method for tackling the problem of estimating node importance in KGs, which enables several downstream applications such as item recommendation and resource allocation. While a number of approaches have been developed to address this problem for general graphs, they do not fully utilize information available in KGs, or lack the flexibility needed to model complex relationships between entities and their importance. To address these limitations, we explore supervised machine learning algorithms. In particular, building upon recent advances in graph neural networks (GNNs), we develop GENI, a GNN-based method designed to deal with the distinctive challenges involved in predicting node importance in KGs. Our method performs an aggregation of importance scores instead of aggregating node embeddings, via a predicate-aware attention mechanism and flexible centrality adjustment. In our evaluation of GENI and existing methods on predicting node importance in real-world KGs with different characteristics, GENI achieves 5-17% higher NDCG@100 than the state of the art.
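The NDCG@100 figure reported above is a ranking metric; a minimal sketch of NDCG@k for node importance follows. The abstract does not say which NDCG variant is used, so the linear gain, the base-2 logarithmic discount, and the name `ndcg_at_k` are assumptions, and the scores are made-up illustrative values.

```python
import math


def ndcg_at_k(true_importance, predicted_score, k):
    """NDCG@k with linear gain and log2 discount (assumed variant).

    Nodes are ranked by predicted score; the gain of each ranked node is
    its ground-truth importance.
    """
    order = sorted(
        range(len(predicted_score)), key=lambda i: predicted_score[i], reverse=True
    )[:k]
    dcg = sum(true_importance[i] / math.log2(rank + 2) for rank, i in enumerate(order))
    ideal = sorted(true_importance, reverse=True)[:k]
    idcg = sum(gain / math.log2(rank + 2) for rank, gain in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


truth = [3.0, 1.0, 2.0]        # hypothetical ground-truth importance per node
good_model = [0.9, 0.1, 0.5]   # ranks nodes in the ideal order
bad_model = [0.1, 0.9, 0.5]    # ranks the least important node first
ndcg_good = ndcg_at_k(truth, good_model, k=3)
ndcg_bad = ndcg_at_k(truth, bad_model, k=3)
```

A model whose ranking matches the ground-truth order scores 1, and any misordering within the top k lowers the value.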
In this paper, we propose a novel multi-task learning architecture, which incorporates recent advances in attention mechanisms. Our approach, the Multi-Task Attention Network (MTAN), consists of a single shared network containing a global feature pool, together with task-specific soft-attention modules, which are trainable in an end-to-end manner. These attention modules allow for learning of task-specific features from the global pool, whilst simultaneously allowing for features to be shared across different tasks. The architecture can be built upon any feed-forward neural network, is simple to implement, and is parameter efficient. Experiments on the CityScapes dataset show that our method outperforms several baselines in both single-task and multi-task learning, and is also more robust to the various weighting schemes in the multi-task loss function. We further explore the effectiveness of our method through experiments over a range of task complexities, and show how our method scales well with task complexity compared to baselines.