PageRank is a popular centrality metric that assigns importance to the vertices of a graph based on their neighbors and the neighbors' scores. Efficient parallel algorithms for updating PageRank on dynamic graphs are crucial for various applications, especially as dataset sizes have reached substantial scales. This technical report presents our Dynamic Frontier approach. Given a batch update of edge deletions and insertions, it progressively identifies, with minimal overhead, the affected vertices that are likely to change their ranks. On a server equipped with a 64-core AMD EPYC-7742 processor, our Dynamic Frontier PageRank outperforms Static, Naive-dynamic, and Dynamic Traversal PageRank by 7.8x, 2.9x, and 3.9x respectively, on uniformly random batch updates of size 10^-7 |E| to 10^-3 |E|. In addition, our approach improves performance at an average rate of 1.8x for every doubling of threads.
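The frontier idea in the abstract above can be illustrated with a minimal sequential sketch. It is not the report's parallel implementation, and several details are assumptions made here for illustration: the initial frontier (the target of each changed edge plus the current out-neighbors of its source), the expansion rule (a vertex whose rank moves by more than a hypothetical `frontier_tol` marks its out-neighbors as affected), the function names, and the requirement that every vertex has at least one out-edge.

```python
def _sweep(out_edges, ranks, affected, damping, tol, frontier_tol, max_iter):
    """Jacobi-style PageRank sweeps restricted to the `affected` vertex set."""
    n = len(out_edges)
    # Build in-neighbor lists once from the out-neighbor lists.
    in_edges = {v: [] for v in out_edges}
    for u, nbrs in out_edges.items():
        for v in nbrs:
            in_edges[v].append(u)
    ranks = dict(ranks)
    affected = set(affected)
    for _ in range(max_iter):
        new_ranks = dict(ranks)
        max_delta = 0.0
        newly_affected = set()
        for v in affected:
            contrib = sum(ranks[u] / len(out_edges[u]) for u in in_edges[v])
            r = (1.0 - damping) / n + damping * contrib
            delta = abs(r - ranks[v])
            new_ranks[v] = r
            max_delta = max(max_delta, delta)
            # Assumed expansion rule: a noticeable change spreads the frontier.
            if frontier_tol is not None and delta > frontier_tol:
                newly_affected.update(out_edges[v])
        ranks = new_ranks
        affected |= newly_affected
        if max_delta <= tol:
            break
    return ranks


def pagerank_static(out_edges, damping=0.85, tol=1e-12, max_iter=1000):
    """Recompute ranks from scratch: every vertex is treated as affected."""
    n = len(out_edges)
    start = {v: 1.0 / n for v in out_edges}
    return _sweep(out_edges, start, out_edges.keys(), damping, tol, None, max_iter)


def pagerank_dynamic_frontier(out_edges, old_ranks, changed_edges, damping=0.85,
                              tol=1e-10, frontier_tol=1e-6, max_iter=500):
    """Update ranks after a batch of edge insertions/deletions.

    `out_edges` is the graph AFTER the update; `changed_edges` lists the
    inserted and deleted (u, v) pairs; `old_ranks` are the previous ranks.
    """
    affected = set()
    for u, v in changed_edges:
        affected.add(v)                 # v gained or lost an in-edge
        affected.update(out_edges[u])   # u's out-degree changed
    return _sweep(out_edges, old_ranks, affected, damping, tol, frontier_tol, max_iter)
```

A typical use would compute `pagerank_static` once on the original graph, then call `pagerank_dynamic_frontier` with the updated adjacency lists and the batch of changed edges. A larger `frontier_tol` keeps the frontier smaller at the cost of accuracy.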
Finding a Hamiltonian cycle in a given graph is computationally challenging, and in general remains so even when one is further given one Hamiltonian cycle in the graph and asked to find another. In fact, no significantly faster algorithms are known for finding another Hamiltonian cycle than for finding a first one even in the setting where another Hamiltonian cycle is structurally guaranteed to exist, such as for odd-degree graphs. We identify a graph class -- the bipartite Pfaffian graphs of minimum degree three -- where it is NP-complete to decide whether a given graph in the class is Hamiltonian, but when presented with a Hamiltonian cycle as part of the input, another Hamiltonian cycle can be found efficiently. We prove that Thomason's lollipop method~[Ann.~Discrete Math.,~1978], a well-known algorithm for finding another Hamiltonian cycle, runs in a linear number of steps in cubic bipartite Pfaffian graphs. This was conjectured for cubic bipartite planar graphs by Haddadan [MSc~thesis,~Waterloo,~2015]; in contrast, examples are known of both cubic bipartite graphs and cubic planar graphs where the lollipop method takes exponential time. Beyond the lollipop method, we address a slightly more general graph class and present two algorithms, one running in linear time and one operating in logarithmic space, that take as input (i) a bipartite Pfaffian graph $G$ of minimum degree three, (ii) a Hamiltonian cycle $H$ in $G$, and (iii) an edge $e$ in $H$, and output at least three other Hamiltonian cycles through the edge $e$ in $G$. We also present further improved algorithms for finding optimal traveling salesperson tours and counting Hamiltonian cycles in bipartite planar graphs with running times that are not known to hold in general planar graphs.
Scene text recognition is a rapidly developing field that faces numerous challenges due to the complexity and diversity of scene text, including complex backgrounds, diverse fonts, flexible arrangements, and accidental occlusions. In this paper, we propose a novel approach called Class-Aware Mask-guided feature refinement (CAM) to address these challenges. Our approach introduces canonical class-aware glyph masks generated from a standard font to effectively suppress background and text style noise, thereby enhancing feature discrimination. Additionally, we design a feature alignment and fusion module that incorporates the canonical mask guidance to further refine features for text recognition. By enhancing the alignment between the canonical mask feature and the text feature, the module ensures more effective fusion, ultimately leading to improved recognition performance. We first evaluate CAM on six standard text recognition benchmarks to demonstrate its effectiveness. Furthermore, CAM outperforms the state-of-the-art method by an average of 4.1% across six more challenging datasets, despite using a smaller model. Our study highlights the importance of incorporating canonical mask guidance and aligned feature refinement techniques for robust scene text recognition. The code is available at https://github.com/MelosY/CAM.
Error-correcting codes over the real field are studied which can locate outlying computational errors when performing approximate computing of real vector--matrix multiplication on resistive crossbars. Prior work has concentrated on locating a single outlying error and, in this work, several classes of codes are presented which can handle multiple errors. It is first shown that one of the known constructions, which is based on spherical codes, can in fact handle multiple outlying errors. A second family of codes is then presented with $\{0,1\}$~parity-check matrices which are sparse and disjunct; such matrices have been used in other applications as well, especially in combinatorial group testing. In addition, a certain class of the codes that are obtained through this construction is shown to be efficiently decodable. As part of the study of sparse disjunct matrices, this work also contains improved lower and upper bounds on the maximum Hamming weight of the rows in such matrices.
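To make the notion of a disjunct matrix in the abstract above concrete, here is a small brute-force sketch. It assumes the standard group-testing definition (a binary matrix is d-disjunct if no column's support is contained in the union of the supports of any d other columns); whether the paper uses exactly this convention is an assumption, and the function names are hypothetical. The check is exponential in d and only meant for tiny examples.

```python
from itertools import combinations


def is_disjunct(matrix, d):
    """Return True if the 0/1 matrix (list of rows) is d-disjunct.

    Assumed definition: for every column j and every set S of d other
    columns, the support of column j is NOT covered by the union of the
    supports of the columns in S.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    supports = [
        frozenset(i for i in range(n_rows) if matrix[i][j])
        for j in range(n_cols)
    ]
    for j in range(n_cols):
        others = [k for k in range(n_cols) if k != j]
        for subset in combinations(others, d):
            union = set().union(*(supports[k] for k in subset))
            if supports[j] <= union:
                return False
    return True


def max_row_weight(matrix):
    """Maximum Hamming weight over the rows, the quantity bounded in the abstract."""
    return max(sum(1 for entry in row if entry) for row in matrix)
```

For instance, the 4-row matrix whose six columns are all weight-2 binary vectors passes `is_disjunct(M, 1)` but fails `is_disjunct(M, 2)`, since any column is covered by two suitably chosen others.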
Due to its optimal complexity, the multigrid (MG) method is one of the most popular approaches for solving large-scale linear systems arising from the discretization of partial differential equations. However, the parallel implementation of standard MG methods, which are inherently multiplicative, suffers from increasing communication complexity. In such cases, the additive variants of MG methods provide a good alternative due to their inherently parallel nature, although they exhibit slower convergence. This work combines the additive multigrid method with the multipreconditioned conjugate gradient (MPCG) method. In the proposed approach, the MPCG method employs the corrections from the different levels of the MG hierarchy as separate preconditioned search directions. It then updates the current iterate using a linear combination of these search directions, where the optimal coefficients of the combination are computed by exploiting the energy norm minimization of the CG method. The idea behind our approach is to combine the $A$-conjugacy of the search directions of the MPCG method and the quasi $H_1$-orthogonality of the corrections from the MG hierarchy. In the numerical section, we study the performance of the proposed method compared to the standard additive and multiplicative MG methods used as preconditioners for the CG method.
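The combination described in the abstract above can be sketched on a toy problem. The following is an illustrative sketch only, not the authors' method: it assumes a 1D Poisson matrix, a two-level hierarchy (a Jacobi correction on the fine level and a Galerkin coarse-grid correction with linear interpolation), and a truncated variant of MPCG that A-orthogonalizes the new directions against the previous block only. All function names are hypothetical.

```python
import numpy as np


def poisson_1d(n):
    """Unscaled 1D Poisson matrix tridiag(-1, 2, -1) of size n."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)


def linear_interpolation(n_coarse):
    """Linear interpolation from n_coarse points to 2 * n_coarse + 1 points."""
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        i = 2 * j + 1  # fine index of coarse point j
        P[i, j] = 1.0
        P[i - 1, j] = 0.5
        P[i + 1, j] = 0.5
    return P


def two_level_corrections(A, P):
    """Return one correction operator per level (fine Jacobi, coarse solve)."""
    diag = np.diag(A).copy()
    A_coarse = P.T @ A @ P  # Galerkin coarse operator
    return [
        lambda r: r / diag,
        lambda r: P @ np.linalg.solve(A_coarse, P.T @ r),
    ]


def mpcg(A, b, level_corrections, tol=1e-10, max_iter=200):
    """Truncated multipreconditioned CG: one search direction per level."""
    x = np.zeros_like(b)
    r = b - A @ x
    P_prev = AP_prev = None
    for it in range(max_iter):
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, it
        # Each level contributes its own preconditioned direction (a column).
        Z = np.column_stack([apply(r) for apply in level_corrections])
        P = Z
        if P_prev is not None:
            # Make the new block A-conjugate to the previous block.
            gram_prev = np.linalg.pinv(P_prev.T @ AP_prev)
            P = Z - P_prev @ (gram_prev @ (AP_prev.T @ Z))
        AP = A @ P
        # Coefficients minimizing the energy norm of the error over span(P).
        alpha = np.linalg.pinv(P.T @ AP) @ (P.T @ r)
        x = x + P @ alpha
        r = r - AP @ alpha
        P_prev, AP_prev = P, AP
    return x, max_iter
```

With `A = poisson_1d(31)` and `P = linear_interpolation(15)`, calling `mpcg(A, b, two_level_corrections(A, P))` returns the approximate solution and the iteration count. The pseudo-inverse is used so that a vanishing level correction does not make the small coefficient system singular.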
We propose a new joint mean and correlation regression model for correlated multivariate discrete responses that simultaneously regresses the mean of each response against a set of covariates, and the correlations between responses against a set of similarity/distance measures. A set of joint estimating equations is formulated to construct an estimator of both the mean regression coefficients and the correlation regression parameters. Under a general setting where the number of responses can tend to infinity, the joint estimator is demonstrated to be consistent and asymptotically normally distributed, with differing rates of convergence due to the mean regression coefficients being heterogeneous across responses. An iterative estimation procedure is developed to obtain parameter estimates in the required, constrained parameter space. We apply the proposed model to a multivariate abundance dataset comprising overdispersed counts of 38 Carabidae ground beetle species sampled throughout Scotland, along with information about the environmental conditions of each site and the traits of each species. Results show in particular that the relationships between the mean abundances of various beetle species and environmental covariates are different and that beetle total length has a statistically important effect in driving the correlations between the species. Simulations demonstrate the strong finite sample performance of the proposed estimator in terms of point estimation and inference.
Accurate frictional contact is critical in simulating the assembly of real-world rod-like structures, such as knots, hair, and flagella. Due to the high geometric nonlinearity and elasticity of rods, rod-on-rod contact remains a challenging problem tackled by researchers in both computational mechanics and computer graphics. Typically, frictional contact is treated as a set of constraints on the equations of motion of a system. Such constraints are often computed independently at every time step in a dynamic simulation, thus slowing down the simulation and possibly introducing numerical convergence issues. This paper proposes a fully implicit penalty-based frictional contact method, Implicit Contact Model (IMC), that efficiently and robustly captures accurate frictional contact responses. We showcase our algorithm's performance in achieving visually realistic results for the challenging and novel contact scenario of flagella bundling in a fluid medium, a significant phenomenon in biology that motivates novel engineering applications in soft robotics. In addition, we offer a side-by-side comparison with Incremental Potential Contact (IPC), a state-of-the-art contact handling algorithm. We show that IMC possesses comparable performance to IPC while converging at a faster rate.
Graphs are important data representations for describing objects and their relationships, which appear in a wide diversity of real-world scenarios. As one of the critical problems in this area, graph generation considers learning the distributions of given graphs and generating more novel graphs. Generative models for graphs have a rich history owing to their wide range of applications; traditionally, however, they are hand-crafted and capable of modeling only a few statistical properties of graphs. Recent advances in deep generative models for graph generation are an important step towards improving the fidelity of generated graphs and pave the way for new kinds of applications. This article provides an extensive overview of the literature in the field of deep generative models for graph generation. Firstly, the formal definition of deep generative models for graph generation and the preliminary knowledge are provided. Secondly, taxonomies of deep generative models for unconditional and conditional graph generation are proposed, and the existing works in each are compared and analyzed. After that, an overview of the evaluation metrics in this specific domain is provided. Finally, the applications that deep graph generation enables are summarized and five promising future research directions are highlighted.
Graph Neural Networks (GNNs) have recently become increasingly popular due to their ability to learn complex systems of relations or interactions arising in a broad spectrum of problems ranging from biology and particle physics to social networks and recommendation systems. Despite the plethora of different models for deep learning on graphs, few approaches have been proposed thus far for dealing with graphs that present some sort of dynamic nature (e.g. evolving features or connectivity over time). In this paper, we present Temporal Graph Networks (TGNs), a generic, efficient framework for deep learning on dynamic graphs represented as sequences of timed events. Thanks to a novel combination of memory modules and graph-based operators, TGNs are able to significantly outperform previous approaches while at the same time being more computationally efficient. We furthermore show that several previous models for learning on dynamic graphs can be cast as specific instances of our framework. We perform a detailed ablation study of different components of our framework and devise the best configuration that achieves state-of-the-art performance on several transductive and inductive prediction tasks for dynamic graphs.
Named entity recognition (NER) is the task of identifying text spans that mention named entities and classifying them into predefined categories such as person, location, organization, etc. NER serves as the basis for a variety of natural language applications such as question answering, text summarization, and machine translation. Although early NER systems are successful in producing decent recognition accuracy, they often require much human effort in carefully designing rules or features. In recent years, deep learning, empowered by continuous real-valued vector representations and semantic composition through nonlinear processing, has been employed in NER systems, yielding state-of-the-art performance. In this paper, we provide a comprehensive review of existing deep learning techniques for NER. We first introduce NER resources, including tagged NER corpora and off-the-shelf NER tools. Then, we systematically categorize existing works based on a taxonomy along three axes: distributed representations for input, context encoder, and tag decoder. Next, we survey the most representative methods for recently applied deep learning techniques in new NER problem settings and applications. Finally, we present readers with the challenges faced by NER systems and outline future directions in this area.
Incompleteness is a common problem for existing knowledge graphs (KGs), and KG completion, which aims to predict links between entities, is challenging. Most existing KG completion methods only consider the direct relation between nodes and ignore the relation paths, which contain useful information for link prediction. Recently, a few methods have taken relation paths into consideration but pay less attention to the order of relations in paths, which is important for reasoning. In addition, these path-based models always ignore nonlinear contributions of path features for link prediction. To solve these problems, we propose a novel KG completion method named OPTransE. Instead of embedding both entities of a relation into the same latent space as in previous methods, we project the head entity and the tail entity of each relation into different spaces to guarantee the order of relations in the path. Meanwhile, we adopt a pooling strategy to extract nonlinear and complex features of different paths to further improve the performance of link prediction. Experimental results on two benchmark datasets show that the proposed model OPTransE performs better than state-of-the-art methods.