Graphs (networks) are an important tool to model data in different domains. Real-world graphs are usually directed, where the edges have a direction and they are not symmetric. Betweenness centrality is an important index widely used to analyze networks. In this paper, first given a directed network $G$ and a vertex $r \in V(G)$, we propose an exact algorithm to compute betweenness score of $r$. Our algorithm pre-computes a set $\mathcal{RV}(r)$, which is used to prune a huge amount of computations that do not contribute to the betweenness score of $r$. Time complexity of our algorithm depends on $|\mathcal{RV}(r)|$ and it is respectively $\Theta(|\mathcal{RV}(r)|\cdot|E(G)|)$ and $\Theta(|\mathcal{RV}(r)|\cdot|E(G)|+|\mathcal{RV}(r)|\cdot|V(G)|\log |V(G)|)$ for unweighted graphs and weighted graphs with positive weights. $|\mathcal{RV}(r)|$ is bounded from above by $|V(G)|-1$ and in most cases, it is a small constant. Then, for the cases where $\mathcal{RV}(r)$ is large, we present a simple randomized algorithm that samples from $\mathcal{RV}(r)$ and performs computations for only the sampled elements. We show that this algorithm provides an $(\epsilon,\delta)$-approximation to the betweenness score of $r$. Finally, we perform extensive experiments over several real-world datasets from different domains for several randomly chosen vertices as well as for the vertices with the highest betweenness scores. Our experiments reveal that for estimating betweenness score of a single vertex, our algorithm significantly outperforms the most efficient existing randomized algorithms, in terms of both running time and accuracy. Our experiments also reveal that our algorithm improves the existing algorithms when someone is interested in computing betweenness values of the vertices in a set whose cardinality is very small.
Given a graph, the shortest-path problem requires finding a sequence of edges with minimum cumulative length that connects a source vertex to a target vertex. We consider a generalization of this classical problem in which the position of each vertex in the graph is a continuous decision variable, constrained to lie in a corresponding convex set. The length of an edge is then defined as a convex function of the positions of the vertices it connects. Problems of this form arise naturally in motion planning of autonomous vehicles, robot navigation, and even optimal control of hybrid dynamical systems. The price for such a wide applicability is the complexity of this problem, which is easily seen to be NP-hard. Our main contribution is a strong mixed-integer convex formulation based on perspective functions. This formulation has a very tight convex relaxation and makes it possible to efficiently find globally-optimal paths in large graphs and in high-dimensional spaces.
Certain applications that analyze damping effects require the solution of quadratic eigenvalue problems (QEPs). We use refined isogeometric analysis (rIGA) to solve quadratic eigenproblems. rIGA discretization, while conserving desirable properties of maximum-continuity isogeometric analysis (IGA), reduces the interconnection between degrees of freedom by adding low-continuity basis functions. This connectivity reduction in rIGA's algebraic system results in faster matrix LU factorizations when using multifrontal direct solvers. We compare computational costs of rIGA versus those of IGA when employing Krylov eigensolvers to solve quadratic eigenproblems arising in 2D vector-valued multifield problems. For large problem sizes, the eigencomputation cost is governed by the cost of LU factorization, followed by costs of several matrix-vector and vector-vector multiplications, which correspond to Krylov projections. We minimize the computational cost by introducing C^0 and C^1 separators at specific element interfaces for our rIGA generalizations of the curl-conforming Nedelec and divergence-conforming Raviart-Thomas finite elements. Let p be the polynomial degree of basis functions; the LU factorization is up to O((p-1)^2) times faster when using rIGA compared to IGA in the asymptotic regime. Thus, rIGA theoretically improves the total eigencomputation cost by O((p-1)^2) for sufficiently large problem sizes. Yet, in practical cases of moderate-size eigenproblems, the improvement rate deteriorates as the number of computed eigenvalues increases because of multiple matrix-vector and vector-vector operations. Our numerical tests show that rIGA accelerates the solution of quadratic eigensystems by O(p-1) for moderately sized problems when we seek to compute a reasonable number of eigenvalues.
With growing deployment of Internet of Things (IoT) and machine learning (ML) applications, which need to leverage computation on edge and cloud resources, it is important to develop algorithms and tools to place these distributed computations to optimize their performance. We address the problem of optimally placing computations (described as directed acyclic graphs (DAGs)) on a set of machines to maximize the steady-state throughput for pipelined inputs. Traditionally, such optimization has focused on a different metric, minimizing single-shot makespan, and a well-known algorithm is the Heterogeneous Earliest Finish Time (HEFT) algorithm. Maximizing throughput however, is more suitable for many real-time, edge, cloud and IoT applications, we present a different scheduling algorithm, namely Throughput HEFT (TPHEFT). Further, we present two throughput-oriented enhancements which can be applied to any baseline schedule, that we refer to as "node splitting" (SPLIT) and "task duplication" (DUP). In order to implement and evaluate these algorithms, we built new subsystems and plugins for an open-source dispersed computing framework called Jupiter. Experiments with varying DAG structures indicate that: 1) TPHEFT can significantly improve throughput performance compared to HEFT (up to 2.3 times in our experiments), with greater gains when there is less degree of parallelism in the DAG, 2) Node splitting can potentially improve performance over a baseline schedule, with greater gains when there's an imbalanced allocation of computation or inter-task communication, and 3) Task duplication generally gives improvements only when running upon a baseline that places communication over slow links. To our knowledge, this is the first study to present a systematic experimental implementation and exploration of throughput-enhancing techniques for dispersed computing on real testbeds.
We present the quantum algorithm for the Longest Trail Problem. The problem is to search the longest edge-simple path for a graph with $n$ vertexes and $m$ edges. Here edge-simple means no edge occurs in the path twice, but vertexes can occur several times. The running time of our algorithm is $O^*(1.728^m)$.
Recent advances in Transformer models allow for unprecedented sequence lengths, due to linear space and time complexity. In the meantime, relative positional encoding (RPE) was proposed as beneficial for classical Transformers and consists in exploiting lags instead of absolute positions for inference. Still, RPE is not available for the recent linear-variants of the Transformer, because it requires the explicit computation of the attention matrix, which is precisely what is avoided by such methods. In this paper, we bridge this gap and present Stochastic Positional Encoding as a way to generate PE that can be used as a replacement to the classical additive (sinusoidal) PE and provably behaves like RPE. The main theoretical contribution is to make a connection between positional encoding and cross-covariance structures of correlated Gaussian processes. We illustrate the performance of our approach on the Long-Range Arena benchmark and on music generation.
Betweenness centrality (BC) is one of the most used centrality measures for network analysis, which seeks to describe the importance of nodes in a network in terms of the fraction of shortest paths that pass through them. It is key to many valuable applications, including community detection and network dismantling. Computing BC scores on large networks is computationally challenging due to high time complexity. Many approximation algorithms have been proposed to speed up the estimation of BC, which are mainly sampling-based. However, these methods are still prone to considerable execution time on large-scale networks, and their results are often exacerbated when small changes happen to the network structures. In this paper, we focus on identifying nodes with high BC in a graph, since many application scenarios are built upon retrieving nodes with top-k BC. Different from previous heuristic methods, we turn this task into a learning problem and design an encoder-decoder based framework to resolve the problem. More specifcally, the encoder leverages the network structure to encode each node into an embedding vector, which captures the important structural information of the node. The decoder transforms the embedding vector for each node into a scalar, which captures the relative rank of this node in terms of BC. We use the pairwise ranking loss to train the model to identify the orders of nodes regarding their BC. By training on small-scale networks, the learned model is capable of assigning relative BC scores to nodes for any unseen networks, and thus identifying the highly-ranked nodes. Comprehensive experiments on both synthetic and real-world networks demonstrate that, compared to representative baselines, our model drastically speeds up the prediction without noticeable sacrifce in accuracy, and outperforms the state-of-the-art by accuracy on several large real-world networks.
In this paper, from a theoretical perspective, we study how powerful graph neural networks (GNNs) can be for learning approximation algorithms for combinatorial problems. To this end, we first establish a new class of GNNs that can solve strictly a wider variety of problems than existing GNNs. Then, we bridge the gap between GNN theory and the theory of distributed local algorithms to theoretically demonstrate that the most powerful GNN can learn approximation algorithms for the minimum dominating set problem and the minimum vertex cover problem with some approximation ratios and that no GNN can perform better than with these ratios. This paper is the first to elucidate approximation ratios of GNNs for combinatorial problems. Furthermore, we prove that adding coloring or weak-coloring to each node feature improves these approximation ratios. This indicates that preprocessing and feature engineering theoretically strengthen model capabilities.
This paper addresses the problem of formally verifying desirable properties of neural networks, i.e., obtaining provable guarantees that neural networks satisfy specifications relating their inputs and outputs (robustness to bounded norm adversarial perturbations, for example). Most previous work on this topic was limited in its applicability by the size of the network, network architecture and the complexity of properties to be verified. In contrast, our framework applies to a general class of activation functions and specifications on neural network inputs and outputs. We formulate verification as an optimization problem (seeking to find the largest violation of the specification) and solve a Lagrangian relaxation of the optimization problem to obtain an upper bound on the worst case violation of the specification being verified. Our approach is anytime i.e. it can be stopped at any time and a valid bound on the maximum violation can be obtained. We develop specialized verification algorithms with provable tightness guarantees under special assumptions and demonstrate the practical significance of our general verification approach on a variety of verification tasks.
In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely: the function $F(\xb) \triangleq \sum_{i=1}^{m}f_i(\xb)$ is strongly convex and smooth, either strongly convex or smooth or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors) with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions to the proposed setup such as proximal friendly functions, time-varying graphs, improvement of the condition numbers.