We study the problem of matching a string in a labeled graph. Previous research has shown that unless the Orthogonal Vectors Hypothesis (OVH) is false, one cannot solve this problem in strongly sub-quadratic time, nor index the graph in polynomial time to answer queries efficiently (Equi et al. ICALP 2019, SOFSEM 2021). These conditional lower bounds cover even deterministic graphs with a binary alphabet, but there naturally also exist graph classes that are easy to index: for example, Wheeler graphs (Gagie et al. Theor. Comp. Sci. 2017) cover graphs admitting a Burrows-Wheeler transform-based indexing scheme. However, it is NP-complete to recognize if a graph is a Wheeler graph (Gibney, Thankachan, ESA 2019). We propose an approach to alleviate the construction bottleneck of Wheeler graphs. Rather than starting from an arbitrary graph, we study graphs induced from multiple sequence alignments (MSAs). Elastic degenerate strings (Bernardini et al. SPIRE 2017, ICALP 2019) can be seen as such graphs, and we introduce here their generalization: elastic founder graphs. We first prove that even such induced graphs are hard to index under OVH. Then we introduce two subclasses, repeat-free and semi-repeat-free graphs, that are easy to index. We give a linear time algorithm to construct a repeat-free non-elastic founder graph from a gapless MSA, and (parameterized) near-linear time algorithms to construct semi-repeat-free (repeat-free, respectively) elastic founder graphs from general MSAs. Finally, we show that repeat-free elastic founder graphs admit a reduction to Wheeler graphs in polynomial time.
While operating communication networks adaptively may improve utilization and performance, frequent adjustments also introduce an algorithmic challenge: the re-optimization of traffic engineering solutions is time-consuming and may limit the granularity at which a network can be adjusted. This paper is motivated by the question of whether the reactivity of a network can be improved by re-optimizing solutions dynamically rather than from scratch, especially if inputs such as link weights do not change significantly. This paper explores to what extent dynamic algorithms can be used to speed up fundamental tasks in network operations. We specifically investigate optimizations related to traffic engineering (namely shortest paths and maximum flow computations), but also consider spanning tree and matching applications. While prior work on dynamic graph algorithms focuses on link insertions and deletions, we are interested in the practical problem of link weight changes. We revisit existing upper bounds in the weight-dynamic model, and present several novel lower bounds on the amortized runtime for recomputing solutions. In general, we find that the potential performance gains depend on the application, and there are also strict limitations on what can be achieved, even if link weights change only slightly.
We study the Maximum Independent Set (MIS) problem under the notion of stability introduced by Bilu and Linial (2010): a weighted instance of MIS is $\gamma$-stable if it has a unique optimal solution that remains the unique optimum under multiplicative perturbations of the weights by a factor of at most $\gamma\geq 1$. The goal then is to efficiently recover the unique optimal solution. In this work, we solve stable instances of MIS on several graph classes: we solve $\widetilde{O}(\Delta/\sqrt{\log \Delta})$-stable instances on graphs of maximum degree $\Delta$, $(k - 1)$-stable instances on $k$-colorable graphs and $(1 + \varepsilon)$-stable instances on planar graphs. For general graphs, we present a strong lower bound showing that there are no efficient algorithms for $O(n^{\frac{1}{2} - \varepsilon})$-stable instances of MIS, assuming the planted clique conjecture. We also give an algorithm for $(\varepsilon n)$-stable instances. As a by-product of our techniques, we give algorithms and lower bounds for stable instances of Node Multiway Cut. Furthermore, we prove a general result showing that the integrality gap of convex relaxations of several maximization problems reduces dramatically on stable instances. Moreover, we initiate the study of certified algorithms, a notion recently introduced by Makarychev and Makarychev (2018), which is a class of $\gamma$-approximation algorithms that satisfy one crucial property: the solution returned is optimal for a perturbation of the original instance. We obtain $\Delta$-certified algorithms for MIS on graphs of maximum degree $\Delta$, and $(1+\varepsilon)$-certified algorithms on planar graphs. Finally, we analyze the algorithm of Berman and Furer (1994) and prove that it is a $\left(\frac{\Delta + 1}{3} + \varepsilon\right)$-certified algorithm for MIS on graphs of maximum degree $\Delta$ where all weights are equal to 1.
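To make the stability definition concrete, the following is a minimal brute-force sketch, not one of the algorithms of the paper. It assumes strictly positive weights and relies on the standard reformulation that an instance with optimum $I^*$ is $\gamma$-stable exactly when $w(I^*\setminus S) > \gamma\, w(S\setminus I^*)$ for every independent set $S\neq I^*$. The function names `independent_sets` and `is_gamma_stable` are hypothetical, and the enumeration is exponential in the number of vertices.

```python
from itertools import combinations


def independent_sets(n, edges):
    """Yield every independent set of the graph on vertices 0..n-1 (exponential time)."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if all(frozenset(pair) not in edge_set for pair in combinations(subset, 2)):
                yield frozenset(subset)


def is_gamma_stable(weights, edges, gamma):
    """Brute-force stability check, assuming strictly positive weights.

    Uses the reformulation w(I* \\ S) > gamma * w(S \\ I*) for all independent S != I*.
    """
    def weight(vertex_set):
        return sum(weights[v] for v in vertex_set)

    candidates = list(independent_sets(len(weights), edges))
    best = max(candidates, key=weight)
    return all(
        weight(best - s) > gamma * weight(s - best)
        for s in candidates
        if s != best
    )
```

On the path 0-1-2 with weights 1, 3, 1, the optimum is the middle vertex, and the competing set {0, 2} has weight 2, so the instance is $\gamma$-stable only for $\gamma < 3/2$.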
We consider the product of determinantal point processes (DPPs), a point process whose probability mass is proportional to the product of principal minors of multiple matrices, as a natural, promising generalization of DPPs. We study the computational complexity of computing its normalizing constant, which is among the most essential probabilistic inference tasks. Our complexity-theoretic results (almost) rule out the existence of efficient algorithms for this task unless the input matrices are forced to have favorable structures. In particular, we prove the following: (1) Computing $\sum_S\det({\bf A}_{S,S})^p$ exactly for every (fixed) positive even integer $p$ is UP-hard and Mod$_3$P-hard, which gives a negative answer to an open question posed by Kulesza and Taskar. (2) $\sum_S\det({\bf A}_{S,S})\det({\bf B}_{S,S})\det({\bf C}_{S,S})$ is NP-hard to approximate within a factor of $2^{O(|I|^{1-\epsilon})}$ or $2^{O(n^{1/\epsilon})}$ for any $\epsilon>0$, where $|I|$ is the input size and $n$ is the order of the input matrix. This result is stronger than the #P-hardness for the case of two matrices derived by Gillenwater. (3) There exists a $k^{O(k)}n^{O(1)}$-time algorithm for computing $\sum_S\det({\bf A}_{S,S})\det({\bf B}_{S,S})$, where $k$ is the maximum rank of $\bf A$ and $\bf B$ or the treewidth of the graph formed by nonzero entries of $\bf A$ and $\bf B$. Such parameterized algorithms are said to be fixed-parameter tractable. These results can be extended to the fixed-size case. Further, we present two applications of fixed-parameter tractable algorithms given a matrix $\bf A$ of treewidth $w$: (4) We can compute a $2^{\frac{n}{2p-1}}$-approximation to $\sum_S\det({\bf A}_{S,S})^p$ for any fractional number $p>1$ in $w^{O(wp)}n^{O(1)}$ time. (5) We can find a $2^{\sqrt n}$-approximation to unconstrained MAP inference in $w^{O(w\sqrt n)}n^{O(1)}$ time.
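As an illustration of the quantity under study, the sketch below evaluates $\sum_S\prod_k\det(({\bf M}_k)_{S,S})$ by enumerating all $2^n$ subsets. This is only the naive baseline that the hardness results concern, not the parameterized algorithm of item (3). The names `det`, `principal_minor` and `normalizing_constant` are hypothetical, and exact rational arithmetic is an assumption made to avoid rounding issues.

```python
from fractions import Fraction
from itertools import combinations


def det(matrix):
    """Exact determinant by Gaussian elimination; the empty matrix has determinant 1."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def principal_minor(matrix, subset):
    """Determinant of the submatrix indexed by `subset` in both rows and columns."""
    return det([[matrix[i][j] for j in subset] for i in subset])


def normalizing_constant(matrices):
    """Sum over all subsets S of the product of principal minors (2^n terms)."""
    n = len(matrices[0])
    total = Fraction(0)
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            term = Fraction(1)
            for matrix in matrices:
                term *= principal_minor(matrix, subset)
            total += term
    return total
```

For a single matrix the result can be checked against the classical identity $\sum_S\det({\bf A}_{S,S})=\det({\bf I}+{\bf A})$. Passing the same matrix $p$ times, for example `normalizing_constant([A, A])`, gives $\sum_S\det({\bf A}_{S,S})^p$ for a positive integer $p$.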
We present a general technique, based on parametric search with some twist, for solving a variety of optimization problems on a set of points in the plane or in higher dimensions. These problems include (i) the reverse shortest path problem in unit-disk graphs, recently studied by Wang and Zhao, (ii) the same problem for weighted unit-disk graphs, with a decision procedure recently provided by Wang and Xue, (iii) extensions of these problems to three and higher dimensions, (iv) the discrete Fr\'echet distance with one-sided shortcuts in higher dimensions, extending the study by Ben Avraham et al., and (v) the maximum-height independent towers problem, in which we want to erect vertical towers of maximum height over a 1.5-dimensional terrain so that no pair of tower tips are mutually visible. We obtain significantly improved solutions for problems (i) and (ii), and new efficient solutions to problems (iii), (iv) and (v), which do not appear to have been studied earlier.
Motivated by recent increased interest in optimization algorithms for non-convex optimization in application to training deep neural networks and other optimization problems in data analysis, we give an overview of recent theoretical results on global performance guarantees of optimization algorithms for non-convex optimization. We start with classical arguments showing that general non-convex problems cannot be solved efficiently in a reasonable time. Then we give a list of problems that can be solved efficiently to find the global minimizer by exploiting the structure of the problem as much as possible. Another way to deal with non-convexity is to relax the goal from finding the global minimum to finding a stationary point or a local minimum. For this setting, we first present known results for the convergence rates of deterministic first-order methods, which are then followed by a general theoretical analysis of optimal stochastic and randomized gradient schemes, and an overview of the stochastic first-order methods. After that, we discuss quite general classes of non-convex problems, such as minimization of $\alpha$-weakly-quasi-convex functions and functions that satisfy the Polyak--Lojasiewicz condition, which still allow obtaining theoretical convergence guarantees of first-order methods. Then we consider higher-order and zeroth-order/derivative-free methods and their convergence rates for non-convex optimization problems.
Given an undirected graph $G=(V,E)$, a vertex $v\in V$ is edge-vertex (ev) dominated by an edge $e\in E$ if $v$ is either incident to $e$ or incident to an adjacent edge of $e$. A set $S^{ev}\subseteq E$ is an edge-vertex dominating set (referred to as ev-dominating set) of $G$ if every vertex of $G$ is ev-dominated by at least one edge of $S^{ev}$. The minimum cardinality of an ev-dominating set is the ev-domination number. The edge-vertex dominating set problem is to find an ev-dominating set of minimum cardinality. In this paper we prove that the ev-dominating set problem is {\tt NP-hard} on unit disk graphs. We also prove that this problem admits a polynomial-time approximation scheme on unit disk graphs. Finally, we give a simple 5-factor linear-time approximation algorithm.
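The definitions above can be checked directly on small graphs. The following is a minimal sketch under the assumption that the graph is given as a vertex list and a list of edge pairs. The helper names are hypothetical, and the exhaustive search is exponential, so it is neither the approximation scheme nor the 5-factor algorithm of the paper.

```python
from itertools import combinations


def ev_dominated_vertices(edges, e):
    """Vertices ev-dominated by edge e: its endpoints plus endpoints of adjacent edges."""
    a, b = e
    dominated = {a, b}
    for x, y in edges:
        if x in (a, b) or y in (a, b):  # (x, y) shares an endpoint with e
            dominated.update((x, y))
    return dominated


def is_ev_dominating_set(vertices, edges, chosen):
    """True if every vertex is ev-dominated by at least one edge in `chosen`."""
    covered = set()
    for e in chosen:
        covered |= ev_dominated_vertices(edges, e)
    return covered >= set(vertices)


def ev_domination_number(vertices, edges):
    """Smallest size of an ev-dominating set, found by exhaustive search.

    Returns None if no edge subset dominates every vertex (e.g. an isolated vertex).
    """
    for size in range(len(edges) + 1):
        for chosen in combinations(edges, size):
            if is_ev_dominating_set(vertices, edges, chosen):
                return size
    return None
```

On the path with four vertices the middle edge alone ev-dominates everything, while on the path with five vertices every single edge misses one end vertex, so two edges are needed.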
We consider Broyden's method and some accelerated schemes for nonlinear equations having a strongly regular singularity of first order with a one-dimensional nullspace. Our two main results are as follows. First, we show that the use of a preceding Newton-like step ensures convergence for starting points in a starlike domain with density 1. This extends the domain of convergence of these methods significantly. Second, we establish that the matrix updates of Broyden's method converge q-linearly with the same asymptotic factor as the iterates. This contributes to the long-standing question whether the Broyden matrices converge by showing that this is indeed the case for the setting at hand. Furthermore, we prove that the Broyden directions violate uniform linear independence, which implies that existing results for convergence of the Broyden matrices cannot be applied. Numerical experiments of high precision confirm the enlarged domain of convergence, the q-linear convergence of the matrix updates, and the lack of uniform linear independence. In addition, they suggest that these results can be extended to singularities of higher order and that Broyden's method can converge r-linearly without converging q-linearly. The underlying code is freely available.
Task and motion planning problems in robotics typically combine symbolic planning over discrete task variables with motion optimization over continuous state and action variables, resulting in trajectories that satisfy the logical constraints imposed on the task variables. Symbolic planning can scale exponentially with the number of task variables, so recent works such as PDDLStream have focused on optimistic planning with an incrementally growing set of objects and facts until a feasible trajectory is found. However, this set is exhaustively and uniformly expanded in a breadth-first manner, regardless of the geometric structure of the problem at hand, which makes long-horizon reasoning with large numbers of objects prohibitively time-consuming. To address this issue, we propose a geometrically informed symbolic planner that expands the set of objects and facts in a best-first manner, prioritized by a Graph Neural Network based score that is learned from prior search computations. We evaluate our approach on a diverse set of problems and demonstrate an improved ability to plan in large or difficult scenarios. We also apply our algorithm on a 7DOF robotic arm in several block-stacking manipulation tasks.
Let $G$ be a strongly connected directed graph and $u,v,w\in V(G)$ be three vertices. Then $w$ strongly resolves $u$ to $v$ if there is a shortest $u$-$w$-path containing $v$ or a shortest $w$-$v$-path containing $u$. A set $R\subseteq V(G)$ of vertices is a strong resolving set for a directed graph $G$ if for every pair of vertices $u,v\in V(G)$ there is at least one vertex in $R$ that strongly resolves $u$ to $v$ and at least one vertex in $R$ that strongly resolves $v$ to $u$. The distances of the vertices of $G$ to and from the vertices of a strong resolving set $R$ uniquely define the connectivity structure of the graph. The Strong Metric Dimension of a directed graph $G$ is the size of a smallest strong resolving set for $G$. The decision problem Strong Metric Dimension is the question whether $G$ has a strong resolving set of size at most $r$, for a given directed graph $G$ and a given number $r$. In this paper we study undirected and directed co-graphs and introduce linear time algorithms for Strong Metric Dimension. These algorithms can also compute strong resolving sets for co-graphs in linear time.
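The definition of strong resolving can be restated with distances: $w$ strongly resolves $u$ to $v$ if $d(u,w)=d(u,v)+d(v,w)$ or $d(w,v)=d(w,u)+d(u,v)$. The sketch below uses this restatement for a brute-force check on small unweighted graphs. It assumes a strongly connected graph given as an adjacency dictionary with every vertex as a key, the function names are hypothetical, and the search is exponential, unlike the linear time algorithms for co-graphs introduced in the paper.

```python
from collections import deque
from itertools import combinations


def all_distances(adj):
    """All-pairs shortest path lengths by BFS from every vertex (unit edge lengths)."""
    dist = {}
    for source in adj:
        d = {source: 0}
        queue = deque([source])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y not in d:
                    d[y] = d[x] + 1
                    queue.append(y)
        dist[source] = d
    return dist


def strongly_resolves(dist, w, u, v):
    """True if some shortest u-w path contains v or some shortest w-v path contains u."""
    return (dist[u][w] == dist[u][v] + dist[v][w]
            or dist[w][v] == dist[w][u] + dist[u][v])


def is_strong_resolving_set(adj, candidate):
    """Check every ordered pair (u, v) of distinct vertices against the candidate set."""
    dist = all_distances(adj)
    return all(
        any(strongly_resolves(dist, w, u, v) for w in candidate)
        for u in adj for v in adj if u != v
    )


def strong_metric_dimension(adj):
    """Size of a smallest strong resolving set, found by exhaustive search."""
    vertices = list(adj)
    for size in range(len(vertices) + 1):
        for candidate in combinations(vertices, size):
            if is_strong_resolving_set(adj, candidate):
                return size
```

Iterating over ordered pairs covers both directions required by the definition. An undirected graph can be passed by listing each edge in both directions.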
Knowledge Graph (KG) embedding is a fundamental problem in data mining research with many real-world applications. It aims to encode the entities and relations in the graph into a low-dimensional vector space, which can be used for subsequent algorithms. Negative sampling, which samples negative triplets from non-observed ones in the training data, is an important step in KG embedding. Recently, generative adversarial networks (GANs) have been introduced in negative sampling. By sampling negative triplets with large scores, these methods avoid the problem of vanishing gradient and thus obtain better performance. However, using a GAN makes the original model more complex and harder to train, since reinforcement learning must be used. In this paper, motivated by the observation that negative triplets with large scores are important but rare, we propose to directly keep track of them with a cache. However, how to sample from and update the cache are two important questions. We carefully design the solutions, which are not only efficient but also achieve a good balance between exploration and exploitation. In this way, our method acts as a "distilled" version of previous GAN-based methods, which does not waste training time on additional parameters to fit the full distribution of negative triplets. The extensive experiments show that our method can gain significant improvement in various KG embedding models, and outperform the state-of-the-art negative sampling methods based on GAN.