Given a boolean predicate $\Pi$ on labeled networks (e.g., proper coloring, leader election, etc.), a self-stabilizing algorithm for $\Pi$ is a distributed algorithm that can start from any initial configuration of the network (i.e., every node has an arbitrary value assigned to each of its variables), and eventually converge to a configuration satisfying $\Pi$. It is known that leader election does not have a deterministic self-stabilizing algorithm using a constant-size register at each node, i.e., for some networks, some of their nodes must have registers whose sizes grow with the size $n$ of the networks. On the other hand, it is also known that leader election can be solved by a deterministic self-stabilizing algorithm using registers of $O(\log \log n)$ bits per node in any $n$-node bounded-degree network. We show that this latter space complexity is optimal. Specifically, we prove that every deterministic self-stabilizing algorithm solving leader election must use registers of $\Omega(\log \log n)$ bits per node in some $n$-node networks. In addition, we show that our lower bounds go beyond leader election, and apply to all problems that cannot be solved by anonymous algorithms.
We are interested in the optimization of convex domains under a PDE constraint. Due to the difficulties of approximating convex domains in $\mathbb{R}^3$, the restriction to rotationally symmetric domains is used to reduce shape optimization problems to a two-dimensional setting. For the optimization of an eigenvalue arising in a problem of optimal insulation, the existence of an optimal domain is proven. An algorithm is proposed that can be applied to general shape optimization problems under the geometric constraints of convexity and rotational symmetry. The approximated optimal domains for the eigenvalue problem in optimal insulation are discussed.
A $(1+\epsilon)$-approximate distance oracle of an edge-weighted graph is a data structure that returns an approximate shortest path distance between any two query vertices up to a $(1+\epsilon)$ factor. Thorup (FOCS 2001, JACM 2004) and Klein (SODA 2002) independently constructed a $(1+\epsilon)$-approximate distance oracle with $O(n\log n)$ space, measured in number of words, and $O(1)$ query time when the graph $G$ is an undirected planar graph with $n$ vertices and $\epsilon$ is a fixed constant. Many follow-up works gave $(1+\epsilon)$-approximate distance oracles with various trade-offs between space and query time. However, improving the $O(n\log n)$ space bound without sacrificing query time has remained an open problem for almost two decades. In this work, we resolve this problem affirmatively by constructing a $(1+\epsilon)$-approximate distance oracle with optimal $O(n)$ space and $O(1)$ query time for undirected planar graphs and fixed $\epsilon$. We also make substantial progress for planar digraphs with non-negative edge weights. For fixed $\epsilon > 0$, we give a $(1+\epsilon)$-approximate distance oracle with space $o(n\log(Nn))$ and $O(\log\log(Nn))$ query time; here $N$ is the ratio between the largest and smallest positive edge weight. This improves Thorup's (FOCS 2001, JACM 2004) $O(n\log(Nn)\log n)$ space bound by more than a logarithmic factor while matching the query time of his structure. This is the first improvement for planar digraphs in two decades, both in the weighted and unweighted setting.
In this paper, we develop deterministic fully dynamic algorithms for computing approximate distances in a graph with worst-case update time guarantees. In particular, we obtain improved dynamic algorithms that, given an unweighted and undirected graph $G=(V,E)$ undergoing edge insertions and deletions, and a parameter $0 < \epsilon \leq 1$, maintain $(1+\epsilon)$-approximations of the $st$ distance of a single pair of nodes, the distances from a single source to all nodes ("SSSP"), the distances from multiple sources to all nodes ("MSSP"), or the distances between all nodes ("APSP"). Our main result is a deterministic algorithm for maintaining $(1+\epsilon)$-approximate single-source distances with worst-case update time $O(n^{1.529})$ (for the current best known bound on the matrix multiplication coefficient $\omega$). This matches a conditional lower bound by [BNS, FOCS 2019]. We further show that we can go beyond this SSSP bound for the problem of maintaining approximate $st$ distances by providing a deterministic algorithm with worst-case update time $O(n^{1.447})$. This even improves upon the fastest known randomized algorithm for this problem. At the core, our approach is to combine algebraic distance maintenance data structures with near-additive emulator constructions. This also leads to novel dynamic algorithms for maintaining $(1+\epsilon, \beta)$-emulators that improve upon the state of the art, which might be of independent interest. Our techniques also lead to improvements for randomized approximate diameter maintenance.
We consider an important generalization of the Steiner tree problem, the \emph{Steiner forest problem}, in the Euclidean plane: the input is a multiset $X \subseteq \mathbb{R}^2$, partitioned into $k$ color classes $C_1, C_2, \ldots, C_k \subseteq X$. The goal is to find a minimum-cost Euclidean graph $G$ such that every color class $C_i$ is connected in $G$. We study this Steiner forest problem in the streaming setting, where the stream consists of insertions and deletions of points to $X$. Each input point $x\in X$ arrives with its color $\textsf{color}(x) \in [k]$, and as usual for dynamic geometric streams, the input points are restricted to the discrete grid $\{0, \ldots, \Delta\}^2$. We design a single-pass streaming algorithm that uses $\mathrm{poly}(k \cdot \log\Delta)$ space and time, and estimates the cost of an optimal Steiner forest solution within ratio arbitrarily close to the famous Euclidean Steiner ratio $\alpha_2$ (currently $1.1547 \le \alpha_2 \le 1.214$). This approximation guarantee matches the state of the art bound for streaming Steiner tree, i.e., when $k=1$. Our approach relies on a novel combination of streaming techniques, like sampling and linear sketching, with the classical Arora-style dynamic-programming framework for geometric optimization problems, which usually requires large memory and has so far not been applied in the streaming setting. We complement our streaming algorithm for the Steiner forest problem with simple arguments showing that any finite approximation requires $\Omega(k)$ bits of space.
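The lower end of the quoted range for the Steiner ratio, $1.1547 \approx 2/\sqrt{3}$, is attained by the three corners of an equilateral triangle. The following is a minimal sketch of that single instance, not part of the streaming algorithm described above; the helper names `dist` and `mst_cost` are hypothetical and chosen only for illustration.

```python
import math

def dist(p, q):
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def mst_cost(points):
    """Cost of a minimum spanning tree on the points (simple Prim's algorithm)."""
    in_tree = {0}
    total = 0.0
    while len(in_tree) < len(points):
        # Pick the cheapest edge leaving the current tree.
        cost, nxt = min(
            (dist(points[i], points[j]), j)
            for i in in_tree
            for j in range(len(points))
            if j not in in_tree
        )
        total += cost
        in_tree.add(nxt)
    return total

# Corners of an equilateral triangle with unit side length.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]

# For this instance the optimal Steiner point is the centroid
# (all three angles at the centroid are 120 degrees).
centroid = (
    sum(p[0] for p in triangle) / 3,
    sum(p[1] for p in triangle) / 3,
)
steiner_cost = sum(dist(centroid, p) for p in triangle)

# Ratio of spanning-tree cost to Steiner-tree cost on this instance.
ratio = mst_cost(triangle) / steiner_cost
```

On this instance the spanning tree uses two sides (cost 2), while the Steiner tree connects every corner to the centroid (cost $\sqrt{3}$), so `ratio` equals $2/\sqrt{3}$.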
The ferromagnetic Ising model is a model of a magnetic material and a central topic in statistical physics. It also plays a starring role in the algorithmic study of approximate counting: approximating the partition function of the ferromagnetic Ising model with uniform external field is tractable at all temperatures and on all graphs, due to the randomized algorithm of Jerrum and Sinclair. Here we show that hidden inside the model are hard computational problems. For the class of bounded-degree graphs we find computational thresholds for the approximate counting and sampling problems for the ferromagnetic Ising model at fixed magnetization (that is, fixing the number of $+1$ and $-1$ spins). In particular, letting $\beta_c(\Delta)$ denote the critical inverse temperature of the zero-field Ising model on the infinite $\Delta$-regular tree, and $\eta_{\Delta,\beta,1}^+$ denote the mean magnetization of the zero-field $+$ measure on the infinite $\Delta$-regular tree at inverse temperature $\beta$, we prove, for the class of graphs of maximum degree $\Delta$: 1. For $\beta < \beta_c(\Delta)$ there is an FPRAS and efficient sampling scheme for the fixed-magnetization Ising model for all magnetizations $\eta$. 2. For $\beta > \beta_c(\Delta)$, there is an FPRAS and efficient sampling scheme for the fixed-magnetization Ising model for magnetizations $\eta$ such that $|\eta| >\eta_{\Delta,\beta,1}^+ $. 3. For $\beta > \beta_c(\Delta)$, there is no FPRAS for the fixed-magnetization Ising model for magnetizations $\eta$ such that $|\eta| <\eta_{\Delta,\beta,1}^+ $ unless NP=RP\@.
In genome rearrangements, the mutational event transposition swaps two adjacent blocks of genes in one chromosome. The Transposition Distance Problem (TDP) aims to find the minimum number of transpositions required to transform one chromosome into another, both represented as permutations. The TDP can be reduced to the problem of Sorting by Transpositions (SBT). SBT is $\mathcal{NP}$-hard and the best approximation algorithm with a $1.375$ ratio was proposed by Elias and Hartman. Their algorithm employs simplification, a technique used to transform an input permutation $\pi$ into a simple permutation $\hat{\pi}$, presumably easier to handle. The permutation $\hat{\pi}$ is obtained by inserting new symbols into $\pi$ in a way that the lower bound of the transposition distance of $\pi$ is preserved for $\hat{\pi}$. The simplification is guaranteed to keep the lower bound, not the transposition distance. In this paper, we first show that the algorithm of Elias and Hartman (EH algorithm) may require one extra transposition above the approximation ratio of $1.375$, depending on how the input permutation is simplified. Next, using an algebraic approach, we propose a new upper bound for the transposition distance and a new $1.375$-approximation algorithm for SBT that skips simplification and ensures the approximation ratio of $1.375$ for all $S_n$. We implemented our algorithm and EH's. Regarding the implementation of the EH algorithm, two issues needed to be fixed. We tested both algorithms against all permutations of size $n$, $2\leq n \leq 12$. The results show that the EH algorithm exceeds the approximation ratio of $1.375$ for permutations with a size greater than $7$. Finally, we investigate the performance of both implementations on longer permutations of maximum length $500$.
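To make the transposition operation concrete, here is a minimal sketch that swaps two adjacent blocks of a permutation and computes the exact transposition distance of very short permutations by exhaustive breadth-first search. It is neither the EH algorithm nor the approximation algorithm proposed above; the function names `apply_transposition` and `transposition_distance` are hypothetical, and the search is only practical for small $n$.

```python
from collections import deque

def apply_transposition(perm, i, j, k):
    """Swap the adjacent blocks perm[i:j] and perm[j:k], with 0 <= i < j < k <= len(perm)."""
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]

def transposition_distance(perm):
    """Exact transposition distance to the sorted permutation via brute-force BFS."""
    start = tuple(perm)
    target = tuple(sorted(start))
    if start == target:
        return 0
    n = len(start)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        current, steps = queue.popleft()
        # Try every way of choosing two adjacent, non-empty blocks.
        for i in range(n):
            for j in range(i + 1, n):
                for k in range(j + 1, n + 1):
                    nxt = apply_transposition(current, i, j, k)
                    if nxt == target:
                        return steps + 1
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append((nxt, steps + 1))
    return None  # not reached: every permutation can be sorted by transpositions
```

For instance, `apply_transposition((1, 2, 3, 4, 5), 1, 3, 5)` moves the block `(4, 5)` in front of `(2, 3)`, and the reversed permutation `(3, 2, 1)` needs two transpositions to sort.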
This work considers the economic dispatch problem for a single micro-gas turbine, governed by a discrete state-space model, under combined heat and power (CHP) operation and coupled with a utility. If the exact power and heat demands are given, existing algorithms can be used to give a quick optimal solution to the economic dispatch problem. However, in practice, the power and heat demands cannot be known deterministically, but are rather predicted, resulting in an estimate and a bound on the estimation error. We consider the case in which the power and heat demands are unknown, and present a robust optimization-based approach for scheduling the turbine's heat and power generation, in which the demand is assumed to be inside an uncertainty set. We consider two different choices of the uncertainty set relying on the $\ell^\infty$- and the $\ell^1$-norms, each with different advantages, and consider the associated robust economic dispatch problems. We recast these as robust shortest-path problems on appropriately defined graphs. For the first choice, we provide an exact linear-time algorithm for the solution of the robust shortest-path problem, and for the second, we provide an exact quadratic-time algorithm and an approximate linear-time algorithm. The efficiency and usefulness of the algorithms are demonstrated using a detailed case study that employs real data on energy demand profiles and electricity tariffs.
We study the $c$-approximate near neighbor problem under the continuous Fr\'echet distance: Given a set of $n$ polygonal curves with $m$ vertices, a radius $\delta > 0$, and a parameter $k \leq m$, we want to preprocess the curves into a data structure that, given a query curve $q$ with $k$ vertices, either returns an input curve with Fr\'echet distance at most $c\cdot \delta$ to $q$, or returns that there exists no input curve with Fr\'echet distance at most $\delta$ to $q$. We focus on the case where the input and the queries are one-dimensional polygonal curves -- also called time series -- and we give a comprehensive analysis for this case. We obtain new upper bounds that provide different tradeoffs between approximation factor, preprocessing time, and query time. Our data structures improve upon the state of the art in several ways. We show that for any $0 < \varepsilon \leq 1$ an approximation factor of $(1+\varepsilon)$ can be achieved within the same asymptotic time bounds as the previously best result for $(2+\varepsilon)$. Moreover, we show that an approximation factor of $(2+\varepsilon)$ can be obtained by using preprocessing time and space $O(nm)$, which is linear in the input size, and query time in $O(\frac{1}{\varepsilon})^{k+2}$, where the previously best result used preprocessing time in $n \cdot O(\frac{m}{\varepsilon k})^k$ and query time in $O(1)^k$. We complement our upper bounds with matching conditional lower bounds based on the Orthogonal Vectors Hypothesis. Interestingly, some of our lower bounds already hold for any super-constant value of $k$. This is achieved by proving hardness of a one-sided sparse version of the Orthogonal Vectors problem as an intermediate problem, which we believe to be of independent interest.
We propose a new method of estimation in topic models, that is not a variation on the existing simplex finding algorithms, and that estimates the number of topics K from the observed data. We derive new finite sample minimax lower bounds for the estimation of A, as well as new upper bounds for our proposed estimator. We describe the scenarios where our estimator is minimax adaptive. Our finite sample analysis is valid for any number of documents (n), individual document length (N_i), dictionary size (p) and number of topics (K), and both p and K are allowed to increase with n, a situation not handled well by previous analyses. We complement our theoretical results with a detailed simulation study. We illustrate that the new algorithm is faster and more accurate than the current ones, although we start out with a computational and theoretical disadvantage of not knowing the correct number of topics K, while we provide the competing methods with the correct value in our simulations.
In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m}f_i(\mathbf{x})$ is strongly convex and smooth, only strongly convex, only smooth, or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors) with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions to the proposed setup such as proximal friendly functions, time-varying graphs, and improvement of the condition numbers.
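As an illustration of running an accelerated gradient method on the dual in a distributed way, the sketch below treats a toy instance of the strongly convex and smooth case. Everything specific here is an assumption made for the example rather than taken from the paper: the local objectives are $f_i(x) = \tfrac{1}{2}(x - b_i)^2$, the interaction matrix is the graph Laplacian $W$ (so the affine constraint $W\mathbf{x} = 0$ encodes consensus), the caller supplies the largest and smallest positive eigenvalues of $W$, and the function name `distributed_dual_nesterov` is hypothetical.

```python
def distributed_dual_nesterov(neighbors, b, lam_max, lam_min_pos, iters=100):
    """Nesterov's constant-momentum scheme on the dual of
        min sum_i 0.5 * (x_i - b_i)^2   subject to   W x = 0,
    where W is the Laplacian of the graph given by `neighbors`.
    Each node only uses values held by its neighbors."""
    m = len(b)
    alpha = 1.0 / lam_max               # dual step size (local objectives have mu = L = 1)
    kappa = lam_max / lam_min_pos       # condition number of the dual on the range of W
    beta = (kappa ** 0.5 - 1) / (kappa ** 0.5 + 1)  # momentum coefficient
    z = [0.0] * m                       # dual variable held at each node
    z_prev = [0.0] * m
    for _ in range(iters):
        # Extrapolation (momentum) step, computed locally.
        v = [z[i] + beta * (z[i] - z_prev[i]) for i in range(m)]
        # Local primal minimizer of f_i(x) + v_i * x, which is b_i - v_i.
        x = [b[i] - v[i] for i in range(m)]
        # One communication round: (W x)_i = deg(i) * x_i - sum of neighbors' x_j.
        wx = [len(neighbors[i]) * x[i] - sum(x[j] for j in neighbors[i])
              for i in range(m)]
        # Dual gradient ascent step from the extrapolated point.
        z_prev = z
        z = [v[i] + alpha * wx[i] for i in range(m)]
    return [b[i] - z[i] for i in range(m)]

# Cycle on 4 nodes; its Laplacian has eigenvalues 0, 2, 2, 4.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
x_final = distributed_dual_nesterov(cycle, [1.0, 2.0, 3.0, 6.0],
                                    lam_max=4.0, lam_min_pos=2.0)
```

In this toy instance the minimizer of the sum is the average of the $b_i$, so every entry of `x_final` should be close to 3. The ratio `lam_max / lam_min_pos` plays the role of the spectral-gap cost mentioned above.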