Consider any locally checkable labeling problem $\Pi$ in rooted regular trees: there is a finite set of labels $\Sigma$, and for each label $x \in \Sigma$ we specify which combinations of child labels are permitted for an internal node with label $x$ (leaf nodes are unconstrained). This formalism is expressive enough to capture many classic problems studied in distributed computing, including vertex coloring, edge coloring, and maximal independent set. We show that the distributed computational complexity of any such problem $\Pi$ falls into one of the following classes: it is $O(1)$, $\Theta(\log^* n)$, $\Theta(\log n)$, or $n^{\Theta(1)}$ rounds in trees with $n$ nodes (and all of these classes are nonempty). We show that the complexity of any given problem is the same in all four standard models of distributed graph algorithms: deterministic $\mathsf{LOCAL}$, randomized $\mathsf{LOCAL}$, deterministic $\mathsf{CONGEST}$, and randomized $\mathsf{CONGEST}$. In particular, we show that randomness does not help in this setting, and that the complexity class $\Theta(\log \log n)$ does not exist (while it does exist in the broader setting of general trees). We also show how to systematically determine the complexity class of any such problem $\Pi$, i.e., whether $\Pi$ takes $O(1)$, $\Theta(\log^* n)$, $\Theta(\log n)$, or $n^{\Theta(1)}$ rounds. While the classification algorithm may take time exponential in the size of the description of $\Pi$, it is nevertheless practical: we provide a freely available implementation of the classifier, and it is fast enough to classify many problems of interest.
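To make the formalism concrete, here is a minimal sketch (our illustration, not the paper's classifier or its implementation) of encoding such a problem $\Pi$ as a label set plus permitted child-label combinations; the example encodes proper 3-coloring of rooted binary trees:

```python
# A minimal sketch of the LCL formalism for rooted regular trees with
# out-degree d: a problem Pi is a label set Sigma plus, for each label x,
# the multisets of child labels permitted below a node labeled x.
# Example: proper 3-coloring of a rooted binary tree (d = 2).
from itertools import combinations_with_replacement

SIGMA = {1, 2, 3}
D = 2  # out-degree of internal nodes

# permitted[x] = allowed (sorted) child-label tuples for a node labeled x
permitted = {
    x: {c for c in combinations_with_replacement(sorted(SIGMA), D)
        if x not in c}              # children must avoid the parent's label
    for x in SIGMA
}

def is_valid(labeling, children):
    """Check a labeling; children[v] lists v's children, leaves are unconstrained."""
    return all(
        tuple(sorted(labeling[c] for c in children[v])) in permitted[labeling[v]]
        for v in children if children[v]   # internal nodes only
    )

# Root 0 with two leaf children 1 and 2:
children = {0: [1, 2], 1: [], 2: []}
print(is_valid({0: 1, 1: 2, 2: 3}, children))  # True
print(is_valid({0: 1, 1: 1, 2: 3}, children))  # False: child repeats root label
```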
Proofs (sequent calculus, natural deduction) and imperative algorithms (pseudocode) are two well-known, coexisting concepts. What, then, is their relationship? Our answer is that \[ imperative\ algorithms\ =\ proofs\ with\ cuts. \] This observation leads to a generalization of pseudocode that we call {\it logical pseudocode}. It is similar to natural deduction proofs of computability logic~\cite{Jap03,Jap08}: each statement in it corresponds to a proof step in natural deduction. The merit over ordinary pseudocode is therefore that each statement is guaranteed to be correct and safe with respect to the initial specifications. It can also be seen as an extension of the computability logic web (\colw) with forward reasoning capability.
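For readers less familiar with the terminology, the slogan refers to the standard cut rule of sequent calculus, shown below; the cut formula $A$ is an intermediate result that is established once and then consumed, much like the value returned by a subroutine call in imperative code:
\[
\frac{\Gamma \vdash A \qquad A, \Delta \vdash B}{\Gamma, \Delta \vdash B}\;(\mathrm{cut})
\]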
In this paper, we propose GT-GDA, a distributed optimization method to solve saddle point problems of the form: $\min_{\mathbf{x}} \max_{\mathbf{y}} \{F(\mathbf{x},\mathbf{y}) := G(\mathbf{x}) + \langle \mathbf{y}, \overline{P} \mathbf{x} \rangle - H(\mathbf{y})\}$, where the functions $G(\cdot)$ and $H(\cdot)$, as well as the coupling matrix $\overline{P}$, are distributed over a strongly connected network of nodes. GT-GDA is a first-order method that uses gradient tracking to eliminate the dissimilarity caused by heterogeneous data distribution among the nodes. In its most general form, GT-GDA includes a consensus over the local coupling matrices to achieve the optimal (unique) saddle point, however at the expense of increased communication. To avoid this, we propose a more efficient variant, GT-GDA-Lite, that does not incur the additional communication, and we analyze its convergence in various scenarios. We show that GT-GDA converges linearly to the unique saddle point solution when $G(\cdot)$ is smooth and convex, $H(\cdot)$ is smooth and strongly convex, and the global coupling matrix $\overline{P}$ has full column rank. We further characterize the regime in which GT-GDA exhibits a network topology-independent convergence behavior. We next show the linear convergence of GT-GDA-Lite to an error around the unique saddle point, which goes to zero when the coupling cost ${\langle \mathbf y, \overline{P} \mathbf x \rangle}$ is common to all nodes, or when $G(\cdot)$ and $H(\cdot)$ are quadratic. Numerical experiments illustrate the convergence properties and importance of GT-GDA and GT-GDA-Lite for several applications.
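As a point of reference, the following is a minimal centralized sketch of plain gradient descent-ascent on a quadratic instance of the saddle point problem above; it only fixes the update structure, while GT-GDA additionally runs such updates per node with gradient tracking and consensus (all names and parameters below are our own illustration):

```python
# A minimal centralized sketch of gradient descent-ascent (GDA) on
#   min_x max_y  G(x) + <y, P x> - H(y)
# with quadratic G, H. GT-GDA runs such updates per node, adding gradient
# tracking and consensus over the network; that machinery is omitted here.
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 5
P = rng.standard_normal((m, n))   # coupling matrix; full column rank a.s. (m >= n)

grad_G = lambda x: x              # G(x) = 0.5 ||x||^2: smooth and convex
grad_H = lambda y: y              # H(y) = 0.5 ||y||^2: smooth, strongly convex

x, y = np.ones(n), np.ones(m)
alpha = beta = 0.05               # step sizes, small enough for this instance
for _ in range(5000):
    x_new = x - alpha * (grad_G(x) + P.T @ y)   # descent step in x
    y = y + beta * (P @ x - grad_H(y))          # ascent step in y
    x = x_new

# The unique saddle point of this instance is (0, 0).
print(np.linalg.norm(x), np.linalg.norm(y))
```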
We construct a new class of efficient Monte Carlo methods based on continuous-time piecewise deterministic Markov processes (PDMPs) suitable for inference in high-dimensional sparse models, i.e., models for which there is prior knowledge that many coordinates are likely to be exactly $0$. This is achieved with the fairly simple idea of endowing existing PDMP samplers with `sticky' coordinate axes, coordinate planes, etc. Upon hitting one of these subspaces, an event is triggered during which the process sticks to the subspace, thereby spending some time in a sub-model. This results in non-reversible jumps between different (sub-)models. While we show that PDMP samplers in general can be made sticky, we mainly focus on the Zig-Zag sampler. The computational efficiency of our method (and implementation) is established through numerical experiments in which both the sample size and the dimension of the parameter space are large.
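The following is a heavily simplified one-dimensional sketch of a sticky Zig-Zag sampler for a standard Gaussian target with a point mass at $0$; the exact parameterization of the sticking time follows one common convention and should be taken as an assumption rather than the paper's precise specification:

```python
# A heavily simplified 1-D sketch of a sticky Zig-Zag sampler for a target
# proportional to exp(-x^2/2) plus a point mass at 0. The switching rate is
# lambda(x, v) = max(0, v * x); along a trajectory from (x, v) it equals
# max(0, v*x + t), so the flip time can be sampled exactly by inversion.
import numpy as np

rng = np.random.default_rng(1)
kappa = 1.0              # unsticking rate: time stuck at 0 ~ Exp(kappa) (assumed)
x, v, t = 1.0, 1.0, 0.0
events = []              # (time, position) skeleton; with the stuck intervals,
                         # this determines the whole sample path

for _ in range(10_000):
    # Next velocity-flip time, by inversion of the integrated rate.
    E = rng.exponential()
    a = v * x
    tau_flip = -a + np.sqrt(max(a, 0.0) ** 2 + 2 * E)
    # Time to hit the sticky subspace {0} (finite only if moving toward it).
    tau_hit = -x / v if x * v < 0 else np.inf
    if tau_hit < tau_flip:
        t += tau_hit
        x = 0.0
        t += rng.exponential(1.0 / kappa)   # stick at 0, spend time in sub-model
        events.append((t, x))               # velocity kept; process crosses 0
    else:
        t += tau_flip
        x += v * tau_flip
        v = -v                              # velocity flip event
        events.append((t, x))
```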
We consider classes of objective functions of cardinality-constrained maximization problems for which the greedy algorithm guarantees a constant approximation. We propose the new class of $\gamma$-$\alpha$-augmentable functions and prove that it encompasses several important subclasses, such as functions of bounded submodularity ratio, $\alpha$-augmentable functions, and weighted rank functions of an independence system of bounded rank quotient, as well as additional objective functions for which the greedy algorithm yields an approximation. For this general class of functions, we show a tight bound of $\frac{\alpha}{\gamma}\cdot\frac{\mathrm{e}^\alpha}{\mathrm{e}^\alpha-1}$ on the approximation ratio of the greedy algorithm that tightly interpolates between the bounds from the literature for functions of bounded submodularity ratio and for $\alpha$-augmentable functions. In particular, as a by-product, we close a gap left open in [Math. Prog., 2020] by obtaining a tight lower bound for $\alpha$-augmentable functions for all $\alpha\geq1$. For weighted rank functions of independence systems, our tight bound becomes $\frac{\alpha}{\gamma}$, which recovers the known bound of $1/q$ for independence systems of rank quotient at least $q$.
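For concreteness, here is the classical greedy algorithm the bounds refer to; the algorithm itself is oblivious to $\alpha$ and $\gamma$, and the coverage example at the end is our own illustration:

```python
# The classical greedy algorithm for cardinality-constrained maximization
# max{ f(S) : |S| <= k }: repeatedly add the element with the largest
# marginal gain. The quality guarantee, not the algorithm, depends on the
# function class (e.g. gamma-alpha-augmentable).

def greedy(ground_set, f, k):
    """Return a greedy solution of size at most k for the set function f."""
    S = set()
    for _ in range(k):
        best = max((e for e in ground_set if e not in S),
                   key=lambda e: f(S | {e}) - f(S),
                   default=None)
        if best is None or f(S | {best}) - f(S) <= 0:
            break                    # no improving element remains
        S.add(best)
    return S

# Example: a coverage function (submodular, so gamma = alpha = 1).
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"d"}}
cover = lambda S: len(set().union(*(sets[i] for i in S)))
print(greedy(sets.keys(), cover, 2))   # -> {1, 2}, covering 3 elements
```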
Let $G=(V,E)$ be an undirected unweighted planar graph. Consider a vector storing the distances from an arbitrary vertex $v$ to all vertices $S = \{ s_1 , s_2 , \ldots , s_k \}$ of a single face, in their cyclic order. The pattern of $v$ is obtained by taking the difference between every pair of consecutive values of this vector. In STOC '19, Li and Parter used a VC-dimension argument to show that in planar graphs, the number of distinct patterns, denoted $x$, is only $O(k^3)$. This resulted in a simple compression scheme requiring $\tilde O(\min \{ k^4+|T|, k\cdot |T|\})$ space to encode the distances between $S$ and a set of terminal vertices $T \subseteq V$. This is known as the Okamura-Seymour metric compression problem. We give an alternative proof of the $x=O(k^3)$ bound that exploits planarity beyond the VC-dimension argument. Namely, our proof relies on cut-cycle duality, as well as on the fact that distances among vertices of $S$ are bounded by $k$. Our method implies the following: (1) An $\tilde{O}(x+k+|T|)$ space compression of the Okamura-Seymour metric, thus improving the compression of Li and Parter to $\tilde O(\min \{k^3+|T|,k \cdot |T| \})$. (2) An optimal $\tilde{O}(k+|T|)$ space compression of the Okamura-Seymour metric in the case where the vertices of $T$ induce a connected component in $G$. (3) A tight bound of $x = \Theta(k^2)$ for the family of Halin graphs, whereas the VC-dimension argument is limited to showing $x=O(k^3)$.
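The pattern definition is simple enough to state in a few lines of code; the sketch below (our illustration, using networkx for shortest paths) computes the pattern of a vertex $v$ with respect to a face given in cyclic order. Note that in an unweighted graph each entry lies in $\{-1, 0, +1\}$, since consecutive face vertices are adjacent:

```python
# Compute the "pattern" of a vertex v: consecutive differences of the vector
# of distances from v to the face vertices s_1, ..., s_k in cyclic order.
import networkx as nx

def pattern(G, v, face):
    """face: the vertices s_1, ..., s_k of a single face, in cyclic order."""
    dist = nx.single_source_shortest_path_length(G, v)
    d = [dist[s] for s in face]
    # Each difference is in {-1, 0, +1}: consecutive face vertices are adjacent.
    return tuple(d[i + 1] - d[i] for i in range(len(d) - 1))
```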
Let $L$ be a set of $n$ lines in $R^3$ that is contained, when represented as points in the four-dimensional Pl\"ucker space of lines in $R^3$, in an irreducible variety $T$ of constant degree which is \emph{non-degenerate} with respect to $L$ (see below). We show: \medskip \noindent{\bf (1)} If $T$ is two-dimensional, the number of $r$-rich points (points incident to at least $r$ lines of $L$) is $O(n^{4/3+\epsilon}/r^2)$, for $r \ge 3$ and for any $\epsilon>0$, and, if at most $n^{1/3}$ lines of $L$ lie on any common regulus, there are at most $O(n^{4/3+\epsilon})$ $2$-rich points. For $r$ larger than some sufficiently large constant, the number of $r$-rich points is also $O(n/r)$. As an application, we deduce (with an $\epsilon$-loss in the exponent) the bound obtained by Pach and de Zeeuw (2017) on the number of distinct distances determined by $n$ points on an irreducible algebraic curve of constant degree in the plane that is neither a line nor a circle. \medskip \noindent{\bf (2)} If $T$ is two-dimensional, the number of incidences between $L$ and a set of $m$ points in $R^3$ is $O(m+n)$. \medskip \noindent{\bf (3)} If $T$ is three-dimensional and nonlinear, the number of incidences between $L$ and a set of $m$ points in $R^3$ is $O\left(m^{3/5}n^{3/5} + (m^{11/15}n^{2/5} + m^{1/3}n^{2/3})s^{1/3} + m + n \right)$, provided that no plane contains more than $s$ of the points. When $s = O(\min\{n^{3/5}/m^{2/5}, m^{1/2}\})$, the bound becomes $O(m^{3/5}n^{3/5}+m+n)$. As an application, we prove that the number of incidences between $m$ points and $n$ lines in $R^4$ contained in a quadratic hypersurface (which does not contain a hyperplane) is $O(m^{3/5}n^{3/5} + m + n)$. The proofs use, in addition to various tools from algebraic geometry, recent bounds on the number of incidences between points and algebraic curves in the plane.
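For context, the Pl\"ucker representation used above is the standard one: the line through two distinct points $p, q \in R^3$ is mapped to
\[
(d; m) = \bigl(q - p,\; p \times q\bigr),
\]
defined up to scaling and satisfying the Pl\"ucker relation $\langle d, m \rangle = 0$. Lines in $R^3$ thus form a four-dimensional quadric in projective $5$-space, and the varieties $T$ above are lower-dimensional subvarieties of this quadric.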
Many papers in the field of integer linear programming (ILP, for short) are devoted to problems of the type $\max\{c^\top x \colon A x = b,\, x \in \mathbb{Z}^n_{\geq 0}\}$, where all the entries of $A,b,c$ are integer, parameterized by the number of rows of $A$ and $\|A\|_{\max}$. Problems of this class are known as ILP problems in the standard form; they are called ``bounded'' if additionally $x \leq u$, for some integer vector $u$. Recently, many new sparsity, proximity, and complexity results were obtained for bounded and unbounded ILP problems in the standard form. In this paper, we consider ILP problems in the canonical form $$\max\{c^\top x \colon b_l \leq A x \leq b_r,\, x \in \mathbb{Z}^n\},$$ where $b_l$ and $b_r$ are integer vectors. We assume that the integer matrix $A$ has rank $n$, with $n + m$ rows and $n$ columns, and we parameterize the problem by $m$ and $\Delta(A)$, where $\Delta(A)$ is the maximum absolute value of the $n \times n$ sub-determinants of $A$. We show that any ILP problem in the standard form can be polynomially reduced to some ILP problem in the canonical form, preserving $m$ and $\Delta(A)$, but the reverse reduction is not always possible. More precisely, we define the class of generalized ILP problems in the standard form, which includes an additional group constraint, and prove its equivalence to ILP problems in the canonical form. We generalize known sparsity, proximity, and complexity bounds for ILP problems in the canonical form. Additionally, we sometimes strengthen previously known results for ILP problems in the canonical form, and sometimes we give shorter proofs. Finally, we consider the special cases of $m \in \{0,1\}$. In this way, we give specialized sparsity, proximity, and complexity bounds for problems on simplices, Knapsack problems, and Subset-Sum problems.
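To illustrate one direction of the reduction, consider the naive embedding of a bounded standard-form problem into the canonical form, obtained by stacking the equality constraints and the variable bounds (this is given only as an illustration; the paper's actual reduction is the one that preserves $m$ and $\Delta(A)$):
\[
\max\{c^\top x \colon A x = b,\ 0 \le x \le u\}
\;=\;
\max\left\{c^\top x \colon \binom{b}{0} \le \binom{A}{I} x \le \binom{b}{u},\ x \in \mathbb{Z}^n\right\},
\]
where the stacked matrix $\binom{A}{I}$ has $n$ columns and rank $n$, as required in the canonical setup.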
A Boolean network (BN) with $n$ components is a discrete dynamical system described by the successive iterations of a function $f:\{0,1\}^n \to \{0,1\}^n$. This model finds applications in biology, where fixed points play a central role. For example, in genetic regulation, they correspond to cell phenotypes. In this context, experiments reveal the existence of positive or negative influences among components: component $i$ has a positive (resp. negative) influence on component $j$, meaning that $j$ tends to mimic (resp. negate) $i$. The digraph of influences is called the signed interaction digraph (SID), and one SID may correspond to a large number of BNs (which is, on average, doubly exponential in $n$). The present work opens a new perspective on the well-established study of fixed points in BNs. When biologists discover the SID of a BN they do not know, they may ask: given that SID, can it correspond to a BN having at least/at most $k$ fixed points? Depending on the input, we prove that these problems are in $\textrm{P}$ or complete for $\textrm{NP}$, $\textrm{NP}^{\textrm{NP}}$, $\textrm{NP}^{\textrm{#P}}$ or $\textrm{NEXPTIME}$. In particular, we prove that it is $\textrm{NP}$-complete (resp. $\textrm{NEXPTIME}$-complete) to decide if a given SID can correspond to a BN having at least two fixed points (resp. no fixed point).
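As a concrete (brute-force, exponential in $n$) illustration of the objects involved, the following sketch enumerates the fixed points of a given BN; the two-component example realizes a SID consisting of a positive cycle:

```python
# Brute-force enumeration of the fixed points of a Boolean network
# f: {0,1}^n -> {0,1}^n, i.e. all states x with f(x) = x.
# Illustration only: the paper's complexity results concern the SID, not f.
from itertools import product

def fixed_points(f, n):
    return [x for x in product((0, 1), repeat=n) if f(x) == x]

# Example BN with n = 2: each component copies the other, so each has a
# positive influence on the other; the SID is a positive cycle of length 2.
f = lambda x: (x[1], x[0])
print(fixed_points(f, 2))   # -> [(0, 0), (1, 1)]: two fixed points
```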
Given $1\le \ell <k$ and $\delta>0$, let $\textbf{PM}(k,\ell,\delta)$ be the decision problem for the existence of perfect matchings in $n$-vertex $k$-uniform hypergraphs with minimum $\ell$-degree at least $\delta\binom{n-\ell}{k-\ell}$. For $k\ge 3$, the decision problem in general $k$-uniform hypergraphs, equivalently $\textbf{PM}(k,\ell,0)$, is one of Karp's 21 NP-complete problems. Moreover, a reduction of Szyma\'{n}ska showed that $\textbf{PM}(k, \ell, \delta)$ is NP-complete for $\delta < 1-(1-1/k)^{k-\ell}$. A breakthrough by Keevash, Knox and Mycroft [STOC '13] resolved this problem for $\ell=k-1$ by showing that $\textbf{PM}(k, k-1, \delta)$ is in P for $\delta > 1/k$. Based on their result for $\ell=k-1$, Keevash, Knox and Mycroft conjectured that $\textbf{PM}(k, \ell, \delta)$ is in P for every $\delta > 1-(1-1/k)^{k-\ell}$. In this paper we show that this decision problem for perfect matchings can be reduced to the study of the minimum $\ell$-degree condition forcing the existence of fractional perfect matchings. That is, we solve the ``computational complexity'' aspect of the problem by reducing it to a well-known extremal problem in hypergraph theory. In particular, together with existing results on fractional perfect matchings, this solves the conjecture of Keevash, Knox and Mycroft for $\ell\ge 0.4k$.
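For reference, the two standard notions that the reduction connects are the minimum $\ell$-degree of a hypergraph $H$ and fractional perfect matchings of $H$:
\[
\delta_\ell(H) = \min_{S \in \binom{V(H)}{\ell}} \left|\{e \in E(H) \colon S \subseteq e\}\right|,
\qquad
w \colon E(H) \to [0,1] \ \text{with} \ \sum_{e \ni v} w(e) = 1 \ \text{for every } v \in V(H).
\]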
In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\xb) \triangleq \sum_{i=1}^{m}f_i(\xb)$ is strongly convex and smooth, strongly convex only, smooth only, or merely convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvement of the condition numbers.
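One standard way to write the communication restrictions as affine constraints (a common formulation, stated here for concreteness and possibly differing in details from the paper's) is to give each node $i$ a local copy $x_i$ and enforce consensus along the edges of the network:
\[
\min_{x_1, \ldots, x_m} \ \sum_{i=1}^{m} f_i(x_i)
\quad \text{s.t.} \quad x_i = x_j \ \text{for every edge } (i,j),
\]
so that the constraint matrix encodes the graph, and its spectral gap naturally enters the convergence rates of dual methods.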