We present a new algorithm for approximating the number of triangles in a graph $G$ whose edges arrive as an arbitrary-order stream. If $m$ is the number of edges in $G$, $T$ the number of triangles, $\Delta_E$ the maximum number of triangles which share a single edge, and $\Delta_V$ the maximum number of triangles which share a single vertex, then our algorithm requires space \[ \widetilde{O}\left(\frac{m}{T}\cdot \left(\Delta_E + \sqrt{\Delta_V}\right)\right). \] Taken with the $\Omega\left(\frac{m \Delta_E}{T}\right)$ lower bound of Braverman, Ostrovsky, and Vilenchik (ICALP 2013), and the $\Omega\left( \frac{m \sqrt{\Delta_V}}{T}\right)$ lower bound of Kallaugher and Price (SODA 2017), our algorithm is optimal up to log factors, resolving the complexity of a classic problem in graph streaming.
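To make the parameters in the space bound concrete, the following minimal sketch computes $m$, $T$, $\Delta_E$ and $\Delta_V$ by brute force on a small graph and evaluates $\frac{m}{T}(\Delta_E + \sqrt{\Delta_V})$ with the log factors dropped. It is not the streaming algorithm of the abstract; the function names `triangle_stats` and `space_bound` are hypothetical, and the graph is assumed to be simple and undirected with $T > 0$.

```python
from itertools import combinations
from math import sqrt


def triangle_stats(edges):
    """Return (m, T, delta_E, delta_V) for a simple undirected graph.

    Brute force over all vertex triples; only meant for tiny examples.
    """
    edge_set = {frozenset(e) for e in edges}
    vertices = sorted({v for e in edge_set for v in e})
    per_edge = {e: 0 for e in edge_set}      # triangles through each edge
    per_vertex = {v: 0 for v in vertices}    # triangles through each vertex
    total = 0
    for a, b, c in combinations(vertices, 3):
        sides = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if all(s in edge_set for s in sides):
            total += 1
            for s in sides:
                per_edge[s] += 1
            for v in (a, b, c):
                per_vertex[v] += 1
    delta_e = max(per_edge.values(), default=0)
    delta_v = max(per_vertex.values(), default=0)
    return len(edge_set), total, delta_e, delta_v


def space_bound(m, t, delta_e, delta_v):
    """Evaluate (m / T) * (Delta_E + sqrt(Delta_V)), ignoring log factors.

    Assumes t > 0 (the graph contains at least one triangle).
    """
    return (m / t) * (delta_e + sqrt(delta_v))


# Complete graph K4: 6 edges, 4 triangles, every edge lies in 2 triangles
# and every vertex lies in 3 triangles.
k4 = list(combinations(range(4), 2))
stats = triangle_stats(k4)
print(stats)                # (6, 4, 2, 3)
print(space_bound(*stats))
```

On $K_4$ the two terms $\Delta_E = 2$ and $\sqrt{\Delta_V} = \sqrt{3}$ are of comparable size, so neither lower bound dominates the other.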
Afshani, Barbay and Chan (2017) introduced the notion of an instance-optimal algorithm in the order-oblivious setting. An algorithm A is instance-optimal in the order-oblivious setting for a certain class of algorithms A* if the following hold: (i) A takes as input a sequence of objects from some domain; (ii) for any instance $\sigma$ and any algorithm A' in A*, the runtime of A on $\sigma$ is at most a constant factor larger than the runtime of A' on the worst possible permutation of $\sigma$. If we identify permutations of a sequence as representing the same instance, this essentially states that A is optimal on every possible input (and not only in the worst case). We design instance-optimal algorithms for the problem of reporting, given a bichromatic set of points in the plane S, all pairs consisting of points of different color which span an empty axis-aligned rectangle (or reporting all points which appear in such a pair). This problem has applications for training-set reduction in nearest-neighbour classifiers. It is also related to the problem of finding the decision boundaries of a Euclidean nearest-neighbour classifier, for which Bremner et al. (2005) gave an optimal output-sensitive algorithm. By showing the existence of an instance-optimal algorithm in the order-oblivious setting for this problem we push the methods of Afshani et al. closer to their limits by adapting and extending them to a setting which exhibits highly non-local features. Previous problems for which instance-optimal algorithms were proven to exist were based solely on local relationships between points in a set.
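The reporting problem itself can be pinned down with a brute-force reference implementation. The sketch below is a cubic-time baseline, not the instance-optimal algorithm of the abstract. It assumes that "empty" means the closed rectangle spanned by the two points contains no third point of the set; the function name `empty_rectangle_pairs` and the `(x, y, color)` encoding are hypothetical.

```python
def empty_rectangle_pairs(points):
    """Report index pairs (i, j), i < j, of differently colored points
    whose spanned axis-aligned rectangle contains no other input point.

    points: list of (x, y, color) tuples. Runs in O(n^3) time.
    """
    n = len(points)
    result = []
    for i in range(n):
        xi, yi, ci = points[i]
        for j in range(i + 1, n):
            xj, yj, cj = points[j]
            if ci == cj:
                continue  # only bichromatic pairs are reported
            lo_x, hi_x = min(xi, xj), max(xi, xj)
            lo_y, hi_y = min(yi, yj), max(yi, yj)
            # Assumption: a third point on the boundary also blocks the pair.
            blocked = any(
                k != i and k != j
                and lo_x <= points[k][0] <= hi_x
                and lo_y <= points[k][1] <= hi_y
                for k in range(n)
            )
            if not blocked:
                result.append((i, j))
    return result


pts = [(0, 0, "red"), (4, 4, "blue"), (2, 2, "red"), (6, 1, "blue")]
print(empty_rectangle_pairs(pts))  # [(0, 3), (1, 2), (2, 3)]
```

The pair (0, 1) is not reported because the red point (2, 2) lies inside its rectangle. This shows the non-local character of the problem: whether a pair is reported can depend on a point far from both endpoints.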
For a graph $G$, let $\lambda_2(G)$ denote its second smallest Laplacian eigenvalue. It was conjectured that $\lambda_2(G) + \lambda_2(\overline{G}) \geq 1$, where $\overline{G}$ is the complement of $G$. Here, we prove this conjecture in the general case. We also show that $\max\{\lambda_2(G), \lambda_2(\overline{G})\} \geq 1 - O(n^{-\frac 13})$, where $n$ is the number of vertices of $G$.
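As a small sanity check of the inequality (an illustration added here, not an example from the paper), take the path $P_4$ on four vertices, which is isomorphic to its own complement. The Laplacian eigenvalues of the path $P_n$ are $2 - 2\cos(k\pi/n)$ for $k = 0, \dots, n-1$, so \[ \lambda_2(P_4) = 2 - 2\cos(\pi/4) = 2 - \sqrt{2} \approx 0.586, \qquad \lambda_2(P_4) + \lambda_2(\overline{P_4}) = 4 - 2\sqrt{2} \approx 1.172 \geq 1. \] At the other extreme, for $n \geq 2$ the complete graph $K_n$ has $\lambda_2(K_n) = n$ while its complement is edgeless with $\lambda_2 = 0$, so the sum equals $n$.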
In 1998, Reed conjectured that every graph $G$ satisfies $\chi(G) \leq \lceil \frac{1}{2}(\Delta(G) + 1 + \omega(G))\rceil$, where $\chi(G)$ is the chromatic number of $G$, $\Delta(G)$ is the maximum degree of $G$, and $\omega(G)$ is the clique number of $G$. As evidence for his conjecture, he proved an "epsilon version" of it, i.e. that there exists some $\varepsilon > 0$ such that $\chi(G) \leq (1 - \varepsilon)(\Delta(G) + 1) + \varepsilon\omega(G)$. It is natural to ask if Reed's conjecture or an epsilon version of it is true for the list-chromatic number. In this paper we consider a "local version" of the list-coloring version of Reed's conjecture. Namely, we conjecture that if $G$ is a graph with list-assignment $L$ such that for each vertex $v$ of $G$, $|L(v)| \geq \lceil \frac{1}{2}(d(v) + 1 + \omega(v))\rceil$, where $d(v)$ is the degree of $v$ and $\omega(v)$ is the size of the largest clique containing $v$, then $G$ is $L$-colorable. Our main result is that an "epsilon version" of this conjecture is true, under some mild assumptions. Using this result, we also prove a significantly improved lower bound on the density of $k$-critical graphs with clique number less than $k/2$, as follows. For every $\alpha > 0$ and $\varepsilon \leq \frac{\alpha^2}{1350}$, if $G$ is an $L$-critical graph for some $k$-list-assignment $L$ such that $\omega(G) < (\frac{1}{2} - \alpha)k$ and $k$ is sufficiently large, then $G$ has average degree at least $(1 + \varepsilon)k$. This implies that for every $\alpha > 0$, there exists $\varepsilon > 0$ such that if $G$ is a graph with $\omega(G)\leq (\frac{1}{2} - \alpha)\mathrm{mad}(G)$, where $\mathrm{mad}(G)$ is the maximum average degree of $G$, then $\chi_\ell(G) \leq \left\lceil (1 - \varepsilon)(\mathrm{mad}(G) + 1) + \varepsilon \omega(G)\right\rceil$.
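For a concrete instance of Reed's bound (an illustration added here, not an example from the paper), consider the 5-cycle $C_5$, which has $\Delta(C_5) = 2$, $\omega(C_5) = 2$ and $\chi(C_5) = 3$: \[ \left\lceil \tfrac{1}{2}\bigl(\Delta(C_5) + 1 + \omega(C_5)\bigr) \right\rceil = \left\lceil \tfrac{5}{2} \right\rceil = 3 = \chi(C_5), \] so the bound holds with equality. The local list version gives the same number at every vertex, since $d(v) = 2$ and $\omega(v) = 2$ for each $v$. This is consistent with the fact that odd cycles are 3-choosable.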
The hardcore model on a graph $G$ with parameter $\lambda>0$ is a probability measure on the collection of all independent sets of $G$, that assigns to each independent set $I$ a probability proportional to $\lambda^{|I|}$. In this paper we consider the problem of estimating the parameter $\lambda$ given a single sample from the hardcore model on a graph $G$. To bypass the computational intractability of the maximum likelihood method, we use the maximum pseudo-likelihood (MPL) estimator, which for the hardcore model has a surprisingly simple closed form expression. We show that for any sequence of graphs $\{G_N\}_{N\geq 1}$, where $G_N$ is a graph on $N$ vertices, the MPL estimate of $\lambda$ is $\sqrt N$-consistent, whenever the graph sequence has uniformly bounded average degree. We then derive sufficient conditions under which the MPL estimate of the activity parameters is $\sqrt N$-consistent given a single sample from a general $H$-coloring model, in which restrictions between adjacent colors are encoded by a constraint graph $H$. We verify the sufficient conditions for models where there is at least one unconstrained color as long as the graph sequence has uniformly bounded average degree. This applies to many $H$-coloring examples such as the Widom-Rowlinson and multi-state hard-core models. On the other hand, for the $q$-coloring model, which falls outside this class, we show that consistent estimation may be impossible even for graphs with bounded average degree. Nevertheless, we show that the MPL estimate is $\sqrt N$-consistent in the $q$-coloring model when $\{G_N\}_{N\geq 1}$ has bounded average double neighborhood. The presence of hard constraints, as opposed to soft constraints, leads to new challenges, and our proofs entail applications of the method of exchangeable pairs as well as combinatorial arguments that employ the probabilistic method.
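The abstract does not state the closed form, so the following sketch derives one from the usual definition of the pseudo-likelihood as the product over vertices of the conditional probability of each vertex's state given the rest. Whether this matches the paper's expression exactly is an assumption. Under the hardcore model a vertex with an occupied neighbour is empty with probability 1, while a vertex with no occupied neighbour is occupied with probability $\lambda/(1+\lambda)$. Writing $F$ for the number of empty vertices with no occupied neighbour, the log pseudo-likelihood is $|I|\log\lambda - (|I| + F)\log(1+\lambda)$, which is maximized at $\hat\lambda = |I|/F$. The function name `mpl_lambda` and the adjacency-dict encoding are hypothetical.

```python
import math


def mpl_lambda(adj, independent_set):
    """Maximum pseudo-likelihood estimate of lambda from a single sample.

    adj: dict mapping each vertex to the set of its neighbours.
    independent_set: iterable of occupied vertices (the observed sample).

    Assumption: pseudo-likelihood = product over vertices of the
    conditional probability of that vertex's state given all the others.
    """
    occupied = set(independent_set)
    for v in occupied:
        if adj[v] & occupied:
            raise ValueError("sample is not an independent set")
    # Empty vertices that could have been occupied (no occupied neighbour).
    free_empty = sum(
        1 for v in adj if v not in occupied and not (adj[v] & occupied)
    )
    if free_empty == 0:
        # No finite maximizer: the pseudo-likelihood increases with lambda.
        return math.inf
    return len(occupied) / free_empty


# Path 0 - 1 - 2 - 3 - 4 with a single occupied vertex.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(mpl_lambda(path, {0}))
```

In the example, vertex 1 is blocked by its occupied neighbour and contributes nothing, while vertices 2, 3 and 4 were free to be occupied but are empty, so the estimate is $1/3$.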
In this work, we study two simple yet general complexity classes, based on logspace Turing machines, which provide a unifying framework for efficient query evaluation in areas like information extraction and graph databases, among others. We investigate the complexity of three fundamental algorithmic problems for these classes: enumeration, counting and uniform generation of solutions, and show that they have several desirable properties in this respect. Both complexity classes are defined in terms of non-deterministic logspace transducers (NL transducers). For the first class, we consider the case of unambiguous NL transducers, and we prove constant delay enumeration, and both counting and uniform generation of solutions in polynomial time. For the second class, we consider unrestricted NL transducers, and we obtain polynomial delay enumeration, approximate counting in polynomial time, and polynomial-time randomized algorithms for uniform generation. More specifically, we show that each problem in this second class admits a fully polynomial-time randomized approximation scheme (FPRAS) and a polynomial-time Las Vegas algorithm for uniform generation. Interestingly, the key idea to prove these results is to show that the fundamental problem $\#\text{NFA}$ admits an FPRAS, where $\#\text{NFA}$ is the problem of counting the number of strings of length $n$ (given in unary) accepted by a non-deterministic finite automaton (NFA). While this problem is known to be $\#\text{P}$-complete and, more precisely, $\text{SpanL}$-complete, it was open whether this problem admits an FPRAS. In this work, we solve this open problem, and obtain as a welcome corollary that every function in $\text{SpanL}$ admits an FPRAS.
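To make the counting problem concrete, the sketch below counts accepted strings of length $n$ exactly by tracking which sets of NFA states are reachable, in effect an on-the-fly subset construction. This is not the FPRAS of the abstract: the number of reachable subsets can be exponential in the number of states, which is the blow-up an approximation scheme has to avoid. The function name `count_accepted` and the transition encoding are hypothetical, and the NFA is assumed to have no epsilon transitions.

```python
from collections import Counter


def count_accepted(alphabet, delta, start, accepting, n):
    """Exactly count the strings of length n accepted by an NFA.

    delta: dict mapping (state, symbol) to an iterable of successor states.
    Each string reaches exactly one set of states, so counting strings per
    reachable state set avoids counting a string once per accepting run.
    """
    accepting = frozenset(accepting)
    current = Counter({frozenset([start]): 1})
    for _ in range(n):
        nxt = Counter()
        for subset, count in current.items():
            for symbol in alphabet:
                target = frozenset(
                    q for s in subset for q in delta.get((s, symbol), ())
                )
                if target:  # drop strings with no surviving run
                    nxt[target] += count
        current = nxt
    return sum(c for subset, c in current.items() if subset & accepting)


# NFA over {0, 1} accepting strings whose second-to-last symbol is 1.
delta = {
    (0, "0"): [0],
    (0, "1"): [0, 1],   # guess that this 1 is the second-to-last symbol
    (1, "0"): [2],
    (1, "1"): [2],
}
print(count_accepted("01", delta, 0, {2}, 3))  # 4
```

For this automaton the count at length $n \geq 2$ is $2^{n-1}$, since only the second-to-last symbol is constrained.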
Coloring unit-disk graphs efficiently is an important problem in the global and distributed setting, with applications in radio channel assignment problems when the communication relies on omni-directional antennas of the same power. In this context it is important to bound not only the complexity of the coloring algorithms, but also the number of colors used. In this paper, we consider two natural distributed settings. In the location-aware setting (when nodes know their coordinates in the plane), we give a constant-time distributed algorithm coloring any unit-disk graph $G$ with at most $(3+\epsilon)\omega(G)+6$ colors, for any constant $\epsilon>0$, where $\omega(G)$ is the clique number of $G$. This improves upon a classical 3-approximation algorithm for this problem, for all unit-disk graphs whose chromatic number significantly exceeds their clique number. When nodes do not know their coordinates in the plane, we give a distributed algorithm in the LOCAL model that colors every unit-disk graph $G$ with at most $5.68\omega(G)$ colors in $O(2^{\sqrt{\log \log n}})$ rounds. Moreover, when $\omega(G)=O(1)$, the algorithm runs in $O(\log^* n)$ rounds. This algorithm is based on a study of the local structure of unit-disk graphs, which is of independent interest. We conjecture that every unit-disk graph $G$ has average degree at most $4\omega(G)$, which would imply the existence of an $O(\log n)$-round algorithm coloring any unit-disk graph $G$ with (approximately) $4\omega(G)$ colors.
Let $G$ be a directed graph with $n$ vertices, $m$ edges, and non-negative edge costs. Given $G$, a fixed source vertex $s$, and a positive integer $p$, we consider the problem of computing, for each vertex $t\neq s$, $p$ edge-disjoint paths of minimum total cost from $s$ to $t$ in $G$. Suurballe and Tarjan~[Networks, 1984] solved the above problem for $p=2$ by designing an $O(m+n\log n)$-time algorithm which also computes a sparse \emph{single-source $2$-multipath preserver}, i.e., a subgraph containing $2$ edge-disjoint paths of minimum total cost from $s$ to every other vertex of $G$. The case $p \geq 3$ was left as an open problem. We study the general problem ($p\geq 2$) and prove that any graph admits a sparse single-source $p$-multipath preserver with $p(n-1)$ edges. This size is optimal since the in-degree of each non-root vertex $v$ must be at least $p$. Moreover, we design an algorithm that requires $O(pn^2 (p + \log n))$ time to compute both $p$ edge-disjoint paths of minimum total cost from the source to all other vertices and an optimal-size single-source $p$-multipath preserver. The running time of our algorithm outperforms that of a natural approach that solves $n-1$ single-pair instances using the well-known \emph{successive shortest paths} algorithm by a factor of $\Theta(\frac{m}{np})$ and is asymptotically near optimal if $p=O(1)$ and $m=\Theta(n^2)$. Our results extend naturally to the case of $p$ vertex-disjoint paths.
In this paper, we study the bandits with knapsacks (BwK) problem and develop a primal-dual based algorithm that achieves a problem-dependent logarithmic regret bound. The BwK problem extends the multi-armed bandit (MAB) problem to model the resource consumption associated with playing each arm, and the existing BwK literature has mainly focused on deriving asymptotically optimal distribution-free regret bounds. We first study the primal and dual linear programs underlying the BwK problem. From this primal-dual perspective, we discover symmetry between arms and knapsacks, and then propose a new notion of sub-optimality measure for the BwK problem. The sub-optimality measure highlights the important role of knapsacks in determining algorithm regret and inspires the design of our two-phase algorithm. In the first phase, the algorithm identifies the optimal arms and the binding knapsacks, and in the second phase, it exhausts the binding knapsacks by playing the optimal arms through an adaptive procedure. Our regret upper bound involves the proposed sub-optimality measure and has a logarithmic dependence on the length of the horizon $T$ and a polynomial dependence on $m$ (the number of arms) and $d$ (the number of knapsacks). To the best of our knowledge, this is the first problem-dependent logarithmic regret bound for solving the general BwK problem.
We study a variant of Min Cost Flow in which the flow needs to be connected. Specifically, in the Connected Flow problem one is given a directed graph $G$, along with a set of demand vertices $D \subseteq V(G)$ with demands $\mathsf{dem}: D \rightarrow \mathbb{N}$, and costs and capacities for each edge. The goal is to find a minimum cost flow that satisfies the demands, respects the capacities and induces a (strongly) connected subgraph. This generalizes previously studied problems like the (Many Visits) TSP. We study the parameterized complexity of Connected Flow parameterized by $|D|$, the treewidth $tw$ and by vertex cover size $k$ of $G$ and provide: (i) $\mathsf{NP}$-completeness already for the case $|D|=2$ with only unit demands and capacities and no edge costs, and fixed-parameter tractability if there are no capacities, (ii) a fixed-parameter tractable $\mathcal{O}^{\star}(k^{\mathcal{O}(k)})$ time algorithm for the general case, and a kernel of size polynomial in $k$ for the special case of Many Visits TSP, (iii) an $|V(G)|^{\mathcal{O}(tw)}$ time algorithm and a matching $|V(G)|^{o(tw)}$ time conditional lower bound conditioned on the Exponential Time Hypothesis. To achieve some of our results, we significantly extend an approach by Kowalik et al.~[ESA'20].
We show that for the problem of testing if a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model for which query access comes in the form of $\langle X_i, A\rangle:=\operatorname{tr}(X_i^\top A)$; perhaps surprisingly these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.