It is well known [Lov\'{a}sz, 1967] that, up to isomorphism, a graph $G$ is determined by the homomorphism counts $\hom(F, G)$, i.e., the number of homomorphisms from $F$ to $G$, where $F$ ranges over all graphs. Moreover, it suffices that $F$ ranges over the graphs with at most as many vertices as $G$. Thus, in principle, we can answer any query concerning $G$ by accessing only the $\hom(\cdot,G)$'s instead of $G$ itself. In this paper, we zoom in on those queries that can be answered using a constant number of $\hom(\cdot,G)$'s for every graph $G$. We observe that if a query $\varphi$ is expressible as a Boolean combination of universal sentences in first-order logic, then whether a graph $G$ satisfies $\varphi$ can be determined by the vector \[\overrightarrow{\mathrm{hom}}_{F_1, \ldots, F_k}(G):= \big(\mathrm{hom}(F_1, G), \ldots, \mathrm{hom}(F_k, G)\big),\] where the graphs $F_1,\ldots,F_k$ depend only on $\varphi$. This leads to a query algorithm for $\varphi$ that is non-adaptive in the sense that the $F_i$ are independent of the input $G$. On the other hand, we prove that the existence of an isolated vertex, which is definable in first-order logic but not by such a $\varphi$, cannot be determined by any $\overrightarrow{\mathrm{hom}}_{F_1, \ldots, F_k}(\cdot)$. These results provide a clear delineation of the power of non-adaptive query algorithms with access to a constant number of $\hom(\cdot, G)$'s. For adaptive query algorithms, i.e., algorithms that may access some $\hom(F_{i+1}, G)$ with $F_{i+1}$ depending on $\hom(F_1, G), \ldots, \hom(F_i, G)$, we show that three homomorphism counts $\hom(\cdot,G)$ are both sufficient and, in general, necessary to determine the graph $G$. In particular, with three adaptive queries we can answer any question about $G$. Moreover, adaptively accessing two $\hom(\cdot, G)$'s is already enough to detect an isolated vertex.
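To make the queried quantity concrete, here is a naive brute-force homomorphism counter (exponential in $|V(F)|$, for illustration only; all names are ours, not from the paper):

```python
from itertools import product

def hom(F_edges, F_nodes, G_edges, G_nodes):
    """Count homomorphisms from F to G by brute force: a map
    h: V(F) -> V(G) is a homomorphism iff every edge of F is
    mapped to an edge of G."""
    G_adj = set()
    for u, v in G_edges:
        G_adj.add((u, v))
        G_adj.add((v, u))  # undirected: store both orientations
    count = 0
    for image in product(G_nodes, repeat=len(F_nodes)):
        h = dict(zip(F_nodes, image))
        if all((h[u], h[v]) in G_adj for u, v in F_edges):
            count += 1
    return count
```

For example, $\hom(K_2, K_3) = 6$ (any ordered pair of distinct vertices) and $\hom(K_1, G) = |V(G)|$.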
We study the problem of computing the max-flow vitality of edges and vertices in undirected planar graphs, where the vitality of an edge or vertex with respect to the max flow between two fixed vertices $s,t$ is defined as the decrease in the max-flow value when that edge or vertex is removed from the graph. We show that the vitality of any $k$ selected edges can be computed in $O(kn + n\log\log n)$ worst-case time, and that a $\delta$-additive approximation of the vitality of all edges with capacity at most $c$ can be computed in $O(\frac{c}{\delta}n + n \log \log n)$ worst-case time, where $n$ is the size of the graph. Similar results are given for the vitality of vertices. All our algorithms work in $O(n)$ space.
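The quantity being computed can be illustrated by a naive recompute-from-scratch sketch (one max-flow computation per removed edge, nothing like the paper's near-linear planar algorithms; all names are ours):

```python
from collections import deque

def max_flow(n, edges, s, t):
    """Edmonds-Karp max flow on an undirected graph given as
    (u, v, capacity) triples over vertices 0..n-1."""
    cap = [[0] * n for _ in range(n)]
    for u, v, c in edges:
        cap[u][v] += c  # undirected edge: capacity in both directions
        cap[v][u] += c
    flow = 0
    while True:
        parent = [-1] * n  # BFS for a shortest augmenting path
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:
            u = q.popleft()
            for v in range(n):
                if cap[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            return flow
        path, v = [], t
        while v != s:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(cap[u][v] for u, v in path)  # bottleneck capacity
        for u, v in path:
            cap[u][v] -= aug
            cap[v][u] += aug
        flow += aug

def edge_vitality(n, edges, s, t, i):
    """Vitality of edges[i]: the max-flow drop when it is removed."""
    return max_flow(n, edges, s, t) - max_flow(n, edges[:i] + edges[i + 1:], s, t)
```

Running this for every edge costs a full max-flow computation per edge; the point of the paper is to do far better in planar graphs.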
We make progress on a generalization of the road (colouring) problem. The road problem was posed by Adler-Goodwyn-Weiss and solved by Trahtman. The generalization was posed, and solved in certain special cases, by Ashley-Marcus-Tuncel. We resolve two new families of cases, of which one generalizes the road problem and follows Trahtman's solution, and the other generalizes a result of Ashley-Marcus-Tuncel with a proof quite different from theirs. Along the way, we prove a universal property for the fiber product of certain graph homomorphisms, which may be of independent interest. We provide polynomial-time algorithms for relevant constructions and decision problems.
Past work shows that one can associate a notion of Shannon entropy to a Dirichlet polynomial, regarded as an empirical distribution. Indeed, entropy can be extracted from any $d\in\mathsf{Dir}$ by a two-step process, where the first step is a rig homomorphism out of $\mathsf{Dir}$, the \emph{set} of Dirichlet polynomials, with rig structure given by standard addition and multiplication. In this short note, we show that this rig homomorphism can be upgraded to a rig \emph{functor}, when we replace the set of Dirichlet polynomials by the \emph{category} of ordinary (Cartesian) polynomials. In the Cartesian case, the process has three steps. The first step is a rig functor $\mathbf{Poly}^{\mathbf{Cart}}\to\mathbf{Poly}$ sending a polynomial $p$ to $\dot{p}\mathcal{y}$, where $\dot{p}$ is the derivative of $p$. The second is a rig functor $\mathbf{Poly}\to\mathbf{Set}\times\mathbf{Set}^{\text{op}}$, sending a polynomial $q$ to the pair $(q(1),\Gamma(q))$, where $\Gamma(q)=\mathbf{Poly}(q,\mathcal{y})$ can be interpreted as the global sections of $q$ viewed as a bundle, and $q(1)$ as its base. To make this precise we define what appears to be a new distributive monoidal structure on $\mathbf{Set}\times\mathbf{Set}^{\text{op}}$, which can be understood geometrically in terms of rectangles. The last step, as for Dirichlet polynomials, is simply to extract the entropy as a real number from a pair of sets $(A,B)$; it is given by $\log A-\log \sqrt[A]{B}$ and can be thought of as the log aspect ratio of the rectangle.
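As a sanity check (our own, under the assumption that the relevant cardinalities are $|A| = N$, the number of samples, and $|B| = \prod_i n_i^{n_i}$ for an empirical distribution with counts $n_1,\ldots,n_m$, as in the Dirichlet case), the final formula recovers Shannon entropy:

\[
\log|A| - \log\sqrt[|A|]{|B|}
  \;=\; \log N - \frac{1}{N}\sum_i n_i \log n_i
  \;=\; -\sum_i \frac{n_i}{N}\log\frac{n_i}{N}.
\]

For instance, counts $(2,1,1)$ give $N = 4$ and $|B| = 2^2 = 4$, so the entropy is $\log 4 - \tfrac14\log 4 = \tfrac32\log 2$, matching the distribution $(\tfrac12,\tfrac14,\tfrac14)$.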
The prefix palindromic length $p_{\mathbf{u}}(n)$ of an infinite word $\mathbf{u}$ is the minimal number of concatenated palindromes needed to express the prefix of length $n$ of $\mathbf{u}$. This function is surprisingly difficult to study; in particular, the conjecture that $p_{\mathbf{u}}(n)$ can be bounded only if $\mathbf{u}$ is ultimately periodic has been open since 2013. A more recent conjecture concerns the prefix palindromic length of the period-doubling word: it appears not to be $2$-regular, and if this is true, it would give a rare if not unique example of a non-regular function of a $2$-automatic word. For some other $k$-automatic words, however, the prefix palindromic length is known to be $k$-regular. Here we add to the list of those words the Sierpi\'nski word $\mathbf{s}$ and give a complete description of $p_{\mathbf{s}}(n)$.
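The function $p_{\mathbf{u}}(n)$ can be computed for finite prefixes by a naive dynamic program (cubic in the prefix length as written; this is only to fix the definition, not the paper's regular description):

```python
def palindromic_length_prefixes(u):
    """For each prefix u[:i], i = 1..len(u), return the minimal
    number of palindromes whose concatenation equals that prefix."""
    n = len(u)
    INF = float('inf')
    pl = [0] + [INF] * n  # pl[i] = palindromic length of u[:i]
    for i in range(1, n + 1):
        for j in range(i):
            piece = u[j:i]
            if piece == piece[::-1]:  # last factor u[j:i] is a palindrome
                pl[i] = min(pl[i], pl[j] + 1)
    return pl[1:]
```

For example, on the word `abaab` the values are $1, 2, 1, 2, 2$: the prefix `abaab` splits as `a` + `baab`, using two palindromes.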
We exhibit a randomized algorithm which given a matrix $A\in \mathbb{C}^{n\times n}$ with $\|A\|\le 1$ and $\delta>0$, computes with high probability an invertible $V$ and diagonal $D$ such that $\|A-VDV^{-1}\|\le \delta$ using $O(T_{MM}(n)\log^2(n/\delta))$ arithmetic operations, in finite arithmetic with $O(\log^4(n/\delta)\log n)$ bits of precision. Here $T_{MM}(n)$ is the number of arithmetic operations required to multiply two $n\times n$ complex matrices numerically stably, known to satisfy $T_{MM}(n)=O(n^{\omega+\eta})$ for every $\eta>0$ where $\omega$ is the exponent of matrix multiplication (Demmel et al., Numer. Math., 2007). Our result significantly improves the previously best known provable running times of $O(n^{10}/\delta^2)$ arithmetic operations for diagonalization of general matrices (Armentano et al., J. Eur. Math. Soc., 2018), and (with regard to the dependence on $n$) $O(n^3)$ arithmetic operations for Hermitian matrices (Dekker and Traub, Lin. Alg. Appl., 1971). It is the first algorithm to achieve nearly matrix multiplication time for diagonalization in any model of computation (real arithmetic, rational arithmetic, or finite arithmetic), thereby matching the complexity of other dense linear algebra operations such as inversion and $QR$ factorization up to polylogarithmic factors. The proof rests on two new ingredients. (1) We show that adding a small complex Gaussian perturbation to any matrix splits its pseudospectrum into $n$ small well-separated components. In particular, this implies that the eigenvalues of the perturbed matrix have a large minimum gap, a property of independent interest in random matrix theory. (2) We give a rigorous analysis of Roberts' Newton iteration method (Roberts, Int. J. Control, 1980) for computing the sign function of a matrix in finite arithmetic; such an analysis had itself been an open problem in numerical analysis since at least 1986.
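Ingredient (1), the perturb-then-diagonalize idea, can be sketched in a few lines of numpy (treating `np.linalg.eig` as a black box; this is not the paper's Newton-iteration algorithm, and the noise scaling is ours):

```python
import numpy as np

def approx_diagonalize(A, delta, seed=0):
    """Add a small complex Gaussian (Ginibre) perturbation, which
    regularizes the eigenvalue gaps, then diagonalize with a
    standard solver. Illustration only."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    G *= delta / (2 * np.linalg.norm(G, 2))  # keep backward error <= delta/2
    w, V = np.linalg.eig(A + G)
    return V, np.diag(w)

A = np.array([[1.0, 1.0], [0.0, 1.0]])  # defective: not diagonalizable
V, D = approx_diagonalize(A, 1e-3)
err = np.linalg.norm(A - V @ D @ np.linalg.inv(V), 2)
```

Even though $A$ itself has no eigenbasis, the perturbed matrix does, and the reconstruction error stays below $\delta$.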
The arboricity of a graph is the minimum number of forests required to cover all its edges. In this paper, we examine arboricity from a game-theoretic perspective and investigate cost-sharing in the minimum forest cover problem. We introduce the arboricity game as a cooperative cost game defined on a graph: the players are edges, and the cost of each coalition is the arboricity of the subgraph induced by the coalition. We study properties of the core and propose an efficient algorithm for computing the nucleolus when the core is not empty. To compute the nucleolus in the core, we introduce the prime partition, which is built on the densest subgraph lattice. The prime partition decomposes the edge set of a graph into a partially ordered set defined from minimal densest minors and their invariant precedence relation. Moreover, edges from the same part of the prime partition always have the same value in a core allocation. Consequently, when the core is not empty, the prime partition significantly reduces the number of variables and constraints required in the linear programs of Maschler's scheme and allows us to compute the nucleolus in polynomial time. In addition, the prime partition provides a graph decomposition analogous to the celebrated core decomposition and the density-friendly decomposition, which may be of independent interest.
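The cost function of the game can be evaluated on tiny instances via the Nash-Williams formula, which expresses arboricity as a maximum density over subgraphs (a brute-force sketch of ours, exponential in the number of vertices):

```python
from itertools import combinations
from math import ceil

def arboricity(nodes, edges):
    """Nash-Williams: arboricity = max over vertex subsets S with
    |S| >= 2 of ceil(m_S / (|S| - 1)), where m_S counts edges
    induced by S. Brute force over all subsets."""
    best = 0
    for r in range(2, len(nodes) + 1):
        for subset in combinations(nodes, r):
            S = set(subset)
            m = sum(1 for u, v in edges if u in S and v in S)
            best = max(best, ceil(m / (len(S) - 1)))
    return best
```

For example, $K_4$ has arboricity $\lceil 6/3 \rceil = 2$, while any tree has arboricity 1.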
An arc-colored tournament is said to be $k$-spanning for an integer $k\geq 1$ if the union of its arc-color classes of maximal valency at most $k$ is the arc set of a strongly connected digraph. It is proved that isomorphism testing of $k$-spanning tournaments is fixed-parameter tractable.
We prove a bound of $O(k(n+m)\log^{d-1} n)$ on the number of incidences between $n$ points and $m$ axis-parallel boxes in $\mathbb{R}^d$, if no $k$ boxes contain $k$ common points; that is, the incidence graph between the points and the boxes does not contain $K_{k,k}$ as a subgraph. This new bound improves over previous work by a factor of $\log^d n$, for $d > 2$. We also study other variants of the problem. For halfspaces, using shallow cuttings, we get a near-linear bound in two and three dimensions. Finally, we present a near-linear bound for the case of shapes in the plane with low union complexity (e.g., fat triangles).
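To fix terminology: an incidence is a pair (point, box) with the point inside the box, and the theorem bounds how many such pairs can exist. A brute-force $O(nm)$ counter (ours, for illustration):

```python
from itertools import product

def incidences(points, boxes):
    """Count point-box incidences in R^d: a point p is incident to
    the axis-parallel box (lo, hi) iff lo[i] <= p[i] <= hi[i] in
    every coordinate i."""
    return sum(
        all(lo[i] <= p[i] <= hi[i] for i in range(len(p)))
        for p, (lo, hi) in product(points, boxes)
    )
```

For instance, with points $(0,0)$ and $(2,2)$ and boxes $[-1,1]^2$ and $[0,3]^2$, there are 3 incidences.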
We address the following foundational question: what is the population, and sample, Fr\'echet mean (or median) graph of an ensemble of inhomogeneous Erd\H{o}s-R\'enyi random graphs? We prove that if we use the Hamming distance to compute distances between graphs, then the Fr\'echet mean (or median) graph of an ensemble of inhomogeneous random graphs is obtained by thresholding the expected adjacency matrix of the ensemble. We show that the result also holds for the sample mean (or median) when the population expected adjacency matrix is replaced with the sample mean adjacency matrix. Consequently, the Fr\'echet mean (or median) graph of inhomogeneous Erd\H{o}s-R\'enyi random graphs exhibits a sharp threshold: it is either the empty graph or the complete graph. This novel theoretical result has significant practical consequences; for instance, the Fr\'echet mean of an ensemble of sparse inhomogeneous random graphs is always the empty graph.
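Under the Hamming distance, the thresholding result makes the sample Fr\'echet mean a one-liner: each potential edge is decided by majority vote across the ensemble (a sketch of ours; the tie-breaking convention at exactly 1/2 is an assumption):

```python
import numpy as np

def sample_frechet_mean(adjs):
    """Sample Frechet mean graph under Hamming distance: threshold
    the entrywise sample mean adjacency matrix at 1/2, i.e., keep an
    edge iff it appears in a strict majority of the sample graphs.
    adjs: array of shape (N, n, n) of 0/1 symmetric adjacency matrices."""
    return (np.mean(adjs, axis=0) > 0.5).astype(int)
```

Entrywise, the mean minimizes the total Hamming distance because each 0/1 entry is chosen to agree with the majority of the samples, and the Hamming distance decomposes over entries.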
We consider the allocation of $m$ balls into $n$ bins with incomplete information. In the classical Two-Choice process a ball first queries the load of two randomly chosen bins and is then placed in the least loaded bin. In our setting, each ball also samples two random bins but can only estimate a bin's load by sending binary queries of the form "Is the load at least the median?" or "Is the load at least 100?". For the lightly loaded case $m=O(n)$, Feldheim and Gurel-Gurevich (2021) showed that with one query it is possible to achieve a maximum load of $O(\sqrt{\log n/\log \log n})$, and posed the question whether a maximum load of $m/n+O(\sqrt{\log n/\log \log n})$ is possible for any $m = \Omega(n)$. In this work, we resolve this open problem by proving a lower bound of $m/n+\Omega( \sqrt{\log n})$ for a fixed $m=\Theta(n \sqrt{\log n})$, and a lower bound of $m/n+\Omega(\log n/\log \log n)$ for some $m$ depending on the strategy used. We complement this negative result by proving a positive result for multiple queries. In particular, we show that with only two binary queries per chosen bin, there is an oblivious strategy which ensures a maximum load of $m/n+O(\sqrt{\log n})$ for any $m \geq 1$. Further, for any number $k = O(\log \log n)$ of binary queries, the upper bound on the maximum load improves to $m/n + O(k(\log n)^{1/k})$ for any $m \geq 1$. Moreover, this result for $k$ queries implies (i) new bounds for the $(1+\beta)$-process introduced by Peres et al. (2015), (ii) new bounds for the graphical balanced allocation process on dense expander graphs, and (iii) the bound of $m/n+O(\log \log n)$ on the maximum load achieved by the Two-Choice process, including the heavily loaded case $m=\Omega(n)$ derived by Berenbrink et al. (2006). One novel aspect of our proofs is the use of multiple super-exponential potential functions, which might be of use in future work.
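The query model can be simulated in a few lines. Below is a toy version with one fixed-threshold binary query per sampled bin (an oblivious strategy of our own choosing, simpler than the median-based strategies analyzed in the paper):

```python
import random

def one_query_allocation(m, n, threshold, seed=0):
    """Throw m balls into n bins. Each ball samples two bins and asks
    each the single binary query "is your load >= threshold?"; it is
    placed in a bin answering no, with ties broken uniformly at random."""
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(m):
        i, j = rng.randrange(n), rng.randrange(n)
        below = [b for b in (i, j) if loads[b] < threshold]
        target = rng.choice(below) if below else rng.choice((i, j))
        loads[target] += 1
    return loads
```

Note the strategy sees only the two yes/no answers, never the actual loads; the paper's lower bounds show what such limited feedback inherently costs.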