The \emph{local boxicity} of a graph $G$, denoted by $lbox(G)$, is the minimum positive integer $l$ such that $G$ can be expressed as the intersection of $k$ interval graphs, for some $k \geq l$, where each vertex of $G$ appears as a non-universal vertex in at most $l$ of these interval graphs. Let $G$ be a graph on $n$ vertices with $m$ edges, and let $\Delta$ denote its maximum degree. We show that: (i) $lbox(G) \leq 2^{13\log^{*}{\Delta}} \Delta$; moreover, there exist graphs of maximum degree $\Delta$ with local boxicity $\Omega(\frac{\Delta}{\log\Delta})$. (ii) $lbox(G) \in O(\frac{n}{\log{n}})$; moreover, there exist graphs on $n$ vertices with local boxicity $\Omega(\frac{n}{\log n})$. (iii) $lbox(G) \leq (2^{13\log^{*}{\sqrt{m}}} + 2)\sqrt{m}$; moreover, there exist graphs with $m$ edges whose local boxicity is $\Omega(\frac{\sqrt{m}}{\log m})$. (iv) The local boxicity of $G$ is at most its \emph{product dimension}. This connection helps us show that the local boxicity of the \emph{Kneser graph} $K(n,k)$ is at most $\frac{k}{2} \log{\log{n}}$. The above results extend to the \emph{local dimension} of a partially ordered set, via the known connection between local boxicity and local dimension. Finally, we show that the \emph{cubicity} of a graph on $n$ vertices of girth greater than $g+1$ is $O(n^{\frac{1}{\lfloor g/2\rfloor}}\log n)$.
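For quick reference, the definition above can be stated compactly as a single formula (this is only a restatement of the prose definition, not a new result):

\[
lbox(G) \;=\; \min\Bigl\{\, l \in \mathbb{Z}^{+} \;:\; \exists\, k \geq l \text{ and interval graphs } I_1, \ldots, I_k \text{ with } G = \bigcap_{j=1}^{k} I_j, \text{ where each } v \in V(G) \text{ is non-universal in at most } l \text{ of the } I_j \,\Bigr\}.
\]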
In Statistical Relational Artificial Intelligence, a branch of AI and machine learning that combines the logical and statistical schools of AI, one uses the concept of a \emph{parametrized probabilistic graphical model (PPGM)} to model (conditional) dependencies between random variables and to make probabilistic inferences about events on a space of ``possible worlds''. The set of possible worlds with underlying domain $D$ (a set of objects) can be represented by the set $\mathbf{W}_D$ of all first-order structures (for a suitable signature) with domain $D$. Using a formal logic we can describe events on $\mathbf{W}_D$. By combining a logic and a PPGM we can also define a probability distribution $\mathbb{P}_D$ on $\mathbf{W}_D$ and use it to compute the probability of an event. We consider a logic, denoted $PLA$, with truth values in the unit interval, which uses aggregation functions, such as arithmetic mean, geometric mean, maximum, and minimum, instead of quantifiers. However, we face the problem of computational efficiency, which is an obstacle to the wider use of methods from Statistical Relational AI in practical applications. We address this problem by proving that the described probability will, under certain assumptions on the PPGM and the sentence $\varphi$, converge as the size of $D$ tends to infinity. The convergence result is obtained by showing that every formula $\varphi(x_1, \ldots, x_k)$ which contains only ``admissible'' aggregation functions (e.g. arithmetic and geometric mean, max, and min) is asymptotically equivalent to a formula $\psi(x_1, \ldots, x_k)$ without aggregation functions.
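As a concrete illustration (ours, not from the paper) of how such aggregation functions replace quantifiers, the following minimal Python sketch evaluates the four admissible aggregators named above on a finite multiset of truth values in $[0,1]$, e.g. the truth values of $\varphi(a)$ as $a$ ranges over a small domain:

```python
import math

# Aggregation functions mapping a nonempty sequence of truth values
# in [0, 1] to a single truth value in [0, 1].  These stand in for
# quantifiers in a [0, 1]-valued logic such as the PLA described above.

def arithmetic_mean(vals):
    return sum(vals) / len(vals)

def geometric_mean(vals):
    # n-th root of the product of n values; 0 if any value is 0.
    return math.prod(vals) ** (1.0 / len(vals))

aggregators = {
    "arithmetic mean": arithmetic_mean,
    "geometric mean": geometric_mean,
    "max": max,
    "min": min,
}

truth_values = [0.2, 0.5, 0.9]   # e.g. truth of phi(a) over a 3-element domain
for name, agg in aggregators.items():
    print(f"{name}: {agg(truth_values):.4f}")
```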
Graph Convolutional Networks (GCNs) are among the most popular architectures used to solve classification problems accompanied by graphical information. We present a rigorous theoretical understanding of the effects of graph convolutions in multi-layer networks. We study these effects through the node classification problem of a non-linearly separable Gaussian mixture model coupled with a stochastic block model. First, we show that a single graph convolution expands the regime of the distance between the means where multi-layer networks can classify the data by a factor of at least $1/\sqrt[4]{\mathbb{E}[\mathrm{deg}]}$, where $\mathbb{E}[\mathrm{deg}]$ denotes the expected degree of a node. Second, we show that with a slightly stronger graph density, two graph convolutions improve this factor to at least $1/\sqrt[4]{n}$, where $n$ is the number of nodes in the graph. Finally, we provide both theoretical and empirical insights into the performance of graph convolutions placed in different combinations among the layers of a network, concluding that the performance is similar across all such placements. We present extensive experiments on both synthetic and real-world data that illustrate our results.
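A minimal simulation of the setting just described (our sketch, not the authors' code; all parameter values are arbitrary) shows the mechanism behind the $1/\sqrt[4]{\mathbb{E}[\mathrm{deg}]}$ factor: a single degree-normalized graph convolution averages each node's feature over roughly $\mathbb{E}[\mathrm{deg}]$ neighbors, shrinking the within-class noise while roughly preserving the separation between the means:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-class Gaussian mixture features coupled with a stochastic block model.
n, d = 1000, 5
y = rng.integers(0, 2, size=n)                 # class labels
mu = 0.5                                       # distance between the means
X = rng.normal(0.0, 1.0, size=(n, d))
X[y == 1, 0] += mu                             # shift class 1 along coord 0

p, q = 0.05, 0.01                              # intra-/inter-class edge probs
P = np.where(y[:, None] == y[None, :], p, q)
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1); A = A + A.T                 # symmetric, no self-loops

# One graph convolution: degree-normalized neighborhood averaging
# (self-loops added so isolated vertices are handled).
A_hat = A + np.eye(n)
Xc = A_hat @ X / A_hat.sum(axis=1, keepdims=True)

# Within-class noise shrinks after convolution, while the class
# separation along coordinate 0 is roughly preserved.
print("raw  within-class std:", X[y == 0, 0].std())
print("conv within-class std:", Xc[y == 0, 0].std())
```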
We consider statistical models arising from the set of common solutions to a sparse polynomial system with general coefficients. The maximum likelihood degree counts the number of critical points of the likelihood function restricted to the model. We prove that the maximum likelihood degree of a sparse polynomial system is determined by its Newton polytopes and equals the mixed volume of a related Lagrange system of equations.
In this paper, we propose a depth-first search (DFS) algorithm for finding maximum matchings in general graphs. Unlike blossom-shrinking algorithms, which store all alternative alternating paths inside the super-vertices obtained by shrinking blossoms, the proposed algorithm does not involve blossom shrinking at all. The basic idea is to deflect the alternating path when it encounters a blossom. The algorithm maintains detour information in an auxiliary stack, minimizing redundant data structures. A benefit of this technique is that no time is spent shrinking and expanding blossoms. The DFS algorithm determines a maximum matching of a general graph with $m$ edges and $n$ vertices in $O(mn)$ time and $O(n)$ space.
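For orientation, the sketch below shows the classical DFS augmenting-path search in the \emph{bipartite} special case (Kuhn's algorithm), where blossoms cannot arise; it is not the paper's general-graph algorithm, which additionally deflects the alternating path at blossoms, but it conveys the underlying alternating-path DFS:

```python
# Classical DFS augmenting-path search for maximum matching in a
# *bipartite* graph (Kuhn's algorithm).  This is NOT the paper's
# general-graph algorithm: bipartite graphs contain no blossoms, so
# no detour/deflection logic is needed.

def max_bipartite_matching(adj, n_left, n_right):
    match_right = [-1] * n_right          # match_right[v] = left partner of v

    def try_augment(u, visited):
        for v in adj.get(u, []):          # scan neighbors of left vertex u
            if v in visited:
                continue
            visited.add(v)
            # v is free, or its current partner can be re-matched elsewhere:
            if match_right[v] == -1 or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    matching = 0
    for u in range(n_left):
        if try_augment(u, set()):
            matching += 1
    return matching, match_right

# Example: 3 + 3 vertices, maximum matching of size 3.
adj = {0: [0, 1], 1: [0], 2: [1, 2]}
print(max_bipartite_matching(adj, 3, 3))  # (3, [1, 0, 2])
```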
We study the problem of testing whether a function $f: \mathbb{R}^n \to \mathbb{R}$ is a polynomial of degree at most $d$ in the \emph{distribution-free} testing model. Here, the distance between functions is measured with respect to an unknown distribution $\mathcal{D}$ over $\mathbb{R}^n$ from which we can draw samples. In contrast to previous work, we do not assume that $\mathcal{D}$ has finite support. We design a tester that, given query access to $f$ and sample access to $\mathcal{D}$, makes $(d/\varepsilon)^{O(1)}$ queries to $f$, accepts with probability $1$ if $f$ is a polynomial of degree at most $d$, and rejects with probability at least $2/3$ if every degree-$d$ polynomial $P$ disagrees with $f$ on a set of mass at least $\varepsilon$ with respect to $\mathcal{D}$. Our result also holds under mild assumptions when we receive only a polynomial number of bits of precision for each query to $f$, or when $f$ can only be queried on rational points representable using a logarithmic number of bits. Along the way, we prove a new stability theorem for multivariate polynomials that may be of independent interest.
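As background (this is the classical restriction-to-a-line idea, not the paper's distribution-free tester), a univariate polynomial has degree at most $d$ exactly when its $(d+1)$-st finite difference vanishes identically, $\sum_{i=0}^{d+1} (-1)^{i} \binom{d+1}{i} f(x + ih) = 0$; a naive randomized check of this identity looks as follows:

```python
import math
import random

# Naive sanity check based on the finite-difference identity: a
# univariate polynomial f has degree <= d iff
#     sum_{i=0}^{d+1} (-1)^i * C(d+1, i) * f(x + i*h) == 0
# for all x and h.  The paper's distribution-free tester is far more
# involved; this only illustrates the underlying characterization.

def finite_difference(f, d, x, h):
    return sum((-1) ** i * math.comb(d + 1, i) * f(x + i * h)
               for i in range(d + 2))

def looks_degree_at_most(f, d, trials=100):
    for _ in range(trials):
        x = random.uniform(-10, 10)
        h = random.uniform(0.1, 1.0)
        # Relative tolerance absorbs floating-point cancellation error.
        if abs(finite_difference(f, d, x, h)) > 1e-6 * (1 + abs(f(x))):
            return False
    return True

print(looks_degree_at_most(lambda t: 3 * t**2 - t + 1, d=2))  # True
print(looks_degree_at_most(lambda t: t**3, d=2))              # False
```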
Computing a maximum independent set (MaxIS) is a fundamental NP-hard problem in graph theory with important applications across a wide spectrum of fields. Since graphs in many applications change frequently over time, the problem of maintaining a MaxIS over dynamic graphs has attracted increasing attention in recent years. Due to the intractability of maintaining an exact MaxIS, this paper aims to develop efficient algorithms that maintain an approximate MaxIS with a theoretical accuracy guarantee. In particular, we propose a framework that maintains a $(\frac{\Delta}{2} + 1)$-approximate MaxIS over dynamic graphs, and we prove that it achieves a constant approximation ratio on many real-world networks. To the best of our knowledge, this is the first non-trivial approximability result for the dynamic MaxIS problem. Following the framework, we implement an efficient linear-time dynamic algorithm and a more effective dynamic algorithm with near-linear expected time complexity. Our thorough experiments on real and synthetic graphs demonstrate the effectiveness and efficiency of the proposed algorithms, especially when the graph is highly dynamic.
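For comparison, here is a static baseline (not the paper's dynamic framework): any maximal independent set, such as the one produced by the greedy sketch below, has at least $n/(\Delta+1)$ vertices and is therefore a $(\Delta+1)$-approximate MaxIS; the point of the paper is to maintain the sharper $(\frac{\Delta}{2}+1)$ ratio under updates.

```python
# Static greedy baseline (NOT the paper's dynamic framework).  Any
# maximal independent set has >= n/(Delta+1) vertices, so this greedy
# construction is a (Delta+1)-approximation of a MaxIS.

def greedy_maximal_independent_set(adj):
    """adj: dict mapping each vertex to a set of its neighbors."""
    independent, blocked = set(), set()
    for v in adj:                      # any vertex order yields a maximal IS
        if v not in blocked:
            independent.add(v)
            blocked.add(v)
            blocked |= adj[v]          # neighbors can no longer join
    return independent

# Example: a 5-cycle (Delta = 2, MaxIS size 2).
adj = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {0, 3}}
print(greedy_maximal_independent_set(adj))   # e.g. {0, 2}
```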
Categorical probability has recently seen significant advances through the formalism of Markov categories, within which several classical theorems have been proven in entirely abstract categorical terms. Closely related to Markov categories are gs-monoidal categories, also known as CD categories, which omit a condition that implements the normalization of probability. Extending work of Corradini and Gadducci, we construct free gs-monoidal and free Markov categories generated by a collection of morphisms of arbitrary arity and coarity. For free gs-monoidal categories, this comes in the form of an explicit combinatorial description of their morphisms as structured cospans of labeled hypergraphs. These can be thought of as a formalization of gs-monoidal string diagrams ($=$ term graphs) as a combinatorial data structure. We formulate the appropriate $2$-categorical universal property based on ideas of Walters and prove that our categories satisfy it. We expect our free categories to be relevant for computer implementations and we also argue that they can be used as statistical causal models generalizing Bayesian networks.
Many forms of dependence manifest themselves over time, with the behavior of variables in dynamical systems as a paradigmatic example. This paper studies temporal dependence in dynamical systems from a logical perspective, by extending a minimal modal base logic of static functional dependencies. We define a logic for dynamical systems with single time steps, provide a complete axiomatic proof calculus, and show the decidability of the satisfiability problem for a substantial fragment. The system comes in two guises, modal and first-order, which naturally complement each other. Next, we consider a timed semantics for our logic, intermediate between state spaces and temporal universes for the unfoldings of a dynamical system. We prove completeness and decidability by combining techniques from dynamic-epistemic logic and the modal logic of functional dependencies with complex terms for objects. Also, we extend these results to the timed logic with function symbols and term identity. Finally, we conclude with a brief outlook on how the system proposed here connects with richer temporal logics of system behavior, and with dynamic topological logic.
We study the notion of local treewidth in sparse random graphs: the maximum treewidth over all $k$-vertex subgraphs of an $n$-vertex graph. When $k$ is not too large, we give nearly tight bounds for this local treewidth parameter; we also derive tight bounds for the local treewidth of noisy trees, i.e., trees where every non-edge is added independently with small probability. We apply our upper bounds on the local treewidth to obtain fixed-parameter tractable algorithms (on random graphs and noisy trees) for edge-removal problems centered around containing a contagious process evolving over a network. In these problems, our main parameter of study is $k$, the number of ``infected'' vertices in the network. For a certain range of parameters, the running time of our algorithms on $n$-vertex graphs is $2^{o(k)}\textrm{poly}(n)$, improving upon the $2^{\Omega(k)}\textrm{poly}(n)$ performance of the best-known algorithms designed for worst-case instances of these edge-deletion problems.
In this paper we study the finite-sample and asymptotic properties of various weighting estimators of the local average treatment effect (LATE), several of which are based on the kappa theorem of Abadie (2003). Our framework presumes a binary endogenous explanatory variable (``treatment'') and a binary instrumental variable, which may only be valid after conditioning on additional covariates. We argue that one of the Abadie estimators, which we show is weight-normalized, is likely to dominate the others in many contexts. A notable exception is settings with one-sided noncompliance, where certain unnormalized estimators have the advantage of being based on a denominator that is bounded away from zero. We use a simulation study and three empirical applications to illustrate our findings. In applications to the causal effects of college education using the college proximity instrument (Card, 1995) and the causal effects of childbearing using the sibling sex composition instrument (Angrist and Evans, 1998), the unnormalized estimates are clearly unreasonable, with ``incorrect'' signs, magnitudes, or both. Overall, our results suggest that (i) the relative performance of different kappa-weighting estimators varies with features of the data-generating process, and (ii) the normalized version of the estimator of Tan (2006) may be an attractive alternative in many contexts. Applied researchers with access to a binary instrumental variable should also consider covariate balancing or doubly robust estimators of the LATE.
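For orientation, in the special case with no covariates the LATE is identified by the classical Wald ratio, which the following self-contained sketch (ours, with simulated data) computes; the paper's focus is the covariate-adjusted kappa-weighting versions of this estimator:

```python
import numpy as np

# Wald (instrumental-variables) estimator of the LATE in the
# no-covariate case:
#     LATE = (E[Y|Z=1] - E[Y|Z=0]) / (E[D|Z=1] - E[D|Z=0])
# This is for orientation only; the paper studies kappa-weighting
# estimators that condition on additional covariates.

def wald_late(y, d, z):
    y, d, z = map(np.asarray, (y, d, z))
    num = y[z == 1].mean() - y[z == 0].mean()   # reduced form
    den = d[z == 1].mean() - d[z == 0].mean()   # first stage
    return num / den                            # unstable if den is near 0

# Simulated example: binary instrument z, endogenous treatment d,
# true complier effect = 2.
rng = np.random.default_rng(1)
n = 100_000
z = rng.integers(0, 2, n)
compliers = rng.random(n) < 0.5                 # half the units comply
d = np.where(compliers, z, rng.integers(0, 2, n))
y = 2.0 * d + rng.normal(size=n)
print(wald_late(y, d, z))                       # close to 2.0
```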