We prove discrete Helly-type theorems for pseudohalfplanes, extending recent results of Jensen, Joshi and Ray about halfplanes. Among other results, we show that given a family of pseudohalfplanes $\cal H$ and a set of points $P$, if every triple of pseudohalfplanes of $\cal H$ has a common point in $P$ then there exists a set of at most two points that hits every pseudohalfplane of $\cal H$. We also prove that if every triple of points of $P$ is contained in a pseudohalfplane of $\cal H$ then there are two pseudohalfplanes of $\cal H$ that cover all points of $P$. To prove our results we consider pseudohalfplane hypergraphs, define their extremal vertices, and show that these behave in many ways like points on the boundary of the convex hull of a point set. Our methods are purely combinatorial. In addition, we determine the maximum possible chromatic number of the hypergraph families considered.
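To make the discrete Helly statement concrete in the special case of genuine halfplanes (the pseudohalfplane setting is purely combinatorial and is not modeled here), the following minimal Python sketch checks the hypothesis and the conclusion by brute force on a random instance; all parameters are illustrative.

```python
# Brute-force check of the discrete Helly statement for ordinary halfplanes;
# a halfplane is encoded as (a, b, c), meaning a*x + b*y <= c.
import itertools
import random

random.seed(0)
points = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(30)]
halfplanes = [(random.uniform(-1, 1), random.uniform(-1, 1),
               random.uniform(0.2, 1.0)) for _ in range(15)]

def contains(h, p):
    a, b, c = h
    return a * p[0] + b * p[1] <= c

# Hypothesis: every triple of halfplanes has a common point in P.
hypothesis = all(
    any(all(contains(h, p) for h in triple) for p in points)
    for triple in itertools.combinations(halfplanes, 3))

# Conclusion: some set of at most two points of P hits every halfplane.
conclusion = any(
    all(contains(h, p) or contains(h, q) for h in halfplanes)
    for p, q in itertools.combinations(points, 2))

if hypothesis:          # the theorem predicts the conclusion in this case
    assert conclusion
print(hypothesis, conclusion)
```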
We present a new data structure to accurately and efficiently approximate a polynomial $f$ of degree $d$ given as a list of coefficients. Its properties allow us to improve the state-of-the-art bounds on the bit complexity for the problems of root isolation and approximate multipoint evaluation. This data structure also leads to a new geometric criterion to detect ill-conditioned polynomials, implying notably that the standard condition number of the zeros of a polynomial is at least exponential in the number of roots of modulus less than $1/2$ or greater than $2$. Given a polynomial $f$ of degree $d$ with $\|f\|_1 \leq 2^\tau$ for $\tau \geq 1$, isolating all its complex roots or evaluating it at $d$ points can be done with a quasi-linear number of arithmetic operations. However, considering the bit complexity, the state-of-the-art algorithms require at least $d^{3/2}$ bit operations even for well-conditioned polynomials and when the required accuracy is low. Given a positive integer $m$, we can compute our new data structure and evaluate $f$ at $d$ points in the unit disk with an absolute error less than $2^{-m}$ in $\widetilde O(d(\tau+m))$ bit operations, where $\widetilde O(\cdot)$ means that we omit logarithmic factors. We also show that if $\kappa$ is the absolute condition number of the zeros of $f$, then we can isolate all the roots of $f$ in $\widetilde O(d(\tau + \log \kappa))$ bit operations. Moreover, our algorithms are simple to implement. For approximating the complex roots of a polynomial, we implemented a small prototype in \verb|Python/NumPy| that is an order of magnitude faster than the state-of-the-art solver \verb|MPSolve| for high-degree polynomials with random coefficients.
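As a point of reference for the evaluation task, here is a small NumPy sketch that evaluates $f$ at $d$ points of the unit disk by Horner's rule. It is a dense baseline, not the paper's data structure (whose softly linear bit complexity a dense method cannot match); degree and sample points are illustrative.

```python
# Horner evaluation of f (coefficients from low to high degree) at d points
# of the unit disk; a dense O(d^2)-operation baseline.
import numpy as np

rng = np.random.default_rng(1)
d = 512
coeffs = rng.standard_normal(d + 1)
points = (np.sqrt(rng.uniform(0, 1, d))
          * np.exp(2j * np.pi * rng.uniform(0, 1, d)))  # in the unit disk

def horner(c, x):
    y = np.zeros_like(x)
    for ck in c[::-1]:          # from the leading coefficient down
        y = y * x + ck
    return y

values = horner(coeffs, points)
# Sanity check against NumPy's own evaluator.
assert np.allclose(values, np.polynomial.polynomial.polyval(points, coeffs))
print(values[:3])
```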
We propose a new representation of $k$-partite, $k$-uniform hypergraphs (i.e., hypergraphs with a partition of the vertices into $k$ parts such that each hyperedge contains exactly one vertex of each type; we call them $k$-hypergraphs for short) by a finite set $P$ of points in $\mathbb{R}^d$ and a parameter $\ell\leq d-1$. Each point in $P$ is covered by $k={d\choose\ell}$ axis-aligned affine $\ell$-dimensional subspaces of $\mathbb{R}^d$, which we call $\ell$-subspaces for brevity. We interpret each point in $P$ as a hyperedge that contains each of the covering $\ell$-subspaces as a vertex. The class of $(d,\ell)$-hypergraphs is the class of $k$-hypergraphs that can be represented in this way, where $k={d\choose\ell}$. The resulting classes of hypergraphs are fairly rich: every $k$-hypergraph is a $(k,k-1)$-hypergraph. On the other hand, for $\ell<d-1$ the $(d,\ell)$-hypergraphs form a proper subclass of the class of all ${d\choose\ell}$-hypergraphs. In this paper we give a natural structural characterization of $(d,\ell)$-hypergraphs based on vertex cuts. This characterization leads to a polynomial-time recognition algorithm that decides, for a given ${d\choose\ell}$-hypergraph, whether or not it is a $(d,\ell)$-hypergraph, and that computes a representation if one exists. We assume that the dimension $d$ is constant and that the partition of the vertex set is prescribed.
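The representation is easy to materialize in code: an axis-aligned affine $\ell$-subspace through a point is determined by its $\ell$ free axes together with the fixed coordinates on the remaining axes. A minimal Python sketch with an illustrative point set (no claim about the recognition algorithm):

```python
# Each point of P becomes a hyperedge whose vertices are the C(d, l)
# axis-aligned affine l-subspaces covering it; such a subspace is encoded
# by its l free axes plus the fixed coordinates on the other axes.
from itertools import combinations

def dl_hypergraph(points, l):
    d = len(points[0])
    hyperedges = []
    for p in points:
        edge = set()
        for free_axes in combinations(range(d), l):
            fixed = tuple((i, p[i]) for i in range(d) if i not in free_axes)
            edge.add((free_axes, fixed))      # one vertex per l-subspace
        hyperedges.append(frozenset(edge))
    return hyperedges

# d = 3, l = 1: k = C(3,1) = 3, the three axis-parallel lines per point.
edges = dl_hypergraph([(0, 0, 0), (0, 0, 5)], l=1)
print(len(edges[0]))          # 3 vertices per hyperedge
print(edges[0] & edges[1])    # the shared z-parallel line through (0, 0, *)
```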
We consider Broyden's method and some accelerated schemes for nonlinear equations having a strongly regular singularity of first order with a one-dimensional nullspace. Our two main results are as follows. First, we show that the use of a preceding Newton-like step ensures convergence for starting points in a starlike domain with density 1. This extends the domain of convergence of these methods significantly. Second, we establish that the matrix updates of Broyden's method converge q-linearly with the same asymptotic factor as the iterates. This contributes to the long-standing question of whether the Broyden matrices converge by showing that this is indeed the case for the setting at hand. Furthermore, we prove that the Broyden directions violate uniform linear independence, which implies that existing results for convergence of the Broyden matrices cannot be applied. High-precision numerical experiments confirm the enlarged domain of convergence, the q-linear convergence of the matrix updates, and the lack of uniform linear independence. In addition, they suggest that these results can be extended to singularities of higher order and that Broyden's method can converge r-linearly without converging q-linearly. The underlying code is freely available.
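For orientation, here is a plain Python/NumPy sketch of the classical (good) Broyden update on a small system whose Jacobian at the root has a one-dimensional nullspace; the accelerated schemes and the preceding Newton-like step from the abstract are not reproduced, and the starting point is illustrative.

```python
# Broyden's method with the rank-one secant update
# B_{k+1} = B_k + (y - B_k s) s^T / (s^T s); at a singular root,
# convergence is typically only q-linear.
import numpy as np

def broyden(F, x0, B0, tol=1e-10, maxit=500):
    x, B = x0.astype(float), B0.astype(float)
    Fx = F(x)
    for _ in range(maxit):
        if np.linalg.norm(Fx) < tol:
            break
        s = np.linalg.solve(B, -Fx)                 # quasi-Newton step
        x = x + s
        Fx_new = F(x)
        y = Fx_new - Fx
        B += np.outer(y - B @ s, s) / (s @ s)       # secant update
        Fx = Fx_new
    return x, B

# F(0) = 0 and the Jacobian at 0 is [[1, 0], [0, 0]]: rank deficient
# with a one-dimensional nullspace.
F = lambda x: np.array([x[0] + x[1] ** 2, x[1] ** 2])
root, B = broyden(F, np.array([0.1, 0.1]), np.eye(2))
print(root)
```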
List-decodable codes have been an active topic in theoretical computer science. There are general results on list-decodability up to the Johnson radius, as well as the list-decoding capacity theorem. In this paper we show that the rate, the list-decoding radius and the list size are closely related to the classical topic of covering codes. We prove new, simple but strong general upper bounds for list-decodable codes in general finite metric spaces based on various covering codes. These covering-code upper bounds also apply when the volumes of the balls depend on the centers, not only on the radius. Consequently, any good upper bound on the covering radius or on the size of a covering code implies a good upper bound on the size of list-decodable codes. Our results give exponential improvements on the recent generalized Singleton upper bound of STOC 2020 for Hamming-metric list-decodable codes when the code length is large. A generalized Singleton upper bound for average-radius list-decodable codes also follows from our general covering-code upper bound. Even in the list-size $L=1$ case, our covering-code upper bounds give highly non-trivial upper bounds on the sizes of codes with a given minimum distance. We also propose studying combinatorial covering list-decodable codes as a natural generalization of combinatorial list-decodable codes. We apply our general covering-code upper bounds to list-decodable rank-metric codes, list-decodable subspace codes, list-decodable insertion codes, list-decodable deletion codes, and list-decodable sum-rank-metric codes. Several new and improved results on the non-list-decodability of rank-metric codes, subspace codes and sum-rank-metric codes are obtained.
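The covering argument behind such bounds is short: if $K$ covers the whole space with radius $r$ and every radius-$r$ ball contains at most $L$ codewords of $C$, then $|C| \leq L\cdot|K|$. A tiny exhaustive Python check over a binary Hamming space (parameters illustrative):

```python
# Verify |C| <= L * |K| for a greedy covering code K of radius r and a
# random code C, where L is the true list size of C at radius r.
from itertools import product
import random

def dist(u, v):
    return sum(a != b for a, b in zip(u, v))

n, r = 5, 1
space = list(product([0, 1], repeat=n))

K, uncovered = [], set(space)          # greedy covering code of radius r
while uncovered:
    w = uncovered.pop()
    K.append(w)
    uncovered -= {v for v in uncovered if dist(v, w) <= r}

random.seed(0)
C = random.sample(space, 12)
L = max(sum(dist(c, x) <= r for c in C) for x in space)   # list size of C
assert len(C) <= L * len(K)            # every codeword sits in some ball
print(len(C), L, len(K))               # around K, each holding <= L of them
```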
It is well known [Lov\'{a}sz, 1967] that up to isomorphism a graph $G$ is determined by the homomorphism counts $\hom(F, G)$, i.e., the number of homomorphisms from $F$ to $G$, where $F$ ranges over all graphs. Moreover, it suffices that $F$ ranges over the graphs with at most as many vertices as $G$. Thus, in principle, we can answer any query concerning $G$ by accessing only the $\hom(\cdot,G)$'s instead of $G$ itself. In this paper, we zoom in on those queries that can be answered using a constant number of $\hom(\cdot,G)$'s for every graph $G$. We observe that if a query $\varphi$ is expressible as a Boolean combination of universal sentences in first-order logic, then whether a graph $G$ satisfies $\varphi$ can be determined by the vector \[\overrightarrow{\mathrm{hom}}_{F_1, \ldots, F_k}(G):= \big(\mathrm{hom}(F_1, G), \ldots, \mathrm{hom}(F_k, G)\big),\] where the graphs $F_1,\ldots,F_k$ only depend on $\varphi$. This leads to a query algorithm for $\varphi$ that is non-adaptive in the sense that the $F_i$ are independent of the input $G$. On the other hand, we prove that the existence of an isolated vertex, which is not definable by such a $\varphi$ but is definable in first-order logic, cannot be determined by any $\overrightarrow{\mathrm{hom}}_{F_1, \ldots, F_k}(\cdot)$. These results provide a clear delineation of the power of non-adaptive query algorithms with access to a constant number of $\hom(\cdot, G)$'s. For adaptive query algorithms, i.e., algorithms that might access some $\hom(F_{i+1}, G)$ with $F_{i+1}$ depending on $\hom(F_1, G), \ldots, \hom(F_i, G)$, we show that three homomorphism counts $\hom(\cdot,G)$ are both sufficient and in general necessary to determine the graph $G$. In particular, with three adaptive queries we can answer any question about $G$. Moreover, adaptively accessing two $\hom(\cdot, G)$'s is already enough to detect an isolated vertex.
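Since $\hom(F, G)$ is the only access the query algorithms have to $G$, a brute-force counter makes the setup concrete (illustrative graphs; exponential in $|V(F)|$, which is fine for the constant-size $F$ used as queries):

```python
# hom(F, G): count maps V(F) -> V(G) preserving all edges of F.
from itertools import product

def hom(F_vertices, F_edges, G_vertices, G_edges):
    adj = set(G_edges) | {(v, u) for u, v in G_edges}
    count = 0
    for image in product(G_vertices, repeat=len(F_vertices)):
        phi = dict(zip(F_vertices, image))
        if all((phi[u], phi[v]) in adj for u, v in F_edges):
            count += 1
    return count

# On the path 0-1-2-3: hom(K_1, G) = |V| and hom(K_2, G) = 2|E|.
G_v, G_e = [0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)]
print(hom([0], [], G_v, G_e))           # 4
print(hom([0, 1], [(0, 1)], G_v, G_e))  # 6
```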
We study a novel multi-terminal source coding setup motivated by the biclustering problem. Two separate encoders observe two i.i.d. sequences $X^n$ and $Y^n$, respectively. The goal is to find rate-limited encodings $f(x^n)$ and $g(y^n)$ that maximize the normalized mutual information $I(f(X^n); g(Y^n))/n$. We discuss connections of this problem with hypothesis testing against independence, pattern recognition, and the information bottleneck method. Improving previous cardinality bounds for the inner and outer bounds allows us to thoroughly study the special case of a binary symmetric source and to quantify the gap between the inner and the outer bound in this special case. Furthermore, we investigate a multiple description (MD) extension of the Chief Executive Officer (CEO) problem with a mutual information constraint. Surprisingly, this MD-CEO problem permits a tight single-letter characterization of the achievable region.
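For intuition about the objective, the following Python sketch evaluates $I(f(X); g(Y))$ for the trivial scalar ($n=1$) identity encodings on a binary symmetric source with a hypothetical crossover probability $p$; block encodings under rate constraints are what the paper actually studies.

```python
# I(f(X); g(Y)) with f = g = identity on a doubly symmetric binary source:
# the joint law is P(x, y) = (1 - p)/2 if x == y else p/2, and
# I(X; Y) = 1 - h(p) with h the binary entropy function.
import numpy as np

p = 0.1
joint = np.array([[(1 - p) / 2, p / 2],
                  [p / 2, (1 - p) / 2]])

def mutual_information(j):
    px = j.sum(axis=1, keepdims=True)
    py = j.sum(axis=0, keepdims=True)
    mask = j > 0
    return float((j[mask] * np.log2(j[mask] / (px @ py)[mask])).sum())

h = lambda q: -q * np.log2(q) - (1 - q) * np.log2(1 - q)
print(mutual_information(joint), 1 - h(p))   # the two values agree
```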
By concisely representing a joint function of many variables as a combination of small functions, discrete graphical models (GMs) provide a powerful framework to analyze stochastic and deterministic systems of interacting variables. One of the main queries on such models is to identify the extremum of this joint function. This is known as the Weighted Constraint Satisfaction Problem (WCSP) on deterministic Cost Function Networks (CFNs) and as Maximum a Posteriori (MAP) inference on stochastic Markov Random Fields. Algorithms for approximate WCSP inference typically rely on local consistency algorithms or belief propagation. These methods are intimately related to linear programming (LP) relaxations and are often coupled with reparametrizations defined by the dual solution of the associated LP. Since the seminal work of Goemans and Williamson, it has been well understood that convex SDP relaxations can provide guarantees superior to those of LP, but the inherent computational cost of interior point methods has limited their application. The situation has improved with the introduction of non-convex Burer-Monteiro style methods, which are well suited to the SDP relaxation of combinatorial problems with binary variables (such as MAXCUT, MaxSAT or MAP/Ising). We compute low-rank SDP upper and lower bounds for discrete pairwise graphical models with an arbitrary number of values and arbitrary binary cost functions by extending a Burer-Monteiro style method based on row-by-row updates. We consider a traditional dualized constraint approach and a dedicated Block Coordinate Descent (BCD) approach which avoids introducing large penalty coefficients into the formulation. On increasingly hard and dense WCSP/CFN instances, we observe that the BCD approach can outperform the dualized approach and provide tighter bounds than local consistency and convergent message-passing approaches.
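A minimal Python sketch of row-by-row Burer-Monteiro updates in the simplest binary case, the MAXCUT SDP relaxation, with random weights and illustrative sizes; the paper's extension handles arbitrary pairwise cost functions and domain sizes.

```python
# Mixing-method style coordinate updates: each row v_i of the factor V is
# the exact minimizer of the objective over the unit sphere, holding the
# other rows fixed.
import numpy as np

rng = np.random.default_rng(0)
n = 20
k = 7                               # a rank with k(k+1)/2 >= n suffices
W = rng.uniform(0, 1, (n, n))
W = np.triu(W, 1); W = W + W.T      # symmetric weights, zero diagonal

V = rng.standard_normal((n, k))
V /= np.linalg.norm(V, axis=1, keepdims=True)

for _ in range(200):
    for i in range(n):
        g = W[i] @ V                # coupling of v_i with the other rows
        norm = np.linalg.norm(g)
        if norm > 0:
            V[i] = -g / norm        # exact row minimizer

print(np.sum(W * (1 - V @ V.T)) / 4)   # approaches the SDP value, an
                                       # upper bound on the MAXCUT value
```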
Lipschitz learning is a graph-based semi-supervised learning method in which one extends labels from a labeled to an unlabeled data set by solving the infinity Laplace equation on a weighted graph. In this work we prove uniform convergence rates for solutions of the graph infinity Laplace equation as the number of vertices grows to infinity. Their continuum limits are absolutely minimizing Lipschitz extensions with respect to the geodesic metric of the domain where the graph vertices are sampled. We work under very general assumptions on the graph weights, the set of labeled vertices, and the continuum domain. Our main contribution is that we obtain quantitative convergence rates even for very sparsely connected graphs, as they typically appear in applications like semi-supervised learning. In particular, our framework allows for graph bandwidths down to the connectivity radius. To prove this, we first show a quantitative convergence statement for graph distance functions to geodesic distance functions in the continuum. Using the comparison-with-distance-functions principle, we then transfer these convergence statements to infinity harmonic functions and absolutely minimizing Lipschitz extensions.
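A minimal sketch of the discrete problem: Gauss-Seidel sweeps for the graph infinity Laplace equation with unit weights on a random geometric graph (bandwidth and labels are illustrative); at each unlabeled vertex the solution is the midpoint of its extreme neighbor values.

```python
# Solve the graph infinity Laplace equation by iterating the fixed point
# u_i = (max over neighbors + min over neighbors) / 2 at unlabeled vertices.
import numpy as np

rng = np.random.default_rng(2)
n, h = 200, 0.2                           # sample size and graph bandwidth
X = rng.uniform(0, 1, (n, 2))
D = np.linalg.norm(X[:, None] - X[None, :], axis=2)
adj = (D < h) & (D > 0)                   # unit-weight geometric graph

labels = {0: 0.0, 1: 1.0}                 # labeled vertices and their values
u = np.full(n, 0.5)
for i, val in labels.items():
    u[i] = val

for _ in range(2000):                     # Gauss-Seidel sweeps
    for i in range(n):
        if i in labels or not adj[i].any():
            continue
        u[i] = 0.5 * (u[adj[i]].max() + u[adj[i]].min())

print(u.min(), u.max())                   # the infinity-harmonic extension
```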
Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph structure is available. In practice, however, real-world graphs are often noisy and incomplete, or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted, but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.
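As a sketch of the forward model only (the bilevel training of the edge distribution is the paper's contribution and is not reproduced), one can sample a graph from Bernoulli edge probabilities and push features through a two-layer GCN; all sizes and parameters below are illustrative.

```python
# Sample an adjacency matrix from Bernoulli edge probabilities theta and
# run a two-layer GCN with symmetric normalization on the sample.
import numpy as np

rng = np.random.default_rng(3)
n, f_in, f_hid, f_out = 10, 5, 8, 3
X = rng.standard_normal((n, f_in))            # node features
theta = rng.uniform(0, 1, (n, n))             # (to-be-learned) edge probs
theta = np.triu(theta, 1); theta = theta + theta.T

W1 = 0.1 * rng.standard_normal((f_in, f_hid))
W2 = 0.1 * rng.standard_normal((f_hid, f_out))

def gcn_forward(A, X):
    A_hat = A + np.eye(len(A))                # add self-loops
    d = A_hat.sum(axis=1)
    A_norm = A_hat / np.sqrt(np.outer(d, d))  # D^-1/2 (A + I) D^-1/2
    H = np.maximum(A_norm @ X @ W1, 0.0)      # ReLU layer
    return A_norm @ H @ W2

A = np.triu(rng.uniform(0, 1, (n, n)) < theta, 1).astype(float)
A = A + A.T                                   # one symmetric Bernoulli sample
print(gcn_forward(A, X).shape)                # (10, 3); expectations over the
                                              # edge distribution are sampled
```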
In this paper, we study optimal convergence rates for distributed convex optimization problems over networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m}f_i(\mathbf{x})$ is (i) strongly convex and smooth, (ii) strongly convex, (iii) smooth, or (iv) just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and achieves the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvements of the condition numbers.
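A toy instance of the dual approach: Nesterov-accelerated gradient ascent on the dual of a consensus problem with quadratic $f_i$, where each dual gradient costs one multiplication by the graph Laplacian, i.e., one round of neighbor-to-neighbor communication. The random graph is assumed connected and all constants are illustrative.

```python
# min sum_i 0.5*(x_i - b_i)^2  s.t.  L x = 0  (consensus; L = graph Laplacian)
# Dual ascent: x(y) = b - L y, dual gradient L x(y), with Nesterov momentum.
import numpy as np

rng = np.random.default_rng(4)
m = 8
A = (rng.uniform(0, 1, (m, m)) < 0.4).astype(float)
A = np.triu(A, 1); A = A + A.T                  # random undirected graph
L = np.diag(A.sum(1)) - A                       # its Laplacian

b = rng.standard_normal(m)
step = 1.0 / np.linalg.eigvalsh(L)[-1] ** 2     # 1 / Lipschitz const. of grad
y = z = np.zeros(m)
t = 1.0
for _ in range(500):
    x = b - L @ z                               # primal minimizer at z
    y_new = z + step * (L @ x)                  # dual gradient ascent step
    t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
    z = y_new + (t - 1) / t_new * (y_new - y)   # Nesterov extrapolation
    y, t = y_new, t_new

print(b - L @ y)      # the x_i approach the consensus value mean(b)
print(b.mean())
```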