Due to the lack of a canonical ordering in ${\mathbb R}^d$ for $d>1$, defining multivariate generalizations of the classical univariate ranks has been a long-standing open problem in statistics. Optimal transport has been shown to offer a solution in which multivariate ranks are obtained by transporting data points to a grid that approximates a uniform reference measure (Chernozhukov et al., 2017; Hallin, 2017; Hallin et al., 2021), thereby inducing ranks, signs, and a data-driven ordering of ${\mathbb R}^d$. We take up this new perspective to define and study multivariate analogues of the sign covariance/quadrant statistic, Spearman's rho, Kendall's tau, and van der Waerden covariances. The resulting tests of multivariate independence are fully distribution-free, hence uniformly valid irrespective of the actual (absolutely continuous) distribution of the observations. Our results provide the asymptotic distribution theory for these new test statistics, with asymptotic approximations to critical values to be used for testing independence between random vectors, as well as a power analysis of the resulting tests in an extension of the so-called Konijn model. For the van der Waerden tests, this power analysis includes a multivariate Chernoff--Savage property guaranteeing that, under elliptical generalized Konijn models, the asymptotic relative efficiency with respect to Wilks' classical (pseudo-)Gaussian procedure of our van der Waerden tests is strictly larger than or equal to one, where equality is achieved under Gaussian distributions only. We similarly provide a lower bound for the asymptotic relative efficiency of our Spearman procedure with respect to Wilks' test, thus extending the classical result by Hodges and Lehmann on the asymptotic relative efficiency, in univariate location models, of Wilcoxon tests with respect to the Student ones.
We present a $p$-adic algorithm to recover the lexicographic Gr\"obner basis $\mathcal G$ of an ideal in $\mathbb Q[x,y]$ with a generating set in $\mathbb Z[x,y]$, with a complexity that is less than cubic in terms of the dimension of $\mathbb Q[x,y]/\langle \mathcal G \rangle$ and softly linear in the height of its coefficients. We observe that previous results of Lazard's that use Hermite normal forms to compute Gr\"obner bases for ideals with two generators can be generalized to a set of $t\in \mathbb N^+$ generators. We use this result to obtain a bound on the height of the coefficients of $\mathcal G$, and to control the probability of choosing a \textit{good} prime $p$ to build the $p$-adic expansion of $\mathcal G$.
We consider the problem of approximating a function from $L^2$ by an element of a given $m$-dimensional space $V_m$, associated with some feature map $\varphi$, using evaluations of the function at random points $x_1,\dots,x_n$. After recalling some results on optimal weighted least-squares using independent and identically distributed points, we consider weighted least-squares using projection determinantal point processes (DPP) or volume sampling. These distributions introduce dependence between the points that promotes diversity in the selected features $\varphi(x_i)$. We first provide a generalized version of volume-rescaled sampling yielding quasi-optimality results in expectation with a number of samples $n = O(m\log(m))$, meaning that the expected $L^2$ error is bounded by a constant times the best approximation error in $L^2$. Assuming further that the function is in some normed vector space $H$ continuously embedded in $L^2$, we prove that the approximation error is almost surely bounded by the best approximation error measured in the $H$-norm. This includes the cases of functions from $L^\infty$ or reproducing kernel Hilbert spaces. Finally, we present an alternative strategy consisting in using independent repetitions of projection DPP (or volume sampling), yielding error bounds similar to those obtained with i.i.d. or volume sampling, but in practice with a much lower number of samples. Numerical experiments illustrate the performance of the different strategies.
We prove the existence of a computable function $f\colon\mathbb{N}\to\mathbb{N}$ such that, for every integer $k$, every digraph $D$ either contains a collection $\mathcal{C}$ of $k$ directed cycles of even length such that no vertex of $D$ belongs to more than four cycles in $\mathcal{C}$, or has a set $S\subseteq V(D)$ of size at most $f(k)$ such that $D-S$ has no directed cycle of even length. Moreover, we provide an algorithm that finds one of the two outcomes of this statement in time $g(k)n^{\mathcal{O}(1)}$ for some computable function $g\colon \mathbb{N}\to\mathbb{N}$. Our result unites two deep fields of research from the algorithmic theory for digraphs: the study of the Erd\H{o}s-P\'osa property of digraphs and the study of the Even Dicycle Problem. The latter is the decision problem which asks if a given digraph contains an even dicycle and can be traced back to a question of P\'olya from 1913. It remained open until a polynomial time algorithm was finally found by Robertson, Seymour, and Thomas (Ann. of Math. (2) 1999) and, independently, McCuaig (Electron. J. Combin. 2004; announced jointly at STOC 1997). The Even Dicycle Problem is equivalent to the recognition problem of Pfaffian bipartite graphs and has applications even beyond discrete mathematics and theoretical computer science. On the other hand, Younger's Conjecture (1973) states that dicycles have the Erd\H{o}s-P\'osa property. The conjecture was proven more than two decades later by Reed, Robertson, Seymour, and Thomas (Combinatorica 1996) and opened the path for structural digraph theory as well as the algorithmic study of the directed feedback vertex set problem. Our approach builds upon the techniques used to resolve both problems and combines them into a powerful structural theorem that yields further algorithmic applications for other prominent problems.
Given a stochastic matrix $P$ partitioned in four blocks $P_{ij}$, $i,j=1,2$, Kemeny's constant $\kappa(P)$ is expressed in terms of Kemeny's constants of the stochastic complements $P_1=P_{11}+P_{12}(I-P_{22})^{-1}P_{21}$ and $P_2=P_{22}+P_{21}(I-P_{11})^{-1}P_{12}$. Specific cases concerning periodic Markov chains and Kronecker products of stochastic matrices are investigated. Bounds on Kemeny's constant of perturbed matrices are given. Relying on these theoretical results, a divide-and-conquer algorithm for the efficient computation of Kemeny's constant of graphs is designed. Numerical experiments performed on real-world problems show the high efficiency and reliability of this algorithm.
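To make the objects in this abstract concrete, the following is a minimal sketch, not the divide-and-conquer algorithm of the paper. It computes a stochastic complement $P_1$ from the block formula above and evaluates Kemeny's constant by a direct matrix inversion. Two assumptions are made here: the convention $\kappa(P)=\sum_{i\ge 2} 1/(1-\lambda_i)$ over the non-unit eigenvalues of an irreducible $P$ (some authors add 1), and the identity $\kappa(P)=\mathrm{trace}\big((I-P+\tfrac1n \mathbf{1}\mathbf{1}^T)^{-1}\big)-1$ used to evaluate it. The function names are hypothetical, and exact rational arithmetic is used only to keep the example checkable.

```python
from fractions import Fraction


def identity(n):
    """n x n identity matrix with exact rational entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def inverse(A):
    """Gauss-Jordan inverse over the rationals (fails on a singular matrix)."""
    n = len(A)
    M = [[Fraction(x) for x in row] + e for row, e in zip(A, identity(n))]
    for c in range(n):
        pivot = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[pivot] = M[pivot], M[c]
        p = M[c][c]
        M[c] = [x / p for x in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]


def kemeny(P):
    """Kemeny's constant, assumed convention: sum of 1/(1 - lambda_i), lambda_i != 1.

    Assumes P is an irreducible stochastic matrix.
    """
    n = len(P)
    A = [[Fraction(int(i == j)) - Fraction(P[i][j]) + Fraction(1, n)
          for j in range(n)] for i in range(n)]
    Z = inverse(A)
    return sum(Z[i][i] for i in range(n)) - 1


def stochastic_complement(P, keep):
    """P11 + P12 (I - P22)^{-1} P21, where block 1 is indexed by `keep`."""
    rest = [i for i in range(len(P)) if i not in keep]

    def sub(rows, cols):
        return [[Fraction(P[i][j]) for j in cols] for i in rows]

    P11, P12 = sub(keep, keep), sub(keep, rest)
    P21, P22 = sub(rest, keep), sub(rest, rest)
    m = len(rest)
    I_minus_P22 = [[Fraction(int(i == j)) - P22[i][j] for j in range(m)]
                   for i in range(m)]
    corr = matmul(matmul(P12, inverse(I_minus_P22)), P21)
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(P11, corr)]


# Illustrative input: random walk on a 4-cycle (periodic but irreducible).
h = Fraction(1, 2)
P = [[0, h, h, 0],
     [h, 0, 0, h],
     [h, 0, 0, h],
     [0, h, h, 0]]
P1 = stochastic_complement(P, [0, 1])
print(kemeny(P), P1, kemeny(P1))
```

The direct inversion costs cubic time in the matrix size, which is the baseline that a block-wise approach built on stochastic complements aims to improve on.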
A linearly ordered (LO) $k$-colouring of a hypergraph is a colouring of its vertices with colours $1, \dots, k$ such that each edge contains a unique maximal colour. Deciding whether an input hypergraph admits an LO $k$-colouring with a fixed number of colours is NP-complete (and in the special case of graphs, LO colouring coincides with the usual graph colouring). Here, we investigate the complexity of approximating the `linearly ordered chromatic number' of a hypergraph. We prove that the following promise problem is NP-complete: Given a 3-uniform hypergraph, distinguish between the case that it is LO $3$-colourable, and the case that it is not even LO $4$-colourable. We prove this result by a combination of algebraic, topological, and combinatorial methods, building on and extending a topological approach for studying approximate graph colouring introduced by Krokhin, Opr\v{s}al, Wrochna, and \v{Z}ivn\'y (2023).
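The definition of an LO colouring can be illustrated with a small brute-force sketch. It checks the "unique maximal colour" condition edge by edge and finds the LO chromatic number of a tiny hypergraph by exhaustive search. This is only an illustration of the definition, with hypothetical function names. It takes exponential time and is unrelated to the hardness proof.

```python
from itertools import combinations, product


def is_lo_colouring(edges, colour):
    """True if every edge attains its maximal colour on exactly one vertex."""
    for edge in edges:
        cols = [colour[v] for v in edge]
        if cols.count(max(cols)) != 1:
            return False
    return True


def lo_chromatic_number(vertices, edges):
    """Smallest k admitting an LO k-colouring, by exhaustive search."""
    vertices = list(vertices)
    edges = [tuple(e) for e in edges]
    for k in range(1, len(vertices) + 1):
        for assignment in product(range(1, k + 1), repeat=len(vertices)):
            if is_lo_colouring(edges, dict(zip(vertices, assignment))):
                return k
    return None


# Complete 3-uniform hypergraph on 4 vertices.
k4_3 = list(combinations(range(4), 3))
print(lo_chromatic_number(range(4), k4_3))

# For 2-element edges the condition reduces to proper graph colouring.
triangle = [(0, 1), (1, 2), (0, 2)]
print(lo_chromatic_number(range(3), triangle))
```

Colouring all vertices with distinct colours always satisfies the condition, so the search terminates with some $k$ at most the number of vertices.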
Sidon spaces have been introduced by Bachoc, Serra and Z\'emor as the $q$-analogue of Sidon sets, classical combinatorial objects introduced by Simon Szidon. In 2018 Roth, Raviv and Tamo introduced the notion of $r$-Sidon spaces, as an extension of Sidon spaces, which may be seen as the $q$-analogue of $B_r$-sets, a generalization of classical Sidon sets. Thanks to their work, interest in Sidon spaces has increased quickly because of the connection with cyclic subspace codes that they pointed out. This class of codes turned out to be of interest since it can be used in random linear network coding. In this work we focus on a particular class of them, the one-orbit cyclic subspace codes, through the investigation of some properties of Sidon spaces and $r$-Sidon spaces, providing some upper and lower bounds on the possible dimension of their \textit{r-span} and showing explicit constructions in the case in which the upper bound is achieved. Moreover, we provide further constructions of $r$-Sidon spaces, arising from algebraic and combinatorial objects, and we show examples of $B_r$-sets constructed by means of them.
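For readers unfamiliar with the classical objects behind this abstract, here is a short sketch of the integer versions. A $B_r$-set is taken to be a set in which all sums of $r$ elements, repetition allowed, are distinct; Sidon sets are the case $r=2$. The checker below is a hypothetical helper for integer sets only. It does not touch the $q$-analogues (Sidon spaces) or the constructions of the paper.

```python
from itertools import combinations_with_replacement


def is_Br_set(elements, r):
    """True if all sums of r elements (repetition allowed) are distinct."""
    distinct = sorted(set(elements))
    sums = [sum(c) for c in combinations_with_replacement(distinct, r)]
    return len(sums) == len(set(sums))


print(is_Br_set([1, 2, 4, 8], 2))    # Sidon set: all pairwise sums differ
print(is_Br_set([1, 2, 4, 8], 3))    # fails: 1 + 1 + 4 == 2 + 2 + 2
print(is_Br_set([1, 4, 16, 64], 3))  # powers of 4: base-4 digits stay <= 3
```

The second call shows that being a Sidon set does not imply being a $B_3$-set, which is why the parameter $r$ matters in the generalization.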
We consider the problem of decoding corrupted error correcting codes with NC$^0[\oplus]$ circuits in the classical and quantum settings. We show that any such classical circuit can correctly recover only a vanishingly small fraction of messages, if the codewords are sent over a noisy channel with positive error rate. Previously this was known only for linear codes with large dual distance, whereas our result applies to any code. By contrast, we give a simple quantum circuit that correctly decodes the Hadamard code with probability $\Omega(\varepsilon^2)$ even if a $(1/2 - \varepsilon)$-fraction of a codeword is adversarially corrupted. Our classical hardness result is based on an equidistribution phenomenon for multivariate polynomials over a finite field under biased input-distributions. This is proved using a structure-versus-randomness strategy based on a new notion of rank for high-dimensional polynomial maps that may be of independent interest. Our quantum circuit is inspired by a non-local version of the Bernstein-Vazirani problem, a technique to generate ``poor man's cat states'' by Watts et al., and a constant-depth quantum circuit for the OR function by Takahashi and Tani.
In this article, a heuristic approach is used to determine the best approximate distribution of $\dfrac{Y_1}{Y_1 + Y_2}$, given that $Y_1,Y_2$ are independent, and each of $Y_1$ and $Y_2$ is distributed as the $\mathcal{F}$-distribution with common denominator degrees of freedom. The proposed approximate distribution is subjected to graphical comparisons and distributional tests. The proposed distribution is used to derive the distribution of the elemental regression weight $\omega_E$, where $E$ is the elemental regression set.
We show that the cohomology of the Regge complex in three dimensions is isomorphic to $\mathcal{H}^{{\scriptscriptstyle \bullet}}_{dR}(\Omega)\otimes\mathcal{RM}$, the infinitesimal-rigid-body-motion-valued de~Rham cohomology. Based on an observation that the twisted de~Rham complex extends the elasticity (Riemannian deformation) complex to the linearized version of coframes, connection 1-forms, curvature and Cartan's torsion, we construct a discrete version of linearized Riemann-Cartan geometry on any triangulation and determine its cohomology.
We continue the study of $(\mathrm{tw},\omega)$-bounded graph classes, that is, hereditary graph classes in which the treewidth can only be large due to the presence of a large clique, with the goal of understanding the extent to which this property has useful algorithmic implications for the Independent Set and related problems. In the previous paper of the series [Dallard, Milani\v{c}, and \v{S}torgel, Treewidth versus clique number. II. Tree-independence number], we introduced the tree-independence number, a min-max graph invariant related to tree decompositions. Bounded tree-independence number implies both $(\mathrm{tw},\omega)$-boundedness and the existence of a polynomial-time algorithm for the Maximum Weight Independent Set problem, provided that the input graph is given together with a tree decomposition with bounded independence number. In this paper, we consider six graph containment relations and for each of them characterize the graphs $H$ for which any graph excluding $H$ with respect to the relation admits a tree decomposition with bounded independence number. The induced minor relation is of particular interest: we show that excluding either a $K_5$ minus an edge or the $4$-wheel implies the existence of a tree decomposition in which every bag is a clique plus at most $3$ vertices, while excluding a complete bipartite graph $K_{2,q}$ implies the existence of a tree decomposition with independence number at most $2(q-1)$. Our constructive proofs are obtained using a variety of tools, including $\ell$-refined tree decompositions, SPQR trees, and potential maximal cliques. They imply polynomial-time algorithms for the Independent Set and related problems in an infinite family of graph classes; in particular, the results apply to the class of $1$-perfectly orientable graphs, answering a question of Beisegel, Chudnovsky, Gurvich, Milani\v{c}, and Servatius from 2019.