A linear code $C$ over $\mathbb{F}_q$ is called $\Delta$-divisible if the Hamming weights $\operatorname{wt}(c)$ of all codewords $c \in C$ are divisible by $\Delta$. The possible effective lengths of $q^r$-divisible codes have been completely characterized for each prime power $q$ and each non-negative integer $r$. The study of $\Delta$-divisible codes was initiated by Harold Ward. If $c$ divides $\Delta$ but is coprime to $q$, then each $\Delta$-divisible code $C$ over $\mathbb{F}_q$ is the $c$-fold repetition of a $\Delta/c$-divisible code. Here we determine the possible effective lengths of $p^r$-divisible codes over finite fields of characteristic $p$, where $r\in\mathbb{N}$ but $p^r$ is not a power of the field size, i.e., the missing cases.
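The definition of $\Delta$-divisibility can be checked by brute force on small codes. The following is a minimal sketch, assuming a prime field size $q$ (so that arithmetic modulo $q$ is field arithmetic) and a code given by generator rows; the function names `codewords` and `is_divisible` are illustrative and not taken from the paper.

```python
from itertools import product


def codewords(generator_rows, q=2):
    """Yield every codeword spanned by the generator rows over the prime field F_q."""
    n = len(generator_rows[0])
    for coeffs in product(range(q), repeat=len(generator_rows)):
        yield tuple(
            sum(a * row[j] for a, row in zip(coeffs, generator_rows)) % q
            for j in range(n)
        )


def is_divisible(generator_rows, delta, q=2):
    """Return True if the Hamming weight of every codeword is a multiple of delta."""
    return all(
        sum(1 for x in c if x != 0) % delta == 0
        for c in codewords(generator_rows, q)
    )


# Binary [7,3] simplex code: the columns are all nonzero vectors of F_2^3,
# so every nonzero codeword has Hamming weight 4.
simplex = [
    (1, 0, 1, 0, 1, 0, 1),
    (0, 1, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]

print(is_divisible(simplex, 4))  # True
print(is_divisible(simplex, 8))  # False
```

The enumeration visits all $q^k$ codewords of a $k$-dimensional code, so it is only meant to illustrate the definition on toy examples.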
We present a $p$-adic algorithm to recover the lexicographic Gr\"obner basis $\mathcal G$ of an ideal in $\mathbb Q[x,y]$ with a generating set in $\mathbb Z[x,y]$, with a complexity that is less than cubic in terms of the dimension of $\mathbb Q[x,y]/\langle \mathcal G \rangle$ and softly linear in the height of its coefficients. We observe that previous results of Lazard's that use Hermite normal forms to compute Gr\"obner bases for ideals with two generators can be generalized to a set of $t\in \mathbb N^+$ generators. We use this result to obtain a bound on the height of the coefficients of $\mathcal G$, and to control the probability of choosing a \textit{good} prime $p$ to build the $p$-adic expansion of $\mathcal G$.
We consider the problem of approximating a function from $L^2$ by an element of a given $m$-dimensional space $V_m$, associated with some feature map $\varphi$, using evaluations of the function at random points $x_1,\dots,x_n$. After recalling some results on optimal weighted least-squares using independent and identically distributed points, we consider weighted least-squares using projection determinantal point processes (DPP) or volume sampling. These distributions introduce dependence between the points that promotes diversity in the selected features $\varphi(x_i)$. We first provide a generalized version of volume-rescaled sampling yielding quasi-optimality results in expectation with a number of samples $n = O(m\log(m))$, meaning that the expected $L^2$ error is bounded by a constant times the best approximation error in $L^2$. Further assuming that the function is in some normed vector space $H$ continuously embedded in $L^2$, we prove that the approximation error is almost surely bounded by the best approximation error measured in the $H$-norm. This includes the cases of functions from $L^\infty$ or reproducing kernel Hilbert spaces. Finally, we present an alternative strategy consisting in using independent repetitions of projection DPP (or volume sampling), yielding similar error bounds as with i.i.d. or volume sampling, but in practice with a much lower number of samples. Numerical experiments illustrate the performance of the different strategies.
We prove the existence of a computable function $f\colon\mathbb{N}\to\mathbb{N}$ such that, for every integer $k$, every digraph $D$ either contains a collection $\mathcal{C}$ of $k$ directed cycles of even length such that no vertex of $D$ belongs to more than four cycles in $\mathcal{C}$, or there exists a set $S\subseteq V(D)$ of size at most $f(k)$ such that $D-S$ has no directed cycle of even length. Moreover, we provide an algorithm that finds one of the two outcomes of this statement in time $g(k)n^{\mathcal{O}(1)}$ for some computable function $g\colon \mathbb{N}\to\mathbb{N}$. Our result unites two deep fields of research from the algorithmic theory for digraphs: the study of the Erd\H{o}s-P\'osa property of digraphs and the study of the Even Dicycle Problem. The latter is the decision problem which asks if a given digraph contains an even dicycle and can be traced back to a question of P\'olya from 1913. It remained open until a polynomial time algorithm was finally found by Robertson, Seymour, and Thomas (Ann. of Math. (2) 1999) and, independently, McCuaig (Electron. J. Combin. 2004; announced jointly at STOC 1997). The Even Dicycle Problem is equivalent to the recognition problem of Pfaffian bipartite graphs and has applications even beyond discrete mathematics and theoretical computer science. On the other hand, Younger's Conjecture (1973) states that dicycles have the Erd\H{o}s-P\'osa property. The conjecture was proven more than two decades later by Reed, Robertson, Seymour, and Thomas (Combinatorica 1996) and opened the path for structural digraph theory as well as the algorithmic study of the directed feedback vertex set problem. Our approach builds upon the techniques used to resolve both problems and combines them into a powerful structural theorem that yields further algorithmic applications for other prominent problems.
Temporal knowledge graphs represent temporal facts $(s,p,o,\tau)$ relating a subject $s$ and an object $o$ via a relation label $p$ at time $\tau$, where $\tau$ could be a time point or time interval. Temporal knowledge graphs may exhibit static temporal patterns at distinct points in time and dynamic temporal patterns between different timestamps. In order to learn a rich set of static and dynamic temporal patterns and apply them for inference, several embedding approaches have been suggested in the literature. However, as most of them resort to a single underlying embedding space, their capability to model all kinds of temporal patterns is severely limited by having to adhere to the geometric properties of that one embedding space. We lift this limitation by an embedding approach that maps temporal facts into a product space of several heterogeneous geometric subspaces with distinct geometric properties, i.e.\ Complex, Dual, and Split-complex spaces. In addition, we propose a temporal-geometric attention mechanism to integrate information from different geometric subspaces conveniently according to the captured relational and temporal information. Experimental results on standard temporal benchmark datasets favorably evaluate our approach against state-of-the-art models.
Given a stochastic matrix $P$ partitioned in four blocks $P_{ij}$, $i,j=1,2$, Kemeny's constant $\kappa(P)$ is expressed in terms of Kemeny's constants of the stochastic complements $P_1=P_{11}+P_{12}(I-P_{22})^{-1}P_{21}$ and $P_2=P_{22}+P_{21}(I-P_{11})^{-1}P_{12}$. Specific cases concerning periodic Markov chains and Kronecker products of stochastic matrices are investigated. Bounds on Kemeny's constant of perturbed matrices are given. Relying on these theoretical results, a divide-and-conquer algorithm for the efficient computation of Kemeny's constant of graphs is designed. Numerical experiments performed on real-world problems show the high efficiency and reliability of this algorithm.
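The two basic objects in this abstract, the stochastic complement and Kemeny's constant, can be computed directly for small matrices. The sketch below uses exact rational arithmetic and assumes the convention $\kappa(P)=\operatorname{trace}\big((I-P+\mathbf{1}\pi^T)^{-1}\big)-1$, where $\pi$ is the stationary distribution of an irreducible $P$; other normalizations differ from this one by an additive constant. It does not reproduce the formula relating $\kappa(P)$ to $\kappa(P_1)$ and $\kappa(P_2)$, and the helper names (`solve`, `stationary`, `kemeny`, `stochastic_complement`) are illustrative only.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination over the rationals."""
    n = len(A)
    M = [list(map(Fraction, A[i])) + list(map(Fraction, B[i])) for i in range(n)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def stationary(P):
    """Stationary distribution of an irreducible stochastic matrix P."""
    n = len(P)
    # Rows encode pi^T (I - P) = 0; the last equation is replaced by sum(pi) = 1.
    A = [[Fraction(int(i == j)) - P[j][i] for j in range(n)] for i in range(n)]
    A[-1] = [Fraction(1)] * n
    b = [[Fraction(0)] for _ in range(n - 1)] + [[Fraction(1)]]
    return [row[0] for row in solve(A, b)]


def kemeny(P):
    """Kemeny's constant under the assumed convention trace(Z) - 1."""
    n = len(P)
    pi = stationary(P)
    M = [[Fraction(int(i == j)) - P[i][j] + pi[j] for j in range(n)] for i in range(n)]
    Z = solve(M, identity(n))
    return sum(Z[i][i] for i in range(n)) - 1


def stochastic_complement(P, keep):
    """Return P11 + P12 (I - P22)^{-1} P21 for the index set `keep`."""
    drop = [i for i in range(len(P)) if i not in keep]
    sub = lambda rows, cols: [[P[i][j] for j in cols] for i in rows]
    P11, P12 = sub(keep, keep), sub(keep, drop)
    P21, P22 = sub(drop, keep), sub(drop, drop)
    I = identity(len(drop))
    I_minus_P22 = [[d - p for d, p in zip(ri, rp)] for ri, rp in zip(I, P22)]
    corr = matmul(P12, solve(I_minus_P22, P21))
    return [[a + c for a, c in zip(ra, rc)] for ra, rc in zip(P11, corr)]


F = Fraction
P = [[F(1, 2), F(1, 4), F(1, 4)],
     [F(1, 3), F(1, 3), F(1, 3)],
     [F(1, 4), F(1, 4), F(1, 2)]]
P1 = stochastic_complement(P, [0, 1])
print(P1)          # rows [5/8, 3/8] and [1/2, 1/2], as Fraction objects
print(kemeny(P1))  # 8/7
```

For a two-state chain with off-diagonal entries $a$ and $b$, this convention gives $\kappa = 1/(a+b)$, which matches the value $8/7$ obtained for `P1` above.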
Graphs of bounded degeneracy are known to contain induced paths of order $\Omega(\log \log n)$ when they contain a path of order $n$, as proved by Ne\v{s}et\v{r}il and Ossona de Mendez (2012). In 2016 Esperet, Lemoine, and Maffray conjectured that this bound could be improved to $\Omega((\log n)^c)$ for some constant $c>0$ depending on the degeneracy. We disprove this conjecture by constructing, for arbitrarily large values of $n$, a graph that is 2-degenerate, has a path of order $n$, and where all induced paths have order $O((\log \log n)^2)$. We also show that the graphs we construct have linearly bounded coloring numbers.
A linearly ordered (LO) $k$-colouring of a hypergraph is a colouring of its vertices with colours $1, \dots, k$ such that each edge contains a unique maximal colour. Deciding whether an input hypergraph admits an LO $k$-colouring with a fixed number of colours is NP-complete (and in the special case of graphs, LO colouring coincides with the usual graph colouring). Here, we investigate the complexity of approximating the `linearly ordered chromatic number' of a hypergraph. We prove that the following promise problem is NP-complete: Given a 3-uniform hypergraph, distinguish between the case that it is LO $3$-colourable, and the case that it is not even LO $4$-colourable. We prove this result by a combination of algebraic, topological, and combinatorial methods, building on and extending a topological approach for studying approximate graph colouring introduced by Krokhin, Opr\v{s}al, Wrochna, and \v{Z}ivn\'y (2023).
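The definition of an LO colouring is easy to test on small instances. The following is a brute-force sketch, not the method of the paper: it checks the unique-maximum condition edge by edge and searches all $k^{|V|}$ colourings. The function names and the example hypergraph are chosen for illustration.

```python
from itertools import combinations, product


def is_lo_colouring(colouring, edges):
    """Check that the maximal colour in every edge is attained by exactly one vertex."""
    for edge in edges:
        colours = [colouring[v] for v in edge]
        if colours.count(max(colours)) != 1:
            return False
    return True


def lo_colourable(vertices, edges, k):
    """Return an LO k-colouring as a dict, or None if none exists (exhaustive search)."""
    vertices = list(vertices)
    for assignment in product(range(1, k + 1), repeat=len(vertices)):
        colouring = dict(zip(vertices, assignment))
        if is_lo_colouring(colouring, edges):
            return colouring
    return None


# Complete 3-uniform hypergraph on 4 vertices: every triple is an edge.
edges = list(combinations(range(4), 3))

print(lo_colourable(range(4), edges, 2))              # None
print(lo_colourable(range(4), edges, 3) is not None)  # True
```

Two colours fail here because every triple would need exactly one vertex of colour 2, which no subset of four vertices achieves; the colouring $(3,2,1,1)$ shows that three colours suffice.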
The LASSO is a recent technique for variable selection in the regression model \bean y & = & X\beta + z, \eean where $X\in \R^{n\times p}$ and $z$ is a centered Gaussian i.i.d. noise vector $\mathcal N(0,\sigma^2I)$. The LASSO has been proved to achieve remarkable properties such as exact support recovery of sparse vectors when the columns are sufficiently incoherent and low prediction error under even less stringent conditions. However, many matrices do not satisfy small coherence in practical applications and the LASSO estimator may thus suffer from what is known as the slow rate regime. The goal of the present paper is to study the LASSO from a slightly different perspective by proposing a mixture model for the design matrix which is able to capture in a natural way the potentially clustered nature of the columns in many practical situations. In this model, the columns of the design matrix are drawn from a Gaussian mixture model. Instead of requiring incoherence for the design matrix $X$, we only require incoherence of the much smaller matrix of the mixture's centers. Our main result states that $X\beta$ can be estimated with the same precision as for incoherent designs except for a correction term depending on the maximal variance in the mixture model.
For a permutation $\pi: [K]\rightarrow [K]$, a sequence $f: \{1,2,\cdots, n\}\rightarrow \mathbb R$ contains a $\pi$-pattern of size $K$, if there is a sequence of indices $(i_1, i_2, \cdots, i_K)$ ($i_1<i_2<\cdots<i_K$), satisfying that $f(i_a)<f(i_b)$ if $\pi(a)<\pi(b)$, for $a,b\in [K]$. Otherwise, $f$ is referred to as $\pi$-free. For the special case where $\pi = (1,2,\cdots, K)$, it is referred to as the monotone pattern. \cite{newman2017testing} initiated the study of testing $\pi$-freeness with one-sided error. They focused on two specific problems, testing the monotone permutations and the $(1,3,2)$ permutation. For the problem of testing monotone permutation $(1,2,\cdots,K)$, \cite{ben2019finding} improved the $(\log n)^{O(K^2)}$ non-adaptive query complexity of \cite{newman2017testing} to $O((\log n)^{\lfloor \log_{2} K\rfloor})$. Further, \cite{ben2019optimal} proposed an adaptive algorithm with $O(\log n)$ query complexity. However, no progress has yet been made on the problem of testing $(1,3,2)$-freeness. In this work, we present an adaptive algorithm for testing $(1,3,2)$-freeness. The query complexity of our algorithm is $O(\epsilon^{-2}\log^4 n)$, which significantly improves over the $O(\epsilon^{-7}\log^{26}n)$-query adaptive algorithm of \cite{newman2017testing}. This improvement is mainly achieved by the proposal of a new structure embedded in the patterns.
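The definition of a $\pi$-pattern can be made concrete with an exhaustive check. The sketch below tests all index tuples $i_1<\dots<i_K$ and therefore takes $O(n^K)$ time; it illustrates the definition only and is unrelated to the sublinear-query tester described in the abstract. The name `contains_pattern` is an assumption made for this example.

```python
from itertools import combinations


def contains_pattern(f, pi):
    """Return True if the sequence f contains the pattern pi (given in one-line notation)."""
    K = len(pi)
    for idx in combinations(range(len(f)), K):
        # Require f(i_a) < f(i_b) whenever pi(a) < pi(b).
        if all(
            f[idx[a]] < f[idx[b]]
            for a in range(K)
            for b in range(K)
            if pi[a] < pi[b]
        ):
            return True
    return False


print(contains_pattern([2, 1, 4, 3], (1, 3, 2)))  # True, via the values 2, 4, 3
print(contains_pattern([1, 2, 3, 4], (1, 3, 2)))  # False: increasing sequences are (1,3,2)-free
print(contains_pattern([1, 2, 3, 4], (1, 2, 3)))  # True: monotone pattern
```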
Given a Binary Decision Diagram $B$ of a Boolean function $\varphi$ in $n$ variables, it is well known that all $\varphi$-models can be enumerated in output polynomial time, and in a compressed way (using don't-care symbols). We show that all $N$ $\varphi$-models of fixed Hamming weight $k$ can likewise be enumerated in time polynomial in $n$, $|B|$, and $N$. Furthermore, the use of novel wildcards again enables a compressed enumeration of these models.
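One standard way to enumerate fixed-weight models without dead ends is to memoize, for each node, level, and remaining weight, whether a model is still reachable, and to descend only into feasible branches. The sketch below follows that idea under several assumptions: the BDD is ordered over variables $0,\dots,n-1$, it is stored as a dictionary `nodes` mapping a node id to `(var, low, high)`, and the ids `0` and `1` are reserved for the False and True terminals. It outputs models one by one and does not implement the wildcard-based compression mentioned in the abstract, nor is it claimed to be the paper's algorithm.

```python
from functools import lru_cache


def models_of_weight(nodes, root, n, k):
    """Yield all models (0/1 tuples of length n) of the BDD with exactly k ones."""

    def var_of(u):
        return n if u in (0, 1) else nodes[u][0]

    def children(u, level):
        # A variable skipped by the BDD is a don't-care: both branches stay at u.
        if var_of(u) > level:
            return u, u
        _, low, high = nodes[u]
        return low, high

    @lru_cache(maxsize=None)
    def feasible(u, level, need):
        """Can variables level..n-1 be set with exactly `need` ones so that u reaches True?"""
        if u == 0 or need < 0 or need > n - level:
            return False
        if level == n:
            return need == 0  # u is the True terminal at this point
        low, high = children(u, level)
        return feasible(low, level + 1, need) or feasible(high, level + 1, need - 1)

    def walk(u, level, need, prefix):
        if not feasible(u, level, need):
            return
        if level == n:
            yield tuple(prefix)
            return
        low, high = children(u, level)
        yield from walk(low, level + 1, need, prefix + [0])
        yield from walk(high, level + 1, need - 1, prefix + [1])

    yield from walk(root, 0, k, [])


# BDD for x0 OR x2 on n = 3 variables (x1 is a don't-care).
nodes = {
    2: (0, 3, 1),  # test x0: low -> node 3, high -> True
    3: (2, 0, 1),  # test x2: low -> False, high -> True
}

print(sorted(models_of_weight(nodes, 2, 3, 1)))  # [(0, 0, 1), (1, 0, 0)]
print(sorted(models_of_weight(nodes, 2, 3, 2)))  # [(0, 1, 1), (1, 0, 1), (1, 1, 0)]
```

Because `walk` only enters branches for which `feasible` holds, every explored path ends in a model, and the memo table has at most $(|B|+2)\cdot(n+1)\cdot(k+2)$ entries.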