In 2017, Aharoni proposed the following generalization of the Caccetta-H\"{a}ggkvist conjecture: if $G$ is a simple $n$-vertex edge-colored graph with $n$ color classes of size at least $r$, then $G$ contains a rainbow cycle of length at most $\lceil n/r \rceil$. In this paper, we prove that, for fixed $r$, Aharoni's conjecture holds up to an additive constant. Specifically, we show that for each fixed $r \geq 1$, there exists a constant $c_r$ such that if $G$ is a simple $n$-vertex edge-colored graph with $n$ color classes of size at least $r$, then $G$ contains a rainbow cycle of length at most $n/r + c_r$.
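To make the two bounds concrete, consider the values $n=10$ and $r=3$, which are chosen here purely for illustration and do not come from the paper. The conjectured bound is
\[
  \lceil n/r \rceil = \lceil 10/3 \rceil = 4,
\]
whereas the theorem guarantees a rainbow cycle of length at most $n/r + c_r = 10/3 + c_3$, where the value of $c_3$ is not given in this abstract. Since $n/r \le \lceil n/r \rceil$, the proved bound exceeds the conjectured one by at most $c_r$ for every $n$.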
The classical Heawood inequality states that if the complete graph $K_n$ on $n$ vertices is embeddable in the sphere with $g$ handles, then $g \ge\dfrac{(n-3)(n-4)}{12}$. A higher-dimensional analogue of the Heawood inequality is the K\"uhnel conjecture. In a simplified form it states that for every integer $k>0$ there is $c_k>0$ such that if the union of the $k$-faces of the $n$-simplex embeds into the connected sum of $g$ copies of the Cartesian product $S^k\times S^k$ of two $k$-dimensional spheres, then $g\ge c_k n^{k+1}$. For $k>1$ only linear estimates were known. We present a quadratic estimate $g\ge c_k n^2$. The proof is based on a beautiful and fruitful interplay between geometric topology, combinatorics and linear algebra.
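As a small worked instance of the classical inequality (the values of $n$ are chosen here for illustration):
\[
  n=7:\quad g \ge \frac{(7-3)(7-4)}{12} = 1,
  \qquad
  n=8:\quad g \ge \frac{(8-3)(8-4)}{12} = \frac{5}{3}, \ \text{so } g \ge 2 \text{ because } g \text{ is an integer}.
\]
Thus $K_7$ does not embed in the sphere, and $K_8$ does not embed in the torus. The Heawood bound is quadratic in $n$ and corresponds to the exponent $n^{k+1}$ with $k=1$. For $k>1$ the conjectured bound $c_k n^{k+1}$ is therefore stronger than the quadratic estimate $c_k n^2$ stated above.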
This thesis is a corpus-based, quantitative, and typological analysis of the functions of Early Slavic participle constructions and their finite competitors (\textit{jegda}-`when'-clauses). The first part leverages detailed linguistic annotation on Early Slavic corpora at the morphosyntactic, dependency, information-structural, and lexical levels to obtain indirect evidence for different potential functions of participle clauses and their main finite competitor, and to understand the roles of compositionality and default discourse reasoning as explanations for the distribution of participle constructions and \textit{jegda}-clauses in the corpus. The second part uses massively parallel data to analyze typological variation in how languages express the semantic space of English \textit{when}, whose scope encompasses that of Early Slavic participle constructions and \textit{jegda}-clauses. Probabilistic semantic maps are generated and statistical methods (including Kriging, Gaussian Mixture Modelling, and precision and recall analysis) are used to induce cross-linguistically salient dimensions from the parallel corpus and to study conceptual variation within the semantic space of the hypothetical concept WHEN.
We consider the problem of efficiently solving a system of $n$ non-linear equations in ${\mathbb R}^d$. Addressing Smale's 17th problem stated in 1998, we consider a setting whereby the $n$ equations are random homogeneous polynomials of arbitrary degrees. In the complex case and for $n= d-1$, Beltr\'{a}n and Pardo proved the existence of an efficient randomized algorithm and Lairez recently showed it can be de-randomized to produce a deterministic efficient algorithm. Here we consider the real setting, to which previously developed methods do not apply. We describe an algorithm that efficiently finds solutions (with high probability) for $n= d -O(\sqrt{d\log d})$. If the maximal degree is very large, we also give an algorithm that works up to $n=d-1$.
Optimization over the set of matrices that satisfy $X^\top B X = I_p$, referred to as the generalized Stiefel manifold, appears in many applications involving sampled covariance matrices such as canonical correlation analysis (CCA), independent component analysis (ICA), and the generalized eigenvalue problem (GEVP). Solving these problems is typically done by iterative methods, such as Riemannian approaches, which require a computationally expensive eigenvalue decomposition involving the fully formed matrix $B$. We propose a cheap stochastic iterative method that solves the optimization problem while having access only to a random estimate of the feasible set. Our method does not enforce the constraint exactly in every iteration; instead, it produces iterates that converge to a critical point on the generalized Stiefel manifold defined in expectation. The method has lower per-iteration cost, requires only matrix multiplications, and has the same convergence rates as its Riemannian counterparts involving the full matrix $B$. Experiments demonstrate its effectiveness in various machine learning applications involving generalized orthogonality constraints, including CCA, ICA, and GEVP.
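For orientation, here is one standard way to pose the GEVP on this constraint set. The formulation and the symbols $A$, $L$, $Q$ are assumptions introduced for illustration and are not taken from the abstract. For a symmetric $A \in \mathbb{R}^{n\times n}$ and a symmetric positive definite $B \in \mathbb{R}^{n\times n}$, the top-$p$ generalized eigenspace is obtained from
\[
  \min_{X \in \mathbb{R}^{n\times p}} \; -\tfrac{1}{2}\operatorname{Tr}\!\left(X^\top A X\right)
  \quad \text{subject to} \quad X^\top B X = I_p .
\]
A feasible point can be built from the Cholesky factorization $B = L L^\top$ and any $Q$ with orthonormal columns, $Q^\top Q = I_p$, by setting $X = L^{-\top} Q$:
\[
  X^\top B X = Q^\top L^{-1} \left(L L^\top\right) L^{-\top} Q = Q^\top Q = I_p .
\]
This construction needs a factorization of the full matrix $B$, which is the kind of cost that a method working only with random estimates of $B$ is designed to avoid.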
Link streams offer a good model for representing interactions over time. They consist of links $(b,e,u,v)$, where $u$ and $v$ are vertices interacting during the whole time interval $[b,e]$. In this paper, we deal with the problem of enumerating maximal cliques in link streams. A clique is a pair $(C,[t_0,t_1])$, where $C$ is a set of vertices that all interact pairwise during the full interval $[t_0,t_1]$. It is maximal when neither its set of vertices nor its time interval can be increased. Some of the main works solving this problem are based on the famous Bron-Kerbosch algorithm for enumerating maximal cliques in graphs. We take this idea as a starting point to propose a new algorithm which matches the cliques of the instantaneous graphs formed by links existing at a given time $t$ to the maximal cliques of the link stream. We prove its validity and compute its complexity, which is better than that of the state-of-the-art algorithms in many cases of interest. We also study the output-sensitive complexity, which is close to the output size, thereby showing that our algorithm is efficient. To confirm this, we perform experiments on link streams used in the state of the art, and on massive link streams of up to 100 million links. In all cases our algorithm is faster, mostly by a factor of at least 10 and up to a factor of $10^4$. Moreover, it scales to massive link streams for which the existing algorithms are not able to provide the solution.
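The definitions can be checked on a toy link stream, constructed here for illustration only, with three vertices $a,b,c$ and three links:
\[
  (0,5,a,b), \qquad (1,6,b,c), \qquad (2,7,a,c).
\]
All three pairs interact simultaneously exactly on the intersection of the three intervals,
\[
  [\max(0,1,2),\ \min(5,6,7)] = [2,5],
\]
so $(\{a,b,c\},[2,5])$ is a clique. It is maximal: there is no further vertex to add, $a$ and $c$ do not interact before time $2$, and $a$ and $b$ do not interact after time $5$. The pair $(\{a,b\},[0,5])$ is also a maximal clique, because its interval cannot be extended and $c$ cannot be added over the whole of $[0,5]$. At any time $t \in [2,5]$ the instantaneous graph is a triangle, which illustrates how cliques of instantaneous graphs relate to maximal cliques of the link stream.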
Block classical Gram-Schmidt (BCGS) is commonly used for orthogonalizing a set of vectors $X$ in distributed computing environments due to its favorable communication properties relative to other orthogonalization approaches, such as modified Gram-Schmidt or Householder. However, it is known that BCGS (as well as recently developed low-synchronization variants of BCGS) can suffer from a significant loss of orthogonality in finite-precision arithmetic, which can contribute to instability and inaccurate solutions in downstream applications such as $s$-step Krylov subspace methods. A common solution to improve the orthogonality among the vectors is reorthogonalization. Focusing on the ``Pythagorean'' variant of BCGS, introduced in [E. Carson, K. Lund, \& M. Rozlo\v{z}n\'{i}k. SIAM J. Matrix Anal. Appl. 42(3), pp. 1365--1380, 2021], which guarantees an $O(\varepsilon)\kappa^2(X)$ bound on the loss of orthogonality as long as $O(\varepsilon)\kappa^2(X)<1$, where $\varepsilon$ denotes the unit roundoff, we introduce and analyze two reorthogonalized Pythagorean BCGS variants. These variants feature favorable communication properties, with asymptotically two synchronization points per block column, as well as an improved $O(\varepsilon)$ bound on the loss of orthogonality. Our bounds are derived in a general fashion to additionally allow for the analysis of mixed-precision variants. We verify our theoretical results with a panel of test matrices and experiments from a new version of the \texttt{BlockStab} toolbox.
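To give a feel for the size of these bounds, the following rough arithmetic ignores the constants hidden in the $O(\cdot)$ notation and assumes IEEE double precision, where $\varepsilon = 2^{-53} \approx 1.1 \times 10^{-16}$. The condition numbers are chosen here for illustration. The loss of orthogonality is commonly measured as $\|I - \bar{Q}^\top \bar{Q}\|$ for the computed factor $\bar{Q}$.
\[
  \kappa(X) = 10^{4}:\quad \varepsilon\,\kappa^2(X) \approx 1.1 \times 10^{-8},
  \qquad
  \kappa(X) = 10^{8}:\quad \varepsilon\,\kappa^2(X) \approx 1.1 > 1 .
\]
In the first case the $O(\varepsilon)\kappa^2(X)$ bound still permits a loss of roughly eight digits of orthogonality relative to $\varepsilon$. In the second case the assumption $O(\varepsilon)\kappa^2(X) < 1$ already fails. An $O(\varepsilon)$ bound, by contrast, stays near $10^{-16}$ regardless of $\kappa(X)$, within whatever conditioning restrictions the analysis imposes.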
By using the notions of a $d$-embedding $\Gamma$ of a (canonical) subgeometry $\Sigma$ and of an exterior set with respect to the $h$-secant variety $\Omega_{h}(\mathcal{A})$ of a subset $\mathcal{A}$, $ 0 \leq h \leq n-1$, in the finite projective space $\mathrm{PG}(n-1,q^n)$, $n \geq 3$, in this article we construct a class of non-linear $(n,n,q;d)$-MRD codes for any $ 2 \leq d \leq n-1$. A code $\mathcal{C}_{\sigma,T}$ of this class, where $1\in T \subset \mathbb{F}_q^*$ and $\sigma$ is a generator of $\mathrm{Gal}(\mathbb{F}_{q^n}|\mathbb{F}_q)$, arises from a cone of $\mathrm{PG}(n-1,q^n)$ with vertex an $(n-d-2)$-dimensional subspace over a maximum exterior set $\mathcal{E}$ with respect to $\Omega_{d-2}(\Gamma)$. We prove that the codes introduced in [Cossidente, A., Marino, G., Pavese, F.: Non-linear maximum rank distance codes. Des. Codes Cryptogr. 79, 597--609 (2016); Durante, N., Siciliano, A.: Non-linear maximum rank distance codes in the cyclic model for the field reduction of finite geometries. Electron. J. Comb. (2017); Donati, G., Durante, N.: A generalization of the normal rational curve in $\mathrm{PG}(d,q^n)$ and its associated non-linear MRD codes. Des. Codes Cryptogr. 86, 1175--1184 (2018)] are suitable punctured codes of $\mathcal{C}_{\sigma,T}$, and we completely solve the inequivalence issue for this class by showing that $\mathcal{C}_{\sigma,T}$ is neither equivalent nor adjointly equivalent to the non-linear MRD code $\mathcal{C}_{n,k,\sigma,I}$, $I \subseteq \mathbb{F}_q$, obtained in [Otal, K., \"Ozbudak, F.: Some new non-additive maximum rank distance codes. Finite Fields and Their Applications 50, 293--303 (2018)].
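For readers less familiar with the notation, recall the standard Singleton-like bound for rank-metric codes, stated here as background rather than as a result of the article. A set $\mathcal{C}$ of $n \times n$ matrices over $\mathbb{F}_q$ in which any two distinct elements differ by a matrix of rank at least $d$ satisfies
\[
  |\mathcal{C}| \le q^{\,n(n-d+1)},
\]
and $\mathcal{C}$ is an $(n,n,q;d)$-MRD code when equality holds. As an illustrative instance with values chosen here, $n=3$, $d=2$, $q=2$ gives $|\mathcal{C}| = 2^{3\cdot 2} = 64$ matrices out of the $2^{9} = 512$ matrices in $\mathbb{F}_2^{3\times 3}$.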
We generalize McDiarmid's inequality for functions with bounded differences on a high probability set, using an extension argument. Those functions concentrate around their conditional expectations. We further extend the results to concentration in general metric spaces.
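For reference, the classical inequality being generalized reads as follows, in its standard textbook form with the usual notation $c_i$ for the difference bounds. Let $X_1,\dots,X_n$ be independent random variables and let $f$ satisfy the bounded differences condition
\[
  \sup_{x_1,\dots,x_n,\,x_i'} \left| f(x_1,\dots,x_i,\dots,x_n) - f(x_1,\dots,x_i',\dots,x_n) \right| \le c_i
  \qquad (1 \le i \le n).
\]
Then, for every $t > 0$,
\[
  \mathbb{P}\bigl( f(X_1,\dots,X_n) - \mathbb{E} f(X_1,\dots,X_n) \ge t \bigr)
  \le \exp\!\left( - \frac{2 t^2}{\sum_{i=1}^{n} c_i^2} \right).
\]
In the classical statement the condition must hold for all arguments. The generalization described above requires it only on a high probability set, with concentration around the conditional expectation.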
Network reconstruction consists in determining the unobserved pairwise couplings between $N$ nodes given only observational data on the resulting behavior that is conditioned on those couplings -- typically a time-series or independent samples from a graphical model. A major obstacle to the scalability of algorithms proposed for this problem is a seemingly unavoidable quadratic complexity of $\Omega(N^2)$, corresponding to the requirement of each possible pairwise coupling being contemplated at least once, despite the fact that most networks of interest are sparse, with a number of non-zero couplings that is only $O(N)$. Here we present a general algorithm applicable to a broad range of reconstruction problems that significantly outperforms this quadratic baseline. Our algorithm relies on a stochastic second neighbor search (Dong et al., 2011) that produces the best edge candidates with high probability, thus bypassing an exhaustive quadratic search. If we rely on the conjecture that the second-neighbor search finishes in log-linear time (Baron & Darling, 2020; 2022), we demonstrate theoretically that our algorithm finishes in subquadratic time, with a data-dependent complexity loosely upper bounded by $O(N^{3/2}\log N)$, but with a more typical log-linear complexity of $O(N\log^2N)$. In practice, we show that our algorithm achieves a performance that is many orders of magnitude faster than the quadratic baseline -- in a manner consistent with our theoretical analysis -- allows for easy parallelization, and thus enables the reconstruction of networks with hundreds of thousands and even millions of nodes and edges.
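To illustrate the gap between these complexity classes, take $N = 10^6$ nodes, ignore all constant factors, and assume base-2 logarithms. These are illustrative choices, not values from the paper. Then $\log_2 N \approx 19.9$ and
\[
  N^2 = 10^{12},
  \qquad
  N^{3/2}\log_2 N \approx 2.0 \times 10^{10},
  \qquad
  N \log_2^2 N \approx 4.0 \times 10^{8}.
\]
Under these assumptions the loose upper bound is already about $50$ times smaller than the quadratic baseline, and the typical log-linear regime is about $2500$ times smaller.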
Recently, \citeauthor*{akbari2021locality}~(ICALP 2023) studied the locality of graph problems in distributed, sequential, dynamic, and online settings from a unified point of view. They designed a novel $O(\log n)$-locality deterministic algorithm for properly 3-coloring bipartite graphs in the $\mathsf{Online}$-$\mathsf{LOCAL}$ model. In this work, we establish the optimality of the algorithm by showing a \textit{tight} deterministic $\Omega(\log n)$ locality lower bound, which holds even on grids. To complement this result, we have the following additional results:
\begin{enumerate}
\item We show a higher and tight $\Omega(\sqrt{n})$ lower bound for 3-coloring toroidal and cylindrical grids.
\item Considering the generalization of $3$-coloring bipartite graphs to $(k+1)$-coloring $k$-partite graphs, we show that the problem also has $O(\log n)$ locality when the input is a $k$-partite graph that admits a \emph{locally inferable unique coloring}. This special class of $k$-partite graphs covers several fundamental graph classes such as $k$-trees and triangular grids. Moreover, for this special class of graphs, we show a tight $\Omega(\log n)$ locality lower bound.
\item For general $k$-partite graphs with $k \geq 3$, we prove that the problem of $(2k-2)$-coloring $k$-partite graphs exhibits a locality of $\Omega(n)$ in the $\mathsf{Online}$-$\mathsf{LOCAL}$ model, matching the round complexity of the same problem in the $\mathsf{LOCAL}$ model recently shown by \citeauthor*{coiteux2023no}~(STOC 2024). Consequently, the problem of $(k+1)$-coloring $k$-partite graphs admits a locality lower bound of $\Omega(n)$ when $k\geq 3$, contrasting sharply with the $\Theta(\log n)$ locality for the case of $k=2$.
\end{enumerate}