For a metric $\mu$ on a finite set $T$, the minimum 0-extension problem 0-Ext$[\mu]$ is defined as follows: Given $V\supseteq T$ and $c:{V \choose 2}\rightarrow \mathbf{Q_+}$, minimize $\sum c(xy)\mu(\gamma(x),\gamma(y))$ subject to $\gamma:V\rightarrow T,\ \gamma(t)=t\ (\forall t\in T)$, where the sum is taken over all unordered pairs in $V$. This problem generalizes several classical combinatorial optimization problems such as the minimum cut problem or the multiterminal cut problem. Karzanov and Hirai established a complete classification of metrics $\mu$ for which 0-Ext$[\mu]$ is polynomial-time solvable or NP-hard. This result can also be viewed as a sharpening of the general dichotomy theorem for finite-valued CSPs (Thapper and \v{Z}ivn\'{y} 2016) specialized to 0-Ext$[\mu]$. In this paper, we consider a directed version $\overrightarrow{0}$-Ext$[\mu]$ of the minimum 0-extension problem, where $\mu$ and $c$ are not assumed to be symmetric. We extend the NP-hardness condition of 0-Ext$[\mu]$ to $\overrightarrow{0}$-Ext$[\mu]$: If $\mu$ cannot be represented as the shortest path metric of an orientable modular graph with an orbit-invariant ``directed'' edge-length, then $\overrightarrow{0}$-Ext$[\mu]$ is NP-hard. We also show a partial converse: If $\mu$ is a directed metric of a modular lattice with an orbit-invariant directed edge-length, then $\overrightarrow{0}$-Ext$[\mu]$ is tractable. We further provide a new NP-hardness condition characteristic of $\overrightarrow{0}$-Ext$[\mu]$, and establish a dichotomy for the case where $\mu$ is a directed metric of a star.
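To make the objective of 0-Ext$[\mu]$ concrete, the following is a minimal brute-force sketch in Python, not an algorithm from the paper. The function name `min_zero_extension` and the dictionary encodings of $\mu$ and $c$ are assumptions chosen for illustration. It enumerates every map $\gamma$ that fixes the terminals, so it is only usable on tiny instances.

```python
from itertools import product


def min_zero_extension(V, T, mu, c):
    """Brute-force the minimum 0-extension on a tiny instance.

    V  : list of all points (hypothetical encoding, any hashable labels)
    T  : list of terminals, a subset of V
    mu : dict mapping ordered terminal pairs (s, t) to distances
    c  : dict mapping pairs (x, y) of V to nonnegative weights,
         each pair listed once
    Returns (best_cost, best_gamma).
    """
    free = [v for v in V if v not in T]
    best_cost, best_gamma = None, None
    # Try every assignment of the non-terminal points to terminals.
    for choice in product(T, repeat=len(free)):
        gamma = {t: t for t in T}          # gamma(t) = t for all terminals
        gamma.update(zip(free, choice))
        cost = sum(w * mu[(gamma[x], gamma[y])] for (x, y), w in c.items())
        if best_cost is None or cost < best_cost:
            best_cost, best_gamma = cost, gamma
    return best_cost, best_gamma


# Two terminals at distance 1: the problem reduces to a minimum s-t cut.
V = ["s", "a", "t"]
T = ["s", "t"]
mu = {("s", "s"): 0, ("t", "t"): 0, ("s", "t"): 1, ("t", "s"): 1}
c = {("s", "a"): 3, ("a", "t"): 1}
print(min_zero_extension(V, T, mu, c))  # cost 1, with "a" mapped to "s"
```

Because `mu` is indexed by ordered pairs, the same sketch also evaluates the directed objective if `mu` and `c` are given asymmetrically.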
If $G$ is a group, we say a subset $S$ of $G$ is product-free if the equation $xy=z$ has no solutions with $x,y,z \in S$. For $D \in \mathbb{N}$, a group $G$ is said to be $D$-quasirandom if the minimal dimension of a nontrivial complex irreducible representation of $G$ is at least $D$. Gowers showed that in a $D$-quasirandom finite group $G$, the maximal size of a product-free set is at most $|G|/D^{1/3}$. This disproved a longstanding conjecture of Babai and S\'os from 1985. For the special unitary group, $G=SU(n)$, Gowers observed that his argument yields an upper bound of $n^{-1/3}$ on the measure of a measurable product-free subset. In this paper, we improve Gowers' upper bound to $\exp(-cn^{1/3})$, where $c>0$ is an absolute constant. In fact, we establish something stronger, namely, product-mixing for measurable subsets of $SU(n)$ with measure at least $\exp(-cn^{1/3})$; for this product-mixing result, the $n^{1/3}$ in the exponent is sharp. Our approach involves introducing novel hypercontractive inequalities, which imply that the non-Abelian Fourier spectrum of the indicator function of a small set concentrates on high-dimensional irreducible representations. Our hypercontractive inequalities are obtained via methods from representation theory, harmonic analysis, random matrix theory and differential geometry. We generalize our hypercontractive inequalities from $SU(n)$ to an arbitrary $D$-quasirandom compact connected Lie group for $D$ at least an absolute constant, thereby extending our results on product-free sets to such groups. We also demonstrate various other applications of our inequalities to geometry (viz., non-Abelian Brunn-Minkowski type inequalities), mixing times, and the theory of growth in compact Lie groups.
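As a small illustration of the definition of a product-free set, the sketch below checks the condition by brute force in the finite symmetric group $S_3$. This is a toy finite example, not the setting of $SU(n)$ studied in the paper, and the helper names `compose`, `is_product_free` and `is_odd` are hypothetical.

```python
from itertools import permutations, product


def compose(p, q):
    """Return the permutation p∘q (apply q first, then p) on {0, ..., n-1}."""
    return tuple(p[q[i]] for i in range(len(q)))


def is_product_free(S, mul=compose):
    """True if no x, y, z in S satisfy x * y = z."""
    S = set(S)
    return all(mul(x, y) not in S for x, y in product(S, repeat=2))


def is_odd(p):
    """True if the permutation p has an odd number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 1


G = list(permutations(range(3)))       # the symmetric group S_3
odd = [p for p in G if is_odd(p)]      # the three transpositions

print(is_product_free(odd))  # True: a product of two odd permutations is even
print(is_product_free(G))    # False: the identity satisfies e * e = e
```

The odd permutations form a product-free set of density $1/2$, which is possible here because $S_3$ has a nontrivial one-dimensional representation (the sign), so it is not $D$-quasirandom for any $D>1$.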
Fault-tolerant connectivity labelings are schemes that, given an $n$-vertex graph $G=(V,E)$ and $f\geq 1$, produce succinct yet informative labels for the elements of the graph. Given only the labels of two vertices $u,v$ and of the elements in a faulty-set $F$ with $|F|\leq f$, one can determine if $u,v$ are connected in $G-F$, the surviving graph after removing $F$. For the edge or vertex faults models, i.e., $F\subseteq E$ or $F\subseteq V$, a sequence of recent work established schemes with $poly(f,\log n)$-bit labels. This paper considers the color faults model, recently introduced in the context of spanners [Petruschka, Sapir and Tzalik, ITCS'24], which accounts for known correlations between failures. Here, the edges (or vertices) of the input $G$ are arbitrarily colored, and the faulty elements in $F$ are colors; a failing color causes all edges (vertices) of that color to crash. Our main contribution is settling the label length complexity for connectivity under one color fault ($f=1$). The existing implicit solution, by applying the state-of-the-art scheme for edge faults of [Dory and Parter, PODC'21], might yield labels of $\Omega(n)$ bits. We provide a deterministic scheme with labels of $\tilde{O}(\sqrt{n})$ bits in the worst case, and a matching lower bound. Moreover, our scheme is universally optimal: even schemes tailored to handle only colorings of one specific graph topology cannot produce asymptotically smaller labels. We extend our labeling approach to yield a routing scheme avoiding a single forbidden color. We also consider the centralized setting, and show an $\tilde{O}(n)$-space oracle, answering connectivity queries under one color fault in $\tilde{O}(1)$ time. Turning to $f\geq 2$ color faults, we give a randomized labeling scheme with $\tilde{O}(n^{1-1/2^f})$-bit labels, along with a lower bound of $\Omega(n^{1-1/(f+1)})$ bits.
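The query semantics of the color faults model can be sketched with a centralized baseline that reads the entire edge list. This is an illustrative assumption-laden sketch only: it is neither the labeling scheme nor the oracle of the paper, and the function name `connected_avoiding_color` and the `(u, v, color)` edge encoding are hypothetical.

```python
def connected_avoiding_color(n, colored_edges, u, v, faulty_color):
    """Decide whether u and v are connected once every edge of
    `faulty_color` has failed. Vertices are 0..n-1 and each edge is a
    triple (a, b, color). Uses union-find over the surviving edges."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, color in colored_edges:
        if color != faulty_color:          # a failing color removes all its edges
            parent[find(a)] = find(b)
    return find(u) == find(v)


edges = [(0, 1, "red"), (1, 2, "blue"), (0, 2, "red")]
print(connected_avoiding_color(3, edges, 0, 2, "red"))   # False: vertex 0 is isolated
print(connected_avoiding_color(3, edges, 0, 2, "blue"))  # True: edge 0-2 survives
```

A single failing color can remove many edges at once, which is why treating it as a batch of independent edge faults can be expensive.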
We show that, for every $k\geq 2$, $C_{2k}$-freeness can be decided in $O(n^{1-1/k})$ rounds in the \CONGEST{} model by a randomized Monte-Carlo distributed algorithm with one-sided error probability $1/3$. This matches the best round-complexities of previously known algorithms for $k\in\{2,3,4,5\}$ by Drucker et al. [PODC'14] and Censor-Hillel et al. [DISC'20], and improves on the complexities of the known algorithms for $k>5$ by Eden et al. [DISC'19], which were essentially of the form $\tilde O(n^{1-2/k^2})$. Our algorithm uses colored BFS-explorations with threshold, but with an original \emph{global} approach that makes it possible to overcome a recent impossibility result by Fraigniaud et al. [SIROCCO'23] about using colored BFS-exploration with \emph{local} threshold for detecting cycles. We also show how to quantize our algorithm to achieve a round-complexity of $\tilde O(n^{\frac{1}{2}-\frac{1}{2k}})$ in the quantum setting for deciding $C_{2k}$-freeness. Furthermore, this allows us to improve the known quantum complexities of the simpler problem of detecting cycles of length \emph{at most}~$2k$ by van Apeldoorn and de Vos [PODC'22]. Our quantization is in two steps. First, the congestion of our randomized algorithm is reduced, at the cost of also reducing its success probability. Second, the success probability is boosted using a new quantum framework derived from sequential algorithms, namely Monte-Carlo quantum amplification.
We construct explicit pseudorandom generators that fool $n$-variate polynomials of degree at most $d$ over a finite field $\mathbb{F}_q$. The seed length of our generators is $O(d \log n + \log q)$, over fields of size exponential in $d$ and characteristic at least $d(d-1)+1$. Previous constructions such as Bogdanov's (STOC 2005) and Derksen and Viola's (FOCS 2022) had either suboptimal seed length or required the field size to depend on $n$. Our approach follows Bogdanov's paradigm while incorporating techniques from Lecerf's factorization algorithm (J. Symb. Comput. 2007) and insights from the construction of Derksen and Viola regarding the role of indecomposability of polynomials.
We propose an $\widetilde{O}(n + 1/\varepsilon)$-time FPTAS (Fully Polynomial-Time Approximation Scheme) for the classical Partition problem. This is the best possible (up to a logarithmic factor) assuming SETH (Strong Exponential Time Hypothesis) [Abboud, Bringmann, Hermelin, and Shabtay'22]. Prior to our work, the best known FPTAS for Partition runs in $\widetilde{O}(n + 1/\varepsilon^{5/4})$ time [Deng, Jin and Mao'23, Wu and Chen'22]. Our result is obtained by solving a more general problem of weakly approximating Subset Sum.
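For reference, the optimization form of Partition can be sketched with the textbook exact dynamic program, written here with a Python integer used as a bitset. This is a pseudo-polynomial baseline for illustration, not the FPTAS of the paper, and the function name `best_half_sum` is hypothetical.

```python
def best_half_sum(items):
    """Return the largest subset sum of `items` not exceeding half the total.

    Bit s of `reachable` is set exactly when some subset sums to s.
    Assumes `items` is a list of nonnegative integers.
    """
    reachable = 1                      # only the empty sum 0 is reachable
    for x in items:
        reachable |= reachable << x    # either skip x or add it to a subset
    target = sum(items) // 2
    for s in range(target, -1, -1):
        if (reachable >> s) & 1:
            return s
    return 0


print(best_half_sum([3, 1, 1, 2, 2, 1]))  # 5: a perfect split of total 10
print(best_half_sum([4, 4, 3]))           # 4: the best split is 4 versus 7
```

An approximation scheme would instead return a subset whose sum is within a $(1-\varepsilon)$ factor of this optimum.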
We study the fundamental problem of estimating the mean of a $d$-dimensional distribution with covariance $\Sigma \preccurlyeq \sigma^2 I_d$ given $n$ samples. When $d = 1$, \cite{catoni} showed an estimator with error $(1+o(1)) \cdot \sigma \sqrt{\frac{2 \log \frac{1}{\delta}}{n}}$, with probability $1 - \delta$, matching the Gaussian error rate. For $d>1$, a natural estimator outputs the center of the minimum enclosing ball of one-dimensional confidence intervals to achieve a $1-\delta$ confidence radius of $\sqrt{\frac{2 d}{d+1}} \cdot \sigma \left(\sqrt{\frac{d}{n}} + \sqrt{\frac{2 \log \frac{1}{\delta}}{n}}\right)$, incurring a $\sqrt{\frac{2d}{d+1}}$-factor loss over the Gaussian rate. When the $\sqrt{\frac{d}{n}}$ term dominates by a $\sqrt{\log \frac{1}{\delta}}$ factor, \cite{lee2022optimal-highdim} showed an improved estimator matching the Gaussian rate. This raises a natural question: Is the $\sqrt{\frac{2 d}{d+1}}$ loss \emph{necessary} when the $\sqrt{\frac{2 \log \frac{1}{\delta}}{n}}$ term dominates? We show that the answer is \emph{no} -- we construct an estimator that improves over the above naive estimator by a constant factor. We also consider robust estimation, where an adversary is allowed to corrupt an $\epsilon$-fraction of samples arbitrarily: in this case, we show that the above strategy of combining one-dimensional estimates and incurring the $\sqrt{\frac{2d}{d+1}}$-factor \emph{is} optimal in the infinite-sample limit.
Recent work has shown that it is possible to train an $\textit{unsupervised}$ automatic speech recognition (ASR) system using only unpaired audio and text. Existing unsupervised ASR methods assume that no labeled data can be used for training. We argue that even if one does not have any labeled audio for a given language, there is $\textit{always}$ labeled data available for other languages. We show that it is possible to use character-level acoustic models (AMs) from other languages to bootstrap an $\textit{unsupervised}$ AM in a new language. Here, "unsupervised" means no labeled audio is available for the $\textit{target}$ language. Our approach is based on two key ingredients: (i) generating pseudo-labels (PLs) of the $\textit{target}$ language using some $\textit{other}$ language AM and (ii) constraining these PLs with a $\textit{target language model}$. Our approach is effective on Common Voice: e.g. transfer of English AM to Swahili achieves 18% WER. It also outperforms character-based wav2vec-U 2.0 by 15% absolute WER on LJSpeech with 800h of labeled German data instead of 60k hours of unlabeled English data.
We study the task of efficiently sampling from a Gibbs distribution $d \pi^* = e^{-h} d {vol}_g$ over a Riemannian manifold $M$ via (geometric) Langevin MCMC; this algorithm involves computing exponential maps in random Gaussian directions and is efficiently implementable in practice. The key to our analysis of Langevin MCMC is a bound on the discretization error of the geometric Euler-Maruyama scheme, assuming $\nabla h$ is Lipschitz and $M$ has bounded sectional curvature. Our error bound matches the error of Euclidean Euler-Maruyama in terms of its stepsize dependence. Combined with a contraction guarantee for the geometric Langevin Diffusion under Kendall-Cranston coupling, we prove that the Langevin MCMC iterates lie within $\epsilon$-Wasserstein distance of $\pi^*$ after $\tilde{O}(\epsilon^{-2})$ steps, which matches the iteration complexity for Euclidean Langevin MCMC. Our results apply in general settings where $h$ can be nonconvex and $M$ can have negative Ricci curvature. Under additional assumptions that the Riemannian curvature tensor has bounded derivatives, and that $\pi^*$ satisfies a $CD(\cdot,\infty)$ condition, we analyze the stochastic gradient version of Langevin MCMC, and bound its iteration complexity by $\tilde{O}(\epsilon^{-2})$ as well.
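The phrase "computing exponential maps in random Gaussian directions" can be illustrated on the unit sphere, where the exponential map has a closed form. The sketch below is an assumption-laden toy: the choice of manifold, the scaling $-\delta\,\nabla h + \sqrt{2\delta}\,\xi$ of the update, and the names `project_to_tangent`, `sphere_exp` and `langevin_step` are all illustrative and may differ from the conventions used in the paper.

```python
import math
import random


def project_to_tangent(x, v):
    """Project an ambient vector v onto the tangent space of the unit sphere at x."""
    dot = sum(xi * vi for xi, vi in zip(x, v))
    return [vi - dot * xi for xi, vi in zip(x, v)]


def sphere_exp(x, v):
    """Exponential map on the unit sphere: follow the great circle from x
    in the tangent direction v for arc length |v|."""
    norm = math.sqrt(sum(vi * vi for vi in v))
    if norm < 1e-12:
        return list(x)
    return [math.cos(norm) * xi + math.sin(norm) * vi / norm
            for xi, vi in zip(x, v)]


def langevin_step(x, grad_h, step, rng):
    """One geometric Langevin step (assumed form): drift along the negative
    Riemannian gradient of h plus tangent Gaussian noise, then the exp map."""
    drift = project_to_tangent(x, grad_h(x))
    noise = project_to_tangent(x, [rng.gauss(0.0, 1.0) for _ in x])
    v = [-step * d + math.sqrt(2.0 * step) * z for d, z in zip(drift, noise)]
    return sphere_exp(x, v)


# Toy potential h(p) = -p[2], so the target density favors the north pole.
rng = random.Random(0)
x = [0.0, 0.0, 1.0]
for _ in range(100):
    x = langevin_step(x, lambda p: [0.0, 0.0, -1.0], 0.01, rng)
print(sum(c * c for c in x))  # stays at 1 up to rounding: iterates remain on the sphere
```

Using the exponential map keeps every iterate on the manifold, so no separate retraction or renormalization step is needed.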
We explore an extension to straight-line programs (SLPs) that outperforms, for some text families, the measure $\delta$ based on substring complexity, a lower bound for most measures and compressors exploiting repetitiveness (which are crucial in areas like Bioinformatics). The extension, called iterated SLPs (ISLPs), allows rules of the form $A \rightarrow \Pi_{i=k_1}^{k_2} B_1^{i^{c_1}}\cdots B_t^{i^{c_t}}$, for which we show how to extract any substring of length $\lambda$, from the represented text $T[1..n]$, in time $O(\lambda + \log^2 n\log\log n)$. This is the first compressed representation for repetitive texts breaking $\delta$ while, at the same time, supporting direct access to arbitrary text symbols in polylogarithmic time. As a byproduct, we extend Ganardi et al.'s technique to balance any SLP (so it has a derivation tree of logarithmic height) to a wide generalization of SLPs, including ISLPs.
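The semantics of an iteration rule $A \rightarrow \Pi_{i=k_1}^{k_2} B_1^{i^{c_1}}\cdots B_t^{i^{c_t}}$ can be sketched by naive full expansion. The rule encoding below (the tuples tagged `"concat"` and `"iter"`) and the function name `expand` are hypothetical, the sketch assumes $k_1 \le k_2$, and it decompresses the whole text rather than implementing the polylogarithmic-time access described in the abstract.

```python
def expand(symbol, rules):
    """Fully expand `symbol` into the string it represents.

    rules maps a nonterminal to either
      ("concat", [X1, ..., Xm])             for A -> X1 ... Xm
      ("iter", k1, k2, [(B1, c1), ...])     for A -> prod_{i=k1}^{k2} B1^(i^c1) ... Bt^(i^ct)
    Any symbol without a rule is treated as a terminal.
    """
    rule = rules.get(symbol)
    if rule is None:
        return symbol
    if rule[0] == "concat":
        return "".join(expand(s, rules) for s in rule[1])
    if rule[0] == "iter":
        _, k1, k2, factors = rule
        parts = []
        for i in range(k1, k2 + 1):
            for b, c in factors:
                # B repeated i^c times; c = 0 gives exactly one copy.
                parts.append(expand(b, rules) * (i ** c))
        return "".join(parts)
    raise ValueError(f"unknown rule type for {symbol!r}")


# A -> prod_{i=1}^{3} a^{i} b^{1} represents "ab" + "aab" + "aaab".
rules = {"A": ("iter", 1, 3, [("a", 1), ("b", 0)])}
print(expand("A", rules))  # abaabaaab
```

A single constant-size iteration rule can thus represent a text whose length grows polynomially in $k_2$.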
Bayesian inference for Dirichlet-Multinomial (DM) models has a long and important history. The concentration parameter $\alpha$ is pivotal in smoothing category probabilities within the multinomial distribution and is crucial for the inference afterward. Due to the lack of a tractable form of its marginal likelihood, $\alpha$ is often chosen ad-hoc, or estimated using approximation algorithms. A constant $\alpha$ often leads to inadequate smoothing of probabilities, particularly for sparse compositional count datasets. In this paper, we introduce a novel class of prior distributions facilitating conjugate updating of the concentration parameter, allowing for full Bayesian inference for DM models. Our methodology is based on fast residue computation and admits closed-form posterior moments in specific scenarios. Additionally, our prior provides continuous shrinkage with its heavy tail and substantial mass around zero, ensuring adaptability to the sparsity or quasi-sparsity of the data. We demonstrate the usefulness of our approach on both simulated examples and on a real-world human microbiome dataset. Finally, we conclude with directions for future research.