We investigate completions of partial combinatory algebras (pcas), in particular of Kleene's second model $\mathcal{K}_2$ and generalizations thereof. We consider weak and strong notions of embeddability and completion that have been studied before. By a result of Klop, it is known that not every pca has a strong completion. The study of completions of $\mathcal{K}_2$ yields as corollaries that weak and strong embeddings differ, and that every countable pca has a weak completion. We then consider generalizations of $\mathcal{K}_2$ for larger cardinals, and use these to show that it is consistent that every pca has a weak completion.
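For orientation, recall one common presentation of Kleene's second model (codings vary across sources): the underlying set is Baire space $\mathbb{N}^{\mathbb{N}}$, with partial application
\[ (\alpha \mid \beta)(n) = m \iff \exists k\,\big[\alpha(\langle n\rangle \ast \overline{\beta}(k)) = m+1 \ \wedge\ \forall j<k\; \alpha(\langle n\rangle \ast \overline{\beta}(j)) = 0\big], \]
where $\overline{\beta}(k)$ codes the initial segment $\langle \beta(0),\ldots,\beta(k-1)\rangle$. The application $\alpha \mid \beta$ is undefined when the search fails for some $n$, which is exactly what makes $\mathcal{K}_2$ a \emph{partial} combinatory algebra.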
We consider systems of polynomial equations and inequalities in $\mathbb{Q}[\boldsymbol{y}][\boldsymbol{x}]$ where $\boldsymbol{x} = (x_1, \ldots, x_n)$ and $\boldsymbol{y} = (y_1, \ldots,y_t)$. The $\boldsymbol{y}$ indeterminates are regarded as parameters, and we assume that when they are specialized generically, the set of common complex solutions of the resulting equations is finite. We consider the problem of real root classification for such parameter-dependent problems, i.e., identifying the possible numbers of real solutions depending on the values of the parameters, and computing a description of the regions of the parameter space over which the number of real roots remains invariant. We design an algorithm for solving this problem. The formulas it outputs enjoy a determinantal structure. Under genericity assumptions, we show that its arithmetic complexity is polynomial in both the maximum degree $d$ and the number $s$ of input inequalities, and exponential in $nt+t^2$. The output formulas consist of polynomials of degree bounded by $(2s+n)d^{n+1}$. This is the first algorithm with such a singly exponential complexity. We report on practical experiments showing that a first implementation of this algorithm can tackle examples that were previously out of reach.
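As a toy illustration of the classification problem (one variable, degree 2, and none of the determinantal machinery above): for $x^2 + y_1 x + y_2 = 0$ the number of distinct real roots is governed by the sign of the discriminant $y_1^2 - 4y_2$, which partitions the parameter plane into invariant regions. A minimal sympy sketch:

```python
# Real root classification in the simplest parametric case: the sign of the
# discriminant of x^2 + y1*x + y2 cuts the (y1, y2)-plane into regions on
# which the number of distinct real roots is constant.
import sympy as sp

x, y1, y2 = sp.symbols("x y1 y2", real=True)
f = x**2 + y1 * x + y2
disc = sp.discriminant(f, x)                      # y1**2 - 4*y2
print(disc)

# One sample point per region of the parameter plane:
for a, b in [(0, -1), (2, 1), (0, 1)]:            # disc > 0, disc = 0, disc < 0
    k = len(set(sp.real_roots(f.subs({y1: a, y2: b}))))
    print(f"(y1,y2)=({a},{b}): {k} distinct real root(s)")
```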
A 2-packing set for an undirected graph $G=(V,E)$ is a subset $\mathcal{S} \subset V$ such that any two vertices $v_1,v_2 \in \mathcal{S}$ have no common neighbors. Finding a 2-packing set of maximum cardinality is an NP-hard problem. We develop a new approach to solve this problem on arbitrary graphs, exploiting its close relation to the independent set problem. To this end, our algorithm red2pack uses new data reduction rules specific to the 2-packing set problem as well as a graph transformation. Our experiments show that we outperform the state of the art for arbitrary graphs with respect to solution quality, and also compute solutions multiple orders of magnitude faster than previously possible. For example, we solve 63% of the graphs in the tested data set to optimality in less than a second, while the competitor for arbitrary graphs can only solve 5% of these graphs to optimality even with a 10-hour time limit. Moreover, our approach can solve a wide range of large instances that were previously unsolved.
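The transformation alluded to can be sketched as follows, assuming the common distance-based definition (vertices pairwise at distance greater than two): a 2-packing set of $G$ is an independent set of the square graph $G^2$. This toy sketch is not red2pack itself (which adds problem-specific data reductions and an exact solver); the helper name and the heuristic stand-in are ours:

```python
# Toy 2-packing -> independent-set transformation: vertices at pairwise
# distance > 2 in G are exactly the independent sets of the square graph
# G^2 (adjacent iff distance <= 2 in G).
import networkx as nx

def two_packing_via_square(G: nx.Graph) -> set:
    G2 = nx.power(G, 2)                          # square graph
    # Heuristic stand-in for an exact solver: a maximal (not maximum)
    # independent set of G^2 is a valid, possibly suboptimal, 2-packing.
    return set(nx.maximal_independent_set(G2, seed=0))

G = nx.cycle_graph(9)
S = two_packing_via_square(G)
assert all(nx.shortest_path_length(G, u, v) > 2 for u in S for v in S if u != v)
print(S)
```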
We study the problem of estimating the score function of an unknown probability distribution $\rho^*$ from $n$ independent and identically distributed observations in $d$ dimensions. Assuming that $\rho^*$ is subgaussian and has a Lipschitz-continuous score function $s^*$, we establish the optimal rate $\tilde \Theta(n^{-\frac{2}{d+4}})$ for this estimation problem under the loss $\|\hat s - s^*\|^2_{L^2(\rho^*)}$ commonly used in the score matching literature, highlighting the curse of dimensionality: the sample complexity of accurate score estimation grows exponentially with the dimension $d$. Leveraging key insights from empirical Bayes theory as well as a new convergence rate for the smoothed empirical distribution in Hellinger distance, we show that a regularized score estimator based on a Gaussian kernel attains this rate, which is shown to be optimal by a matching minimax lower bound. We also discuss the implications of our theory for the sample complexity of score-based generative models.
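A minimal numerical sketch of the underlying idea (the score of the Gaussian-smoothed empirical measure $\rho_n * N(0, h^2 I)$; the paper's estimator adds regularization, so this is not their exact construction): $\nabla \log \hat\rho_h(x) = (\mathbb{E}_w[X_i] - x)/h^2$ with softmax weights $w_i \propto \exp(-\|x - X_i\|^2/(2h^2))$.

```python
# Score of the Gaussian-kernel-smoothed empirical distribution, the
# building block behind kernel-based score estimators.
import numpy as np

def kernel_score(x: np.ndarray, samples: np.ndarray, h: float) -> np.ndarray:
    d2 = ((x - samples) ** 2).sum(axis=1)        # |x - X_i|^2
    w = np.exp(-(d2 - d2.min()) / (2 * h**2))    # numerically stable weights
    w /= w.sum()
    return (w @ samples - x) / h**2

rng = np.random.default_rng(0)
X = rng.standard_normal((5000, 2))               # rho* = N(0, I), s*(x) = -x
x0 = np.array([0.5, -1.0])
print(kernel_score(x0, X, h=0.3), "vs true score", -x0)
```

(The estimate approximates $-x/(1+h^2)$, the score of the smoothed target, illustrating the bias/variance role of the bandwidth $h$.)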
We propose and study a new multilevel method for the numerical approximation of a Gibbs distribution $\pi$ on $\mathbb{R}^d$, based on (overdamped) Langevin diffusions. This method, inspired by \cite{mainPPlangevin} and \cite{giles_szpruch_invariant}, relies on a multilevel occupation measure, i.e., on an appropriate combination of $R$ occupation measures of (constant-step) Euler schemes with respective steps $\gamma_r = \gamma_0 2^{-r}$, $r=0,\ldots,R$. We first state a quantitative result under general assumptions which guarantees an \textit{$\varepsilon$-approximation} (in an $L^2$-sense) at a cost of order $\varepsilon^{-2}$, or $\varepsilon^{-2}|\log \varepsilon|^3$ under less contractive assumptions. We then apply it to overdamped Langevin diffusions with strongly convex potential $U:\mathbb{R}^d\rightarrow\mathbb{R}$ and obtain an \textit{$\varepsilon$-complexity} of order ${\cal O}(d\varepsilon^{-2}\log^3(d\varepsilon^{-2}))$, or ${\cal O}(d\varepsilon^{-2})$ under additional assumptions on $U$. More precisely, up to universal constants, an appropriate choice of the parameters leads to a cost controlled by ${(\bar{\lambda}_U\vee 1)^2}{\underline{\lambda}_U^{-3}} d\varepsilon^{-2}$, where $\bar{\lambda}_U$ and $\underline{\lambda}_U$ denote, respectively, the supremum of the largest eigenvalue and the infimum of the smallest eigenvalue of $D^2U$. We finally complete these theoretical results with numerical illustrations, including comparisons to other algorithms in Bayesian learning and an opening toward the non-strongly convex setting.
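An illustrative sketch of a multilevel occupation-measure estimator in one dimension (toy parameter choices; the paper tunes $R$, $\gamma_0$ and the per-level trajectory lengths to reach the stated complexity): level $0$ is a plain time average at step $\gamma_0$, and each level $r\geq 1$ adds the difference of time averages of two Euler chains with steps $\gamma_r$ and $\gamma_{r-1}$ driven by the same Brownian increments.

```python
# Target pi = N(0,1) from U(x) = x^2/2, so dX_t = -X_t dt + sqrt(2) dW_t
# and pi(f) = 1 for f(x) = x^2.
import numpy as np

rng = np.random.default_rng(1)
grad_U = lambda x: x
f = lambda x: x ** 2

def occupation(gamma, n):
    """Time average of f along one Euler chain with step gamma."""
    x, s = 0.0, 0.0
    for _ in range(n):
        x += -grad_U(x) * gamma + np.sqrt(2 * gamma) * rng.standard_normal()
        s += f(x)
    return s / n

def coupled_correction(gamma, n_coarse):
    """Difference of occupation averages of a fine chain (step gamma) and a
    coarse chain (step 2*gamma) sharing the same Brownian increments."""
    xf = xc = sf = sc = 0.0
    for _ in range(n_coarse):
        dW = np.sqrt(gamma) * rng.standard_normal(2)
        for k in range(2):                               # two fine steps
            xf += -grad_U(xf) * gamma + np.sqrt(2.0) * dW[k]
            sf += f(xf)
        xc += -grad_U(xc) * 2 * gamma + np.sqrt(2.0) * (dW[0] + dW[1])
        sc += f(xc)
    return sf / (2 * n_coarse) - sc / n_coarse

gamma0, R, N = 0.25, 4, 40000
est = occupation(gamma0, N)                              # level 0
for r in range(1, R + 1):
    est += coupled_correction(gamma0 * 2 ** -r, N // 2 ** r)
print(est)                                               # approx pi(f) = 1
```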
Krylov methods rely on iterated matrix-vector products $A^k u_j$ for an $n\times n$ matrix $A$ and vectors $u_1,\ldots,u_m$. The space spanned by all iterates $A^k u_j$ admits a particular basis -- the \emph{maximal Krylov basis} -- which consists of iterates of the first vector $u_1, Au_1, A^2u_1,\ldots$, until reaching linear dependency, then iterating similarly the subsequent vectors until a basis is obtained. Finding minimal polynomials and Frobenius normal forms is closely related to computing maximal Krylov bases. The fastest way to produce these bases was, until this paper, Keller-Gehrig's 1985 algorithm, whose complexity bound $O(n^\omega \log(n))$ comes from repeated squarings of $A$ and logarithmically many Gaussian eliminations. Here $\omega>2$ is a feasible exponent for matrix multiplication over the base field. We present an algorithm computing the maximal Krylov basis in $O(n^\omega\log\log(n))$ field operations when $m \in O(n)$, and even $O(n^\omega)$ as soon as $m\in O(n/\log(n)^c)$ for some fixed real $c>0$. As a consequence, we show that the Frobenius normal form together with a transformation matrix can be computed deterministically in $O(n^\omega \log\log(n)^2)$, and therefore matrix exponentiation~$A^k$ can be performed in the latter complexity if $\log(k) \in O(n^{\omega-1-\varepsilon})$, for $\varepsilon>0$. A key idea for these improvements is to rely on fast algorithms for $m\times m$ polynomial matrices of average degree $n/m$, involving high-order lifting and minimal kernel bases.
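A naive reference implementation of the definition (quartic-ish cost, in contrast to the fast algorithm above; names are ours): iterate each $u_j$ under $A$, keeping $A^k u_j$ for as long as it enlarges the span.

```python
# Maximal Krylov basis straight from its definition, with rank checks.
import numpy as np

def maximal_krylov_basis(A: np.ndarray, U: np.ndarray) -> np.ndarray:
    """A is n x n; the columns of U are u_1, ..., u_m. Returns the basis
    vectors as columns, in the order the definition produces them."""
    n = A.shape[0]
    basis = np.empty((n, 0))
    for j in range(U.shape[1]):
        v = U[:, j]
        while basis.shape[1] < n:
            candidate = np.column_stack([basis, v])
            if np.linalg.matrix_rank(candidate) == basis.shape[1]:
                break                    # linear dependency reached: next u_j
            basis, v = candidate, A @ v
    return basis

A = np.diag([1.0, 2.0, 3.0])
U = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(maximal_krylov_basis(A, U))        # columns u_1, A u_1, then u_2
```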
In this contribution, we consider a zero-dimensional polynomial system in $n$ variables defined over a field $\mathbb{K}$. In the context of computing a Rational Univariate Representation (RUR) of its solutions, we address the problem of certifying a separating linear form and, once it is certified, of computing the RUR that comes from it, without any condition on the ideal other than being zero-dimensional. Our key result is that the RUR can be read off, via a closed formula, from lexicographic Groebner bases of bivariate elimination ideals, even when the original ideal is not in shape position, so that one can use the same core as the well-known FGLM method to obtain a simple algorithm. Our first experiments, either with a very short (300-line) Maple code or with a straightforward Julia implementation that performs only classical Gaussian reductions in addition to Groebner bases for the degree reverse lexicographic ordering, show that this new method is already competitive with sophisticated state-of-the-art implementations which do not certify the parameterizations.
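A shape-position warm-up for reading a univariate parameterization off a lexicographic Groebner basis (the easy case; the closed formula above also handles ideals not in shape position): when the lex basis has the form $\{x - g(y),\, h(y)\}$, the solutions are parameterized directly by the roots of $h$. The example system and the expected basis below are ours for illustration.

```python
# Here y happens to be separating: the lex Groebner basis is in shape
# position, and the parameterization can be read off its two elements.
import sympy as sp

x, y, t = sp.symbols("x y t")
system = [x**2 + y**2 - 4, x*y - 1]
G = list(sp.groebner(system, x, y, order="lex"))
print(G)   # expect [x + y**3 - 4*y, y**4 - 4*y**2 + 1]

# Read off: y runs over the roots of t^4 - 4 t^2 + 1, and x = 4 t - t^3.
for r in sp.real_roots(t**4 - 4*t**2 + 1):
    xv, yv = (4*r - r**3).evalf(), r.evalf()
    assert abs(xv**2 + yv**2 - 4) < 1e-9 and abs(xv*yv - 1) < 1e-9
```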
The proper conflict-free chromatic number, $\chi_{pcf}(G)$, of a graph $G$ is the least $k$ such that $G$ has a proper $k$-coloring in which for each non-isolated vertex there is a color appearing exactly once among its neighbors. The proper odd chromatic number, $\chi_{o}(G)$, of $G$ is the least $k$ such that $G$ has a proper $k$-coloring in which for every non-isolated vertex there is a color appearing an odd number of times among its neighbors. We say that a graph class $\mathcal{G}$ is $\chi_{pcf}$-bounded ($\chi_{o}$-bounded) if there is a function $f$ such that $\chi_{pcf}(G) \leq f(\chi(G))$ ($\chi_{o}(G) \leq f(\chi(G))$) for every $G \in \mathcal{G}$. Caro et al. (2022) asked for classes that are linearly $\chi_{pcf}$-bounded ($\chi_{pcf}$-bounded), and as a starting point, they showed that every claw-free graph $G$ satisfies $\chi_{pcf}(G) \le 2\Delta(G)+1$, which implies $\chi_{pcf}(G) \le 4\chi(G)+1$. In this paper, we improve the bound for claw-free graphs to a nearly tight one by showing that such a graph $G$ satisfies $\chi_{pcf}(G) \le \Delta(G)+6$, and even $\chi_{pcf}(G) \le \Delta(G)+4$ if it is a quasi-line graph. These results also give evidence for a conjecture of Caro et al. Moreover, we show that convex-round graphs and permutation graphs are linearly $\chi_{pcf}$-bounded. For these last two results, we prove a lemma that reduces the problem of deciding whether a hereditary class is linearly $\chi_{pcf}$-bounded to deciding whether the bipartite graphs in the class are $\chi_{pcf}$-bounded by an absolute constant. This lemma complements a theorem of Liu (2022) and motivates us to study boundedness in bipartite graphs. In particular, we show that biconvex bipartite graphs are $\chi_{pcf}$-bounded while convex bipartite graphs are not even $\chi_o$-bounded, and exhibit a class of bipartite circle graphs that is linearly $\chi_o$-bounded but not $\chi_{pcf}$-bounded.
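To make the two definitions concrete, here is an executable restatement (a brute-force checker, nothing more; the coloring and graph choices are ours):

```python
# Check whether a coloring is proper and conflict-free / odd at every
# non-isolated vertex.
from collections import Counter
import networkx as nx

def neighbor_color_counts(G, c, v):
    return Counter(c[w] for w in G[v]).values()

def is_proper(G, c):
    return all(c[u] != c[v] for u, v in G.edges)

def is_pcf(G, c):        # some color appears exactly once in each neighborhood
    return is_proper(G, c) and all(
        1 in neighbor_color_counts(G, c, v) for v in G if G.degree(v) > 0)

def is_proper_odd(G, c): # some color appears an odd number of times
    return is_proper(G, c) and all(
        any(k % 2 == 1 for k in neighbor_color_counts(G, c, v))
        for v in G if G.degree(v) > 0)

C5 = nx.cycle_graph(5)
c = {0: 0, 1: 1, 2: 0, 3: 1, 4: 2}     # a proper 3-coloring of C_5
print(is_pcf(C5, c))                    # False: both neighbors of 1 get color 0
print(is_proper_odd(C5, c))             # False, for the same reason
print(is_pcf(C5, {v: v for v in C5}))   # True: a rainbow coloring is pcf
```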
We study the problem of symmetric matrix completion, where the goal is to reconstruct a positive semidefinite matrix $\mathrm{X}^\star \in \mathbb{R}^{d\times d}$ of rank $r$, parameterized as $\mathrm{U}\mathrm{U}^{\top}$, from only a subset of its observed entries. We show that vanilla gradient descent (GD) with small initialization provably converges to the ground truth $\mathrm{X}^\star$ without requiring any explicit regularization. This convergence result holds even in the over-parameterized scenario, where the true rank $r$ is unknown and conservatively over-estimated by a search rank $r'\gg r$. The existing results for this problem either require explicit regularization, a sufficiently accurate initial point, or exact knowledge of the true rank $r$. In the over-parameterized regime where $r'\geq r$, we show that, with $\widetilde\Omega(dr^9)$ observations, GD with an initial point $\|\mathrm{U}_0\| \leq \epsilon$ converges near-linearly to an $\epsilon$-neighborhood of $\mathrm{X}^\star$. Consequently, smaller initial points result in increasingly accurate solutions. Surprisingly, neither the convergence rate nor the final accuracy depends on the over-parameterized search rank $r'$; they are governed only by the true rank $r$. In the exactly-parameterized regime where $r'=r$, we further enhance this result by proving that GD converges at a faster rate to achieve an arbitrarily small accuracy $\epsilon>0$, provided the initial point satisfies $\|\mathrm{U}_0\| = O(1/d)$. At the crux of our method lies a novel weakly-coupled leave-one-out analysis, which allows us to establish the global convergence of GD, extending beyond what was previously possible using the classical leave-one-out analysis.
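A minimal sketch of the setting analyzed above: factorized gradient descent on the observed entries with small random initialization and over-parameterized search rank $r' > r$. The sizes, sampling rate, and step size below are illustrative choices of ours, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, rp, p, eps = 60, 2, 8, 0.5, 1e-6

V = rng.standard_normal((d, r))
Xstar = V @ V.T                                   # rank-r PSD ground truth
upper = np.triu(rng.random((d, d)) < p)
mask = (upper | upper.T).astype(float)            # symmetric observation set

U = eps * rng.standard_normal((d, rp))            # small initialization
eta = 0.1 / np.linalg.norm(Xstar, 2)              # conservative step size
for it in range(3001):
    # Descent direction P_Omega(UU^T - X*) U, rescaled by 1/p to unbias:
    grad = (mask / p) * (U @ U.T - Xstar) @ U
    U = U - eta * grad
    if it % 1000 == 0:
        err = np.linalg.norm(U @ U.T - Xstar) / np.linalg.norm(Xstar)
        print(it, err)
```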
Classical Krylov subspace projection methods for the solution of a linear problem $Ax = b$ output an approximate solution $\widetilde{x}\simeq x$. Recently, it has been recognized that projection methods can be understood from a statistical perspective. These probabilistic projection methods return a distribution $p(\widetilde{x})$ in place of a point estimate $\widetilde{x}$. The resulting uncertainty, codified as a distribution, can, in theory, be meaningfully combined with other uncertainties, propagated through computational pipelines, and used in the framework of probabilistic decision theory. The problem we address is that current probabilistic projection methods lead to poorly calibrated posterior distributions. We improve the covariance matrix from previous works so that it does not contain such undesirable objects as $A^{-1}$ or $A^{-1}A^{-T}$, results in nontrivial uncertainty, and reproduces an arbitrary projection method as the mean of the posterior distribution. We also propose a variant that is numerically inexpensive when the uncertainty is calibrated a priori. Since it usually is not, we put forward a practical way to calibrate uncertainty that performs reasonably well, albeit at the expense of roughly doubling the numerical cost of the underlying projection method.
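For background, a generic solution-space probabilistic projection method can be obtained by exact Gaussian conditioning (this is the textbook construction, not the improved covariance of this work; all names below are ours): place a prior $x \sim N(x_0, \Sigma_0)$ and condition on the projected observations $S^{\top}Ax = S^{\top}b$.

```python
# Gaussian conditioning on m projected residual observations reproduces a
# projection method as the posterior mean (Galerkin condition S^T(b-Ax)=0).
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 3
Q = rng.standard_normal((n, n))
A = Q @ Q.T + n * np.eye(n)                 # SPD test matrix
b = rng.standard_normal(n)

x0, Sigma0 = np.zeros(n), np.eye(n)         # prior N(x0, Sigma0)
S = rng.standard_normal((n, m))             # search directions (columns)

M = S.T @ A @ Sigma0 @ A.T @ S              # m x m Gram matrix
K = Sigma0 @ A.T @ S @ np.linalg.inv(M)     # "gain" matrix
mean = x0 + K @ S.T @ (b - A @ x0)          # posterior mean
cov = Sigma0 - K @ S.T @ A @ Sigma0         # posterior covariance (rank n-m)

print(np.linalg.norm(S.T @ (b - A @ mean))) # ~0: Galerkin condition holds
print(np.trace(cov))                        # remaining (uncalibrated) uncertainty
```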
We expound on some known lower bounds on the quadratic Wasserstein distance between random vectors in $\mathbb{R}^n$, with an emphasis on affine transformations that have been used in manifold learning of data in Wasserstein space. In particular, we give concrete lower bounds for rotated copies of random vectors in $\mathbb{R}^2$ by computing the Bures metric between the covariance matrices. We also derive upper bounds for compositions of affine maps, which yield a fruitful variety of diffeomorphisms applied to an initial data measure. We apply these bounds to various distributions, including some supported on a 1-dimensional manifold in $\mathbb{R}^2$, and illustrate the quality of the bounds. Finally, we give a framework for mimicking handwritten digit or alphabet datasets that can be applied in a manifold learning framework.
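The central lower bound in question is the classical one (often attributed to Gelbrich): for measures $\mu,\nu$ with means $m_\mu, m_\nu$ and covariances $\Sigma_\mu, \Sigma_\nu$,
\[ W_2^2(\mu,\nu) \;\ge\; \|m_\mu - m_\nu\|^2 + \mathcal{B}^2(\Sigma_\mu,\Sigma_\nu), \qquad \mathcal{B}^2(\Sigma_1,\Sigma_2) = \operatorname{tr}\Big(\Sigma_1 + \Sigma_2 - 2\big(\Sigma_1^{1/2}\Sigma_2\,\Sigma_1^{1/2}\big)^{1/2}\Big), \]
with equality when $\mu$ and $\nu$ are Gaussian. For a rotated copy $\nu = (R_\theta)_\# \mu$ in $\mathbb{R}^2$, the bound thus reduces to computing the Bures metric $\mathcal{B}(\Sigma_\mu, R_\theta \Sigma_\mu R_\theta^{\top})$.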