We show that it is provable in PA that there is an arithmetically definable sequence $\{\phi_{n}:n \in \omega\}$ of $\Pi^{0}_{2}$-sentences such that (i) PRA+$\{\phi_{n}:n \in \omega\}$ is $\Pi^{0}_{2}$-sound and $\Pi^{0}_{1}$-complete; (ii) the length of $\phi_{n}$ is bounded above by a polynomial function of $n$ with positive leading coefficient; and (iii) PRA+$\phi_{n+1}$ always proves 1-consistency of PRA+$\phi_{n}$. The growth in logical strength is in some sense "as fast as possible", manifested in the fact that the total general recursive functions whose totality is asserted by the true $\Pi^{0}_{2}$-sentences in the sequence are cofinal growth-rate-wise in the set of all total general recursive functions. We then develop an argument which makes use of a sequence of sentences constructed by an application of the diagonal lemma, which are generalisations in a broad sense of Hugh Woodin's "Tower of Hanoi" construction as outlined in his essay "Tower of Hanoi" in Chapter 18 of the anthology "Truth in Mathematics". The argument establishes the result that it is provable in PA that $P \neq NP$. We indicate how to pull the argument all the way down into EFA.
We address the problem of model selection for the finite-horizon episodic Reinforcement Learning (RL) problem where the transition kernel $P^*$ belongs to a family of models $\mathcal{P}^*$ with finite metric entropy. In the model selection framework, instead of $\mathcal{P}^*$, we are given $M$ nested families of transition kernels $\mathcal{P}_1 \subset \mathcal{P}_2 \subset \ldots \subset \mathcal{P}_M$. We propose and analyze a novel algorithm, namely \emph{Adaptive Reinforcement Learning (General)} (\texttt{ARL-GEN}), that adapts to the smallest such family where the true transition kernel $P^*$ lies. \texttt{ARL-GEN} uses the Upper Confidence Reinforcement Learning (\texttt{UCRL}) algorithm with value-targeted regression as a black box and puts a model selection module at the beginning of each epoch. Under a mild separability assumption on the model classes, we show that \texttt{ARL-GEN} obtains a regret of $\tilde{\mathcal{O}}(d_{\mathcal{E}}^*H^2+\sqrt{d_{\mathcal{E}}^* \mathbb{M}^* H^2 T})$, with high probability, where $H$ is the horizon length, $T$ is the total number of steps, $d_{\mathcal{E}}^*$ is the Eluder dimension and $\mathbb{M}^*$ is the metric entropy corresponding to $\mathcal{P}^*$. Note that this regret scaling matches that of an oracle that knows $\mathcal{P}^*$ in advance. We show that the cost of model selection for \texttt{ARL-GEN} is an additive term in the regret having a weak dependence on $T$. Subsequently, we remove the separability assumption and consider the setup of linear mixture MDPs, where the transition kernel $P^*$ has a linear function approximation. With this low-rank structure, we propose novel adaptive algorithms for model selection, and obtain (order-wise) regret identical to that of an oracle with knowledge of the true model class.
Consider any locally checkable labeling problem $\Pi$ in rooted regular trees: there is a finite set of labels $\Sigma$, and for each label $x \in \Sigma$ we specify which label combinations are permitted for the children of an internal node with label $x$ (the leaf nodes are unconstrained). This formalism is expressive enough to capture many classic problems studied in distributed computing, including vertex coloring, edge coloring, and maximal independent set. We show that the distributed computational complexity of any such problem $\Pi$ falls in one of the following classes: it is $O(1)$, $\Theta(\log^* n)$, $\Theta(\log n)$, or $n^{\Theta(1)}$ rounds in trees with $n$ nodes (and all of these classes are nonempty). We show that the complexity of any given problem is the same in all four standard models of distributed graph algorithms: deterministic $\mathsf{LOCAL}$, randomized $\mathsf{LOCAL}$, deterministic $\mathsf{CONGEST}$, and randomized $\mathsf{CONGEST}$. In particular, we show that randomness does not help in this setting, and the complexity class $\Theta(\log \log n)$ does not exist (while it does exist in the broader setting of general trees). We also show how to systematically determine the complexity class of any such problem $\Pi$, i.e., whether $\Pi$ takes $O(1)$, $\Theta(\log^* n)$, $\Theta(\log n)$, or $n^{\Theta(1)}$ rounds. While the algorithm may take exponential time in the size of the description of $\Pi$, it is nevertheless practical: we provide a freely available implementation of the classifier algorithm, and it is fast enough to classify many problems of interest.
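To make the formalism concrete, here is a minimal sketch of a checker for a single labeling. It is not the authors' classifier. The tree encoding (a map from node to its list of children), the names `is_valid_labeling` and `coloring_problem`, and the choice to treat children as unordered are all illustrative assumptions.

```python
from itertools import combinations_with_replacement
from typing import Dict, List, Set, Tuple

# A problem maps each label x to the set of permitted child-label combinations.
# Assumption: children are unordered, so each combination is a sorted tuple.
Problem = Dict[str, Set[Tuple[str, ...]]]


def is_valid_labeling(children: Dict[int, List[int]],
                      labels: Dict[int, str],
                      problem: Problem) -> bool:
    """Check every internal node against the permitted child combinations."""
    for node, kids in children.items():
        if not kids:
            continue  # leaf nodes are unconstrained
        combo = tuple(sorted(labels[k] for k in kids))
        if combo not in problem.get(labels[node], set()):
            return False
    return True


def coloring_problem(colors: List[str], arity: int) -> Problem:
    """Vertex coloring: every child must differ from its parent's color."""
    problem: Problem = {}
    for c in colors:
        others = sorted(x for x in colors if x != c)
        problem[c] = set(combinations_with_replacement(others, arity))
    return problem


# Hypothetical example: a small rooted binary tree (each internal node has 2 children).
tree = {0: [1, 2], 1: [3, 4], 2: [], 3: [], 4: []}
three_coloring = coloring_problem(["a", "b", "c"], arity=2)
good = {0: "a", 1: "b", 2: "c", 3: "a", 4: "c"}
bad = {0: "a", 1: "b", 2: "c", 3: "b", 4: "c"}  # node 3 repeats its parent's color
```

Checking a given labeling is purely local, which is what makes the problem "locally checkable"; the round complexity studied in the abstract concerns how far a node must look to produce such a labeling.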
A $c$-crossing-critical graph is one that has crossing number at least $c$ but each of its proper subgraphs has crossing number less than $c$. Recently, a set of explicit construction rules was identified by Bokal, Oporowski, Richter, and Salazar to generate all large $2$-crossing-critical graphs (i.e., all apart from a finite set of small sporadic graphs). They share the property of containing a generalized Wagner graph $V_{10}$ as a subdivision. In this paper, we study these graphs and establish their order, simple crossing number, edge cover number, clique number, maximum degree, chromatic number, chromatic index, and treewidth. We also show that the graphs are linear-time recognizable and that all our proofs lead to efficient algorithms for the above measures.
Consider a random graph $G$ of size $N$ constructed according to a \textit{graphon} $w \, : \, [0,1]^{2} \to [0,1]$ as follows. First embed $N$ vertices $V = \{v_1, v_2, \ldots, v_N\}$ into the interval $[0,1]$, then for each $i < j$ add an edge between $v_{i}, v_{j}$ with probability $w(v_{i}, v_{j})$. Given only the adjacency matrix of the graph, we might expect to be able to approximately reconstruct the permutation $\sigma$ for which $v_{\sigma(1)} < \ldots < v_{\sigma(N)}$ if $w$ satisfies the following \textit{linear embedding} property introduced in [Janssen 2019]: for each $x$, $w(x,y)$ decreases as $y$ moves away from $x$. For a large and non-parametric family of graphons, we show that (i) the popular spectral seriation algorithm [Atkins 1998] provides a consistent estimator $\hat{\sigma}$ of $\sigma$, and (ii) a small amount of post-processing results in an estimate $\tilde{\sigma}$ that converges to $\sigma$ at a nearly optimal rate, both as $N \rightarrow \infty$.
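The sampling procedure and the basic seriation step can be sketched as follows. This is an illustration only: it assumes uniformly random vertex positions, orders vertices by the Fiedler vector of the graph Laplacian (the core idea of spectral seriation), and uses a plain power iteration in place of a library eigensolver. It does not include the post-processing step mentioned in the abstract. The function names and the example graphon $w(x,y) = 1 - |x-y|$ are hypothetical.

```python
import math
import random
from typing import Callable, List, Tuple


def sample_graphon_graph(n: int, w: Callable[[float, float], float],
                         rng: random.Random) -> Tuple[List[float], List[List[int]]]:
    """Place n vertices in [0,1] (assumed uniform) and add edge {i,j} w.p. w(v_i, v_j)."""
    positions = [rng.random() for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < w(positions[i], positions[j]):
                adj[i][j] = adj[j][i] = 1
    return positions, adj


def spectral_seriation(adj: List[List[int]], iters: int = 1000, seed: int = 0) -> List[int]:
    """Order vertices by an approximate Fiedler vector of the Laplacian L = D - A."""
    n = len(adj)
    if n < 2:
        return list(range(n))
    deg = [sum(row) for row in adj]
    # shift exceeds the largest Laplacian eigenvalue (at most 2 * max degree),
    # so power iteration on (shift*I - L) favours the smallest eigenvalues of L.
    shift = 2.0 * max(deg) + 1.0
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    for _ in range(iters):
        mean = sum(x) / n
        x = [v - mean for v in x]  # project out the constant eigenvector
        y = [(shift - deg[i]) * x[i] + sum(adj[i][j] * x[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return sorted(range(n), key=lambda i: x[i])


# Hypothetical graphon with the linear embedding property: w decreases in |x - y|.
def example_graphon(x: float, y: float) -> float:
    return max(0.0, 1.0 - abs(x - y))
```

The recovered ordering is only defined up to reversal, since the sign of the Fiedler vector is arbitrary.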
We investigate the computational complexity of computing the Hausdorff distance. Specifically, we show that the decision problem of whether the Hausdorff distance of two semi-algebraic sets is bounded by a given threshold is complete for the complexity class $\forall\exists_<\mathbb{R}$. This implies that the problem is NP-, co-NP-, $\exists\mathbb{R}$- and $\forall\mathbb{R}$-hard.
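For orientation, the Hausdorff distance and the associated threshold decision problem are easy to state in code for finite point sets. The sketch below covers only that finite case, which is an assumption made for illustration; the hardness result above concerns semi-algebraic sets, where no such direct enumeration is available. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def directed_hausdorff(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Largest distance from a point of a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def hausdorff_at_most(a: Sequence[Point], b: Sequence[Point], threshold: float) -> bool:
    """Decision version: is the Hausdorff distance bounded by the threshold?"""
    return hausdorff(a, b) <= threshold


A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (0.0, 3.0)]
```

The nested max-min in `directed_hausdorff` mirrors the "for all, there exists" quantifier alternation that the class name $\forall\exists_<\mathbb{R}$ refers to.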
The k-Clique problem is a canonical hard problem in parameterized complexity. In this paper, we study the parameterized complexity of approximating the k-Clique problem where an integer k and a graph G on n vertices are given as input, and the goal is to find a clique of size at least k/F(k) whenever the graph G has a clique of size k. When such an algorithm runs in time T(k)poly(n) (i.e., FPT-time) for some computable function T, it is said to be an F(k)-FPT-approximation algorithm for the k-Clique problem. Although the non-existence of an F(k)-FPT-approximation algorithm for any computable sublinear function F is known under gap-ETH [Chalermsook et al., FOCS 2017], it has remained a long-standing open problem to prove the same inapproximability result under the more standard and weaker assumption, W[1]$\neq$FPT. In a recent breakthrough, Lin [STOC 2021] ruled out constant-factor (i.e., F(k)=O(1)) FPT-approximation algorithms under W[1]$\neq$FPT. In this paper, we improve this inapproximability result (under the same assumption) to rule out every $F(k)=k^{1/H(k)}$ factor FPT-approximation algorithm for any increasing computable function H (for example $H(k)=\log^\ast k$). Our main technical contribution is introducing list decoding of Hadamard codes over large prime fields into the proof framework of Lin.
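The approximation guarantee can be made concrete with a small sketch. The brute-force search below runs in roughly $n^{k}$ time, so it is not an FPT algorithm and has nothing to do with the paper's techniques; it only illustrates what counts as an acceptable answer for a given factor F(k). The graph encoding (a map from vertex to its neighbour set) and all function names are illustrative assumptions.

```python
from itertools import combinations
from typing import Callable, Dict, Optional, Sequence, Set, Tuple

Graph = Dict[int, Set[int]]


def is_clique(adj: Graph, vertices: Sequence[int]) -> bool:
    """True if every pair of the given vertices is adjacent."""
    return all(v in adj[u] for u, v in combinations(vertices, 2))


def find_clique(adj: Graph, size: int) -> Optional[Tuple[int, ...]]:
    """Brute force over all vertex subsets of the given size (not FPT-time)."""
    for cand in combinations(sorted(adj), size):
        if is_clique(adj, cand):
            return cand
    return None


def is_valid_approx_answer(adj: Graph, k: int, F: Callable[[int], float],
                           answer: Sequence[int]) -> bool:
    """An F(k)-approximate answer is a clique with at least k / F(k) vertices."""
    return is_clique(adj, answer) and len(answer) >= k / F(k)


# Hypothetical example: K4 on vertices 0..3 plus a pendant vertex 4 attached to 0.
G: Graph = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}, 4: {0}}
```

With k = 4 and the constant factor F(k) = 2, any edge of `G` is already an acceptable answer, which shows how much weaker the approximate goal is than finding the full clique.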
The mean width of a convex body is the average distance between parallel supporting hyperplanes when the normal direction is chosen uniformly over the sphere. The Simplex Mean Width Conjecture (SMWC) is a longstanding open problem that says the regular simplex has maximum mean width among all simplices contained in the unit ball and is unique up to isometry. We give a self-contained proof of the SMWC in $d$ dimensions. The main idea is that when discussing mean width, $d+1$ vertices $v_i\in\mathbb{S}^{d-1}$ naturally divide $\mathbb{S}^{d-1}$ into $d+1$ Voronoi cells and, conversely, any partition of $\mathbb{S}^{d-1}$ suggests selecting the centroids of its regions as vertices. We will show that these two conditions are enough to ensure that a simplex with maximum mean width is a regular simplex.
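The definition of mean width can be checked numerically in the plane ($d = 2$), where a simplex is a triangle and the mean width equals the perimeter divided by $\pi$ (Cauchy's formula). The sketch below is a sanity check of the definition under that assumption, not part of the proof; the function names and the midpoint-rule integration are illustrative choices.

```python
import math
from typing import List, Sequence, Tuple

Point2 = Tuple[float, float]


def mean_width_2d(vertices: Sequence[Point2], samples: int = 20000) -> float:
    """Average over directions u of the gap between the two supporting lines normal to u."""
    total = 0.0
    for s in range(samples):
        theta = 2.0 * math.pi * (s + 0.5) / samples  # midpoint rule on the circle
        ux, uy = math.cos(theta), math.sin(theta)
        proj = [x * ux + y * uy for x, y in vertices]
        total += max(proj) - min(proj)  # width of the polygon in direction u
    return total / samples


def inscribed_polygon(angles: Sequence[float]) -> List[Point2]:
    """Vertices on the unit circle at the given angles."""
    return [(math.cos(a), math.sin(a)) for a in angles]


regular = inscribed_polygon([0.0, 2.0 * math.pi / 3.0, 4.0 * math.pi / 3.0])
right = inscribed_polygon([0.0, math.pi / 2.0, math.pi])  # a non-regular inscribed triangle
```

For the regular triangle the estimate should agree with $3\sqrt{3}/\pi$, and the right triangle should come out strictly smaller, in line with the planar case of the conjecture.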
This paper studies the existence of finite equational axiomatisations of the interleaving parallel composition operator modulo the behavioural equivalences in van Glabbeek's linear time-branching time spectrum. In the setting of the process algebra BCCSP over a finite set of actions, we provide finite, ground-complete axiomatisations for various simulation and (decorated) trace semantics. We also show that no congruence over BCCSP that includes bisimilarity and is included in possible futures equivalence has a finite, ground-complete axiomatisation; this negative result applies to all the nested trace and nested simulation semantics.
Automatic differentiation (AD) aims to compute derivatives of user-defined functions, but in Turing-complete languages, this simple specification does not fully capture AD's behavior: AD sometimes disagrees with the true derivative of a differentiable program, and when AD is applied to non-differentiable or effectful programs, it is unclear what guarantees (if any) hold of the resulting code. We study an expressive differentiable programming language, with piecewise-analytic primitives, higher-order functions, and general recursion. Our main result is that even in this general setting, a version of Lee et al. [2020]'s correctness theorem (originally proven for a first-order language without partiality or recursion) holds: all programs denote so-called $\omega$PAP functions, and AD computes correct intensional derivatives of them. Mazza and Pagani [2021]'s recent theorem, that AD disagrees with the true derivative of a differentiable recursive program at a measure-zero set of inputs, can be derived as a straightforward corollary of this fact. We also apply the framework to study probabilistic programs, and recover a recent result from Mak et al. [2021] via a novel denotational argument.
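A standard small example of AD disagreeing with the true derivative can be sketched with a toy forward-mode implementation. The `Dual` class below is a hypothetical minimal AD, not the language studied in the paper, and it assumes the common convention that the derivative of `relu` at 0 is taken to be 0.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Dual:
    """A value paired with its derivative (forward-mode AD)."""
    val: float
    der: float

    def __neg__(self) -> "Dual":
        return Dual(-self.val, -self.der)

    def __sub__(self, other: "Dual") -> "Dual":
        return Dual(self.val - other.val, self.der - other.der)


def relu(x: Dual) -> Dual:
    # Piecewise-analytic primitive; assumed convention: derivative 0 at x = 0.
    return x if x.val > 0 else Dual(0.0, 0.0)


def f(x: Dual) -> Dual:
    # Mathematically f(x) = x for every real x, so the true derivative is 1 everywhere.
    return relu(x) - relu(-x)


def ad_derivative(fn: Callable[[Dual], Dual], x0: float) -> float:
    """Derivative of fn at x0 as computed by the toy forward-mode AD."""
    return fn(Dual(x0, 1.0)).der
```

Here `ad_derivative(f, x0)` matches the true derivative for every nonzero `x0` but differs at `x0 = 0`, a single point and hence a measure-zero set, which is the kind of discrepancy the abstract describes.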
We consider the L(p,q)-Edge-Labelling problem, which is the edge variant of the well-known L(p,q)-Labelling problem. So far, the complexity of this problem has only been partially classified. We complete this study for all nonnegative p and q by showing that, whenever (p,q) is not (0,0), the L(p,q)-Edge-Labelling problem is NP-complete. We do this by proving that for all nonnegative p and q, except p=q=0, there exists an integer k so that L(p,q)-Edge-k-Labelling is NP-complete.
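The abstract does not restate the constraints, so the sketch below assumes the usual definition: labels of edges that share an endpoint must differ by at least p, and labels of edges at distance two (disjoint edges joined by a third edge) must differ by at least q. It only verifies a given labelling on a simple graph and leaves the permitted label range aside. The function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, Optional, Sequence, Tuple

Edge = Tuple[str, str]


def edge_distance_class(e: Edge, f: Edge, edges: Sequence[Edge]) -> Optional[int]:
    """Return 1 if e and f share an endpoint, 2 if a third edge touches both, else None."""
    if set(e) & set(f):
        return 1
    for g in edges:
        if set(g) & set(e) and set(g) & set(f):
            return 2
    return None


def is_valid_edge_labelling(edges: Sequence[Edge], labels: Dict[Edge, int],
                            p: int, q: int) -> bool:
    """Check the assumed L(p,q) constraints for every pair of edges."""
    for e, f in combinations(edges, 2):
        d = edge_distance_class(e, f, edges)
        gap = abs(labels[e] - labels[f])
        if d == 1 and gap < p:
            return False
        if d == 2 and gap < q:
            return False
    return True


# Hypothetical example: the path a - b - c - d with three edges.
path_edges = [("a", "b"), ("b", "c"), ("c", "d")]
labelling = {("a", "b"): 0, ("b", "c"): 2, ("c", "d"): 4}
clashing = {("a", "b"): 0, ("b", "c"): 2, ("c", "d"): 0}
```

Verification takes polynomial time, which is consistent with membership in NP; the difficulty lies in deciding whether a valid labelling with a bounded set of labels exists.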