A central problem in proof theory is that of finding criteria for identity of proofs, that is, for when two distinct formal derivations can be taken as denoting the same logical argument. In the literature one finds criteria which are either based on proof normalization (two derivations denote the same proof when they have the same normal form) or on the association of formal derivations with graph-theoretic structures (two derivations denote the same proof when they are associated with the same graph). In this paper we argue for a new criterion for identity of proofs which arises from the interpretation of formal rules and derivations as natural transformations of a suitable kind. We show that the naturality conditions arising from this interpretation capture in a uniform and elegant way several forms of "rule permutations" which are found in proof systems for propositional, first- and second-order logic.
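For illustration (a standard sequent-calculus example of our own choosing, not taken from the paper), the following two derivations of $\Gamma, A \wedge B \vdash C \vee D$ differ only in the order in which the rules $(\wedge L)$ and $(\vee R)$ are applied; a criterion based on naturality is meant to identify precisely such pairs as denoting the same proof:
$$\dfrac{\dfrac{\Gamma, A, B \vdash C}{\Gamma, A \wedge B \vdash C}\;(\wedge L)}{\Gamma, A \wedge B \vdash C \vee D}\;(\vee R)
\qquad\longleftrightarrow\qquad
\dfrac{\dfrac{\Gamma, A, B \vdash C}{\Gamma, A, B \vdash C \vee D}\;(\vee R)}{\Gamma, A \wedge B \vdash C \vee D}\;(\wedge L)$$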
Following an extensive simulation study comparing the operating characteristics of three different procedures used for establishing equivalence (the frequentist "TOST", the Bayesian "HDI-ROPE", and the Bayes factor interval null procedure), Linde et al. (2021) conclude with the recommendation that "researchers rely more on the Bayes factor interval null approach for quantifying evidence for equivalence." We redo the simulation study of Linde et al. (2021) in its entirety, but with the different procedures calibrated to have the same predetermined maximum type I error rate. Our results suggest that the Bayes factor, HDI-ROPE, and frequentist equivalence-testing procedures are all essentially equivalent when it comes to predicting equivalence. In general, any claim that frequentist testing is empirically better or worse than Bayesian testing seems dubious at best. Once one decides which underlying principle to subscribe to in tackling a given problem, the method follows naturally. Bearing in mind that each procedure can be reverse-engineered from the others (at least approximately), trying to use empirical performance to argue for one approach over another seems like tilting at windmills.
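For reference, here is a minimal sketch of the frequentist side of that comparison: a one-sample TOST (two one-sided tests) against an equivalence margin. The margin, sample, and alpha level are illustrative assumptions, not values from the simulation study.

```python
import numpy as np
from scipy import stats

def tost_one_sample(x, low, high, alpha=0.05):
    """Two one-sided tests: is the mean of x inside the margin (low, high)?
    Equivalence is declared only if BOTH one-sided tests reject, i.e.
    max(p_lower, p_upper) < alpha.  Illustrative margins, not from the paper."""
    n = len(x)
    mean, se = np.mean(x), np.std(x, ddof=1) / np.sqrt(n)
    p_lower = stats.t.sf((mean - low) / se, df=n - 1)    # H0: mean <= low
    p_upper = stats.t.cdf((mean - high) / se, df=n - 1)  # H0: mean >= high
    return max(p_lower, p_upper) < alpha, (p_lower, p_upper)

rng = np.random.default_rng(1)
x = rng.normal(loc=0.02, scale=1.0, size=200)    # data centred close to zero
print(tost_one_sample(x, low=-0.2, high=0.2))    # equivalence margin of +/- 0.2
```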
Convolution and self-attention are two powerful techniques for representation learning, and they are usually considered as two peer approaches that are distinct from each other. In this paper, we show that there exists a strong underlying relation between them, in the sense that the bulk of the computations of these two paradigms are in fact done with the same operation. Specifically, we first show that a traditional convolution with kernel size k x k can be decomposed into k^2 individual 1x1 convolutions, followed by shift and summation operations. Then, we interpret the projections of queries, keys, and values in the self-attention module as multiple 1x1 convolutions, followed by the computation of attention weights and the aggregation of the values. Therefore, the first stage of both modules consists of the same operation. More importantly, the first stage accounts for the dominant share of the computational complexity (quadratic in the channel size) compared to the second stage. This observation naturally leads to an elegant integration of these two seemingly distinct paradigms, i.e., a mixed model that enjoys the benefits of both self-Attention and Convolution (ACmix), while having minimal computational overhead compared to its pure convolution or self-attention counterparts. Extensive experiments show that our model achieves consistently improved results over competitive baselines on image recognition and downstream tasks. Code and pre-trained models will be released at //github.com/Panxuran/ACmix and //gitee.com/mindspore/models.
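A minimal numerical sketch of the decomposition described above (our own illustration in PyTorch with a 3x3 kernel, not the authors' released implementation): a k x k convolution equals the sum of k^2 shifted 1x1 convolutions.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 4, 8, 8)               # (batch, channels, height, width)
w = torch.randn(6, 4, 3, 3)               # full 3x3 kernel

y_ref = F.conv2d(x, w, padding=1)         # ordinary 3x3 convolution

# Decomposition: nine 1x1 convolutions, then shift and sum.
y_dec = torch.zeros_like(y_ref)
for p in range(3):
    for q in range(3):
        w_pq = w[:, :, p:p + 1, q:q + 1]   # one 1x1 slice of the kernel
        part = F.conv2d(x, w_pq)           # 1x1 convolution
        y_dec += torch.roll(part, shifts=(1 - p, 1 - q), dims=(2, 3))  # shift by kernel offset

# Interior pixels match exactly; borders differ only because torch.roll
# wraps around instead of shifting in zeros.
print(torch.allclose(y_ref[..., 1:-1, 1:-1], y_dec[..., 1:-1, 1:-1], atol=1e-5))
```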
We study the problem of checking the existence of a step-by-step transformation of $d$-regular induced subgraphs in a graph, where $d \ge 0$ and each step in the transformation must follow a fixed reconfiguration rule. Our problem for $d=0$ is equivalent to \textsc{Independent Set Reconfiguration}, which is one of the most well-studied reconfiguration problems. In this paper, we systematically investigate the complexity of the problem, in particular, on chordal graphs and bipartite graphs. Our results stand in interesting contrast to known results for \textsc{Independent Set Reconfiguration}.
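To make the $d=0$ case concrete, the following brute-force sketch (our own illustration, not an algorithm from the paper) decides \textsc{Independent Set Reconfiguration} under the token-jumping rule by breadth-first search over the reconfiguration graph; it is exponential and only meant to spell out the problem statement.

```python
from itertools import combinations
from collections import deque

def is_independent(graph, s):
    return all(v not in graph[u] for u, v in combinations(s, 2))

def reconfigurable(graph, start, target):
    """Token jumping: one vertex is replaced per step, and every
    intermediate set must remain independent (brute-force BFS)."""
    start, target = frozenset(start), frozenset(target)
    seen, queue = {start}, deque([start])
    while queue:
        s = queue.popleft()
        if s == target:
            return True
        for out in s:                        # remove one token ...
            for new in graph:                # ... and place it on another vertex
                if new in s:
                    continue
                t = frozenset(s - {out} | {new})
                if is_independent(graph, t) and t not in seen:
                    seen.add(t)
                    queue.append(t)
    return False

# 4-cycle a-b-c-d-a: its two maximum independent sets cannot reach each other
graph = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
print(reconfigurable(graph, {"a", "c"}, {"b", "d"}))   # -> False
```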
The present paper mainly studies limits and constructions of insertion and deletion (insdel for short) codes. The paper can be divided into two parts. The first part focuses on various bounds, while the second part concentrates on constructions of insdel codes. Although the insdel-metric Singleton bound has been derived before, it is still unknown whether there are any nontrivial codes achieving this bound. Our first result shows that no nontrivial insdel code achieves the insdel-metric Singleton bound. The second bound shows that every $[n,k]$ Reed-Solomon code has insdel distance upper bounded by $2n-4k+4$, and it is known in the literature that an $[n,k]$ Reed-Solomon code can have insdel distance $2n-4k+4$ as long as the field size is sufficiently large. The third bound shows a trade-off between the insdel distance and the code alphabet size for codes achieving the Hamming-metric Singleton bound. In the second part of the paper, we first provide a non-explicit construction of nonlinear codes that approach the insdel-metric Singleton bound arbitrarily closely when the code alphabet size is sufficiently large. The second construction gives two-dimensional Reed-Solomon codes of length $n$ and insdel distance $2n-4$ with field size $q=O(n^5)$.
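As background for the bounds above: the insdel distance between two words $a$ and $b$ is the minimum number of insertions and deletions needed to transform one into the other, which equals $|a|+|b|-2\,\mathrm{LCS}(a,b)$, and the insdel distance of a code is the minimum over all pairs of distinct codewords. A small illustrative sketch (a toy code of our own, not a construction from the paper):

```python
from itertools import combinations

def lcs(a, b):
    """Length of a longest common subsequence (standard dynamic program)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]

def insdel_distance(code):
    """Minimum of |a| + |b| - 2*LCS(a, b) over distinct codewords a, b."""
    return min(len(a) + len(b) - 2 * lcs(a, b) for a, b in combinations(code, 2))

code = ["0120", "1201", "2012"]     # toy ternary code, purely illustrative
print(insdel_distance(code))        # -> 2
```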
Loop acceleration can be used to prove safety, reachability, runtime bounds, and (non-)termination of programs operating on integers. To this end, a variety of acceleration techniques has been proposed. However, all of them are monolithic: Either they accelerate a loop successfully, or they fail completely. In contrast, we present a calculus that allows for combining acceleration techniques in a modular way and we show how to integrate many existing acceleration techniques into our calculus. Moreover, we propose two novel acceleration techniques that can be incorporated into our calculus seamlessly. Some of these acceleration techniques apply only to non-terminating loops. Thus, combining them with our novel calculus results in a new, modular approach for proving non-termination. An empirical evaluation demonstrates the applicability of our approach, both for loop acceleration and for proving non-termination.
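To make the notion of acceleration concrete (a toy example of our own, not one of the paper's techniques): consider the loop $\mathtt{while}\ (x > 0)\ \{\, x := x - 1;\ y := y + 3 \,\}$. Accelerating it means replacing its transition relation by a single formula that describes the effect of any number $k \ge 1$ of consecutive iterations:
$$\exists k.\; k \ge 1 \;\wedge\; x - (k - 1) > 0 \;\wedge\; x' = x - k \;\wedge\; y' = y + 3k,$$
where the second conjunct ensures that the loop guard holds before each of the $k$ iterations.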
This paper deals with the $\lambda$-labeling and $L(2,1)$-coloring of simple graphs. A $\lambda$-labeling of a graph $G$ is any labeling of the vertices of $G$ with distinct labels such that any two adjacent vertices receive labels which differ by at least two. Also, an $L(2,1)$-coloring of $G$ is any labeling of the vertices of $G$ such that any two adjacent vertices receive labels which differ by at least two and any two vertices at distance two receive distinct labels. Assume that a partial $\lambda$-labeling $f$ is given in a graph $G$. A general question is whether $f$ can be extended to a $\lambda$-labeling of $G$. We show that the extension is feasible if and only if a Hamiltonian path consistent with some distance constraints exists in the complement of $G$. Then we consider line graphs of bipartite multigraphs and determine the minimum number of labels in $L(2,1)$-colorings and $\lambda$-labelings of these graphs. In fact, we obtain easily computable formulas for the path covering number and the maximum path of the complement of these graphs. We obtain a polynomial-time algorithm which generates all Hamiltonian paths in the related graphs. A special case is the Cartesian product graph $K_n\Box K_n$ and the generation of $\lambda$-squares.
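The two constraints just defined are easy to check mechanically; the following small sketch (illustrative only, not from the paper) verifies whether a labeling is a valid $L(2,1)$-coloring, here on a 5-cycle with an assumed labeling.

```python
from itertools import combinations

def is_L21_coloring(adj, label):
    """Adjacent vertices: labels differ by at least 2.
    Vertices at distance exactly 2: labels are distinct."""
    for u, v in combinations(adj, 2):
        if v in adj[u]:                      # distance 1
            if abs(label[u] - label[v]) < 2:
                return False
        elif adj[u] & adj[v]:                # distance 2 (common neighbour, not adjacent)
            if label[u] == label[v]:
                return False
    return True

# 5-cycle with the labeling 0, 2, 4, 1, 3 (an assumed example)
adj = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
label = {0: 0, 1: 2, 2: 4, 3: 1, 4: 3}
print(is_L21_coloring(adj, label))           # -> True
```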
Quantum Error Correction (QEC) is essential for fault-tolerant quantum computation, and its implementation is a very sophisticated process involving both quantum and classical hardware. Formulating and verifying the decomposition of logical operations into physical ones is a challenge in itself. In this paper, we propose QECV, a verification framework that can efficiently verify the formal correctness of stabilizer codes, arguably the most important class of QEC codes. QECV first comes with a concise language, QECV-Lang, in which stabilizers are treated as first-class objects, to represent QEC programs. Stabilizers are also used as predicates in our new assertion language, QECV-Assn, as logical and arithmetic operations of stabilizers can be naturally defined. We derive a sound quantum Hoare logic proof system with a set of inference rules for QECV to efficiently reason about the correctness of QEC programs. We demonstrate the effectiveness of QECV with both a theoretical complexity analysis and in-depth case studies of two well-known stabilizer QEC codes, the repetition code and the surface code.
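As background for the first case study mentioned above (a classical, illustrative simulation of our own, not the QECV framework itself): the three-qubit bit-flip repetition code is stabilized by $Z_1Z_2$ and $Z_2Z_3$, and measuring these stabilizers yields a syndrome that locates a single bit-flip error.

```python
# Classical simulation of syndrome extraction for the 3-qubit bit-flip
# repetition code (illustrative only).
def syndrome(bits):
    """Parities corresponding to the stabilizers Z1Z2 and Z2Z3."""
    return (bits[0] ^ bits[1], bits[1] ^ bits[2])

def correct(bits):
    lookup = {(0, 0): None, (1, 0): 0, (1, 1): 1, (0, 1): 2}   # syndrome -> flipped qubit
    flip = lookup[syndrome(bits)]
    if flip is not None:
        bits[flip] ^= 1
    return bits

# encode logical one as 111, flip the middle bit, then correct
print(correct([1, 0, 1]))   # -> [1, 1, 1]
```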
A Bayesian Discrepancy Test (BDT) is proposed to evaluate the discrepancy between a given hypothesis and the available information (prior law and data). The proposed measure of evidence has the properties of consistency and invariance. After presenting the similarities and differences between the BDT and other Bayesian tests, we proceed with the analysis of some multiparametric case studies, showing the properties of the BDT, among them conceptual and interpretative simplicity and the ability to deal with complex case studies.
The goal of the Out-of-Distribution (OOD) generalization problem is to train a predictor that generalizes across all environments. Popular approaches in this field use the hypothesis that such a predictor should be an \textit{invariant predictor} that captures the mechanism that remains constant across environments. While these approaches have been experimentally successful in various case studies, there is still much room for the theoretical validation of this hypothesis. This paper presents a new set of theoretical conditions necessary for an invariant predictor to achieve OOD optimality. Our theory not only applies to non-linear cases, but also generalizes the necessary condition used in \citet{rojas2018invariant}. We also derive an Inter Gradient Alignment algorithm from our theory and demonstrate its competitiveness on MNIST-derived benchmark datasets as well as on two of the three \textit{Invariance Unit Tests} proposed by \citet{aubinlinear}.
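A minimal sketch of the general idea behind inter-environment gradient alignment (our own illustrative formulation in PyTorch; the model, data, and penalty weight are assumptions, and this is not necessarily the authors' exact algorithm): per-environment risk gradients are encouraged to point in the same direction by rewarding their pairwise inner products.

```python
import torch

def gradient_alignment_objective(model, loss_fn, env_batches, penalty=1.0):
    """Mean risk across environments minus a gradient-alignment bonus
    (pairwise inner products of per-environment gradients)."""
    grads, risks = [], []
    for x, y in env_batches:
        risk = loss_fn(model(x), y)
        risks.append(risk)
        g = torch.autograd.grad(risk, list(model.parameters()), create_graph=True)
        grads.append(torch.cat([p.reshape(-1) for p in g]))
    align = sum(torch.dot(grads[i], grads[j])
                for i in range(len(grads)) for j in range(i + 1, len(grads)))
    return torch.stack(risks).mean() - penalty * align

# usage sketch on hypothetical toy data
model = torch.nn.Linear(3, 1)
loss_fn = torch.nn.MSELoss()
envs = [(torch.randn(8, 3), torch.randn(8, 1)) for _ in range(2)]
gradient_alignment_objective(model, loss_fn, envs).backward()
```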