Higher Type Arithmetic (HA$^w$) is a first-order many-sorted theory. It is a conservative extension of Heyting Arithmetic obtained by extending the syntax of terms to all of System T: the objects of interest here are the functionals of higher types. While equality between natural numbers is specified by the axioms of Peano, how can equality between functionals be defined? From this question, different versions of HA$^w$ arise, such as an extensional version (E-HA$^w$) and an intentional version (I-HA$^w$). In this work, we will see how the study of partial equivalence relations leads us to design a translation by parametricity from E-HA$^w$ to HA$^w$.
The notion of $\alpha$-equivalence between $\lambda$-terms is commonly used to identify terms that are considered equal. However, due to the primitive treatment of free variables, this notion falls short when comparing subterms occurring within a larger context. Depending on the usage of the Barendregt convention (choosing different variable names for all involved binders), it will equate either too few or too many subterms. We introduce a formal notion of context-sensitive $\alpha$-equivalence, where two open terms can be compared within a context that resolves their free variables. We show that this equivalence coincides exactly with the notion of bisimulation equivalence. Furthermore, we present an efficient $O(n\log n)$ runtime algorithm that identifies $\lambda$-terms modulo context-sensitive $\alpha$-equivalence, improving upon a previously established $O(n\log^2 n)$ bound for a hashing modulo ordinary $\alpha$-equivalence by Maziarz et al. Hashing $\lambda$-terms is useful in many applications that require common subterm elimination and structure sharing. We employ the algorithm to obtain a large-scale, densely packed, interconnected graph of mathematical knowledge from the Coq proof assistant for machine learning purposes.
Autonomous Micro Aerial Vehicles (MAVs) such as quadrotors equipped with manipulation mechanisms have the potential to assist humans in tasks such as construction and package delivery. Cables are a promising option for manipulation mechanisms due to their low weight, low cost, and simple design. However, designing control and planning strategies for cable mechanisms presents challenges due to indirect load actuation, nonlinear configuration space, and highly coupled system dynamics. In this paper, we propose a novel Nonlinear Model Predictive Control (NMPC) method that enables a team of quadrotors to manipulate a rigid-body payload in all 6 degrees of freedom via suspended cables. Our approach can concurrently exploit, as part of the receding horizon optimization, the available mechanical system redundancies to perform additional tasks such as inter-robot separation and obstacle avoidance while respecting payload dynamics and actuator constraints. To address real-time computational requirements and scalability, we employ a lightweight state vector parametrization that includes only payload states in all six degrees of freedom. This also enables the planning of trajectories on the $SE(3)$ manifold load configuration space, thereby also reducing planning complexity. We validate the proposed approach through simulation and real-world experiments.
This paper investigates the rate-distortion function, under a squared error distortion $D$, for an $n$-dimensional random vector uniformly distributed on an $(n-1)$-sphere of radius $R$. First, an expression for the rate-distortion function is derived for any values of $n$, $D$, and $R$. Second, two types of asymptotics with respect to the rate-distortion function of a Gaussian source are characterized. More specifically, these asymptotics concern the low-distortion regime (that is, $D \to 0$) and the high-dimensional regime (that is, $n \to \infty$).
State Space Models (SSMs) have become serious contenders in the field of sequential modeling, challenging the dominance of Transformers. At the same time, Mixture of Experts (MoE) has significantly improved Transformer-based LLMs, including recent state-of-the-art open-source models. We propose that to unlock the potential of SSMs for scaling, they should be combined with MoE. We showcase this on Mamba, a recent SSM-based model that achieves remarkable, Transformer-like performance. Our model, MoE-Mamba, outperforms both Mamba and Transformer-MoE. In particular, MoE-Mamba reaches the same performance as Mamba in 2.2x less training steps while preserving the inference performance gains of Mamba against the Transformer.
We propose a new approach for approximating functions in $C([0,1]^d)$ via Kolmogorov superposition theorem (KST) based on the linear spline approximation of the K-outer function in Kolmogorov superposition representation. We improve the results in \cite{LaiShenKST21} by showing that the optimal approximation rate based on our proposed approach is $\mathcal{O}(\frac{1}{n^2})$, with $n$ being the number of knots over $[0,1]$, and the approximation constant increases linearly in $d$. We show that there is a dense subclass in $C([0,1]^d)$ whose approximation can achieve such optimal rate, and the number of parameters needed in such approximation is at most $\mathcal{O}(nd)$. Moreover, for $d\geq 4$, we apply the tensor product spline denoising technique to smooth the KB-splines and get the corresponding LKB-splines. We use those LKB-splines as the basis to approximate functions for the cases when $d=4$ and $d=6$, which extends the results in \cite{LaiShenKST21} for $d=2$ and $d=3$. Based on the idea of pivotal data locations introduced in \cite{LaiShenKST21}, we validate via numerical experiments that fewer than $\mathcal{O}(nd)$ function values are enough to achieve the approximation rates such as $\mathcal{O}(\frac{1}{n})$ or $\mathcal{O}(\frac{1}{n^2})$ based on the smoothness of the K-outer function. Finally, we demonstrate that our approach can be applied to numerically solving partial differential equation such as the Poisson equation with accurate approximation results.
In 2021, Casares, Colcombet and Fijalkow introduced the Alternating Cycle Decomposition (ACD), a structure used to define optimal transformations of Muller into parity automata and to obtain theoretical results about the possibility of relabelling automata with different acceptance conditions. In this work, we study the complexity of computing the ACD and its DAG-version, proving that this can be done in polynomial time for suitable representations of the acceptance condition of the Muller automaton. As corollaries, we obtain that we can decide typeness of Muller automata in polynomial time, as well as the parity index of the languages they recognise. Furthermore, we show that we can minimise in polynomial time the number of colours (resp. Rabin pairs) defining a Muller (resp. Rabin) acceptance condition, but that these problems become NP-hard when taking into account the structure of an automaton using such a condition.
$k$-clique listing is a vital graph mining operator with diverse applications in various networks. The state-of-the-art algorithms all adopt a branch-and-bound (BB) framework with a vertex-oriented branching strategy (called VBBkC), which forms a sub-branch by expanding a partial $k$-clique with a vertex. These algorithms have the time complexity of $O(k m (\delta/2)^{k-2})$, where $m$ is the number of edges in the graph and $\delta$ is the degeneracy of the graph. In this paper, we propose a BB framework with a new edge-oriented branching (called EBBkC), which forms a sub-branch by expanding a partial $k$-clique with two vertices that connect each other (which correspond to an edge). We explore various edge orderings for EBBkC such that it achieves a time complexity of $O(\delta m + k m (\tau/2)^{k-2})$, where $\tau$ is an integer related to the maximum truss number of the graph and we have $\tau < \delta$. The time complexity of EBBkC is better than that of VBBkC algorithms for $k>3$ since both $O(\delta m)$ and $O(k m (\tau/2)^{k-2})$ are bounded by $O(k m (\delta/2)^{k-2})$. Furthermore, we develop specialized algorithms for sub-branches on dense graphs so that we can early-terminate them and apply the specialized algorithms. We conduct extensive experiments on 19 real graphs, and the results show that our newly developed EBBkC-based algorithms with the early termination technique consistently and largely outperform the state-of-the-art (VBBkC-based) algorithms.
Program synthesis is the task of automatically generating code based on a specification. In Syntax-Guided Synthesis (SyGuS) this specification is a combination of a syntactic template and a logical formula, and the result is guaranteed to satisfy both. We present a reinforcement-learning guided algorithm for SyGuS which uses Monte-Carlo Tree Search (MCTS) to search the space of candidate solutions. Our algorithm learns policy and value functions which, combined with the upper confidence bound for trees, allow it to balance exploration and exploitation. A common challenge in applying machine learning approaches to syntax-guided synthesis is the scarcity of training data. To address this, we present a method for automatically generating training data for SyGuS based on anti-unification of existing first-order satisfiability problems, which we use to train our MCTS policy. We implement and evaluate this setup and demonstrate that learned policy and value improve the synthesis performance over a baseline by over 26 percentage points in the training and testing sets. Our tool outperforms state-of-the-art tool cvc5 on the training set and performs comparably in terms of the total number of problems solved on the testing set (solving 23% of the benchmarks on which cvc5 fails). We make our data set publicly available, to enable further application of machine learning methods to the SyGuS problem.
Graph Neural Networks (GNNs) have proven to be useful for many different practical applications. However, many existing GNN models have implicitly assumed homophily among the nodes connected in the graph, and therefore have largely overlooked the important setting of heterophily, where most connected nodes are from different classes. In this work, we propose a novel framework called CPGNN that generalizes GNNs for graphs with either homophily or heterophily. The proposed framework incorporates an interpretable compatibility matrix for modeling the heterophily or homophily level in the graph, which can be learned in an end-to-end fashion, enabling it to go beyond the assumption of strong homophily. Theoretically, we show that replacing the compatibility matrix in our framework with the identity (which represents pure homophily) reduces to GCN. Our extensive experiments demonstrate the effectiveness of our approach in more realistic and challenging experimental settings with significantly less training data compared to previous works: CPGNN variants achieve state-of-the-art results in heterophily settings with or without contextual node features, while maintaining comparable performance in homophily settings.
Manually labeling objects by tracing their boundaries is a laborious process. In Polygon-RNN++ the authors proposed Polygon-RNN that produces polygonal annotations in a recurrent manner using a CNN-RNN architecture, allowing interactive correction via humans-in-the-loop. We propose a new framework that alleviates the sequential nature of Polygon-RNN, by predicting all vertices simultaneously using a Graph Convolutional Network (GCN). Our model is trained end-to-end. It supports object annotation by either polygons or splines, facilitating labeling efficiency for both line-based and curved objects. We show that Curve-GCN outperforms all existing approaches in automatic mode, including the powerful PSP-DeepLab and is significantly more efficient in interactive mode than Polygon-RNN++. Our model runs at 29.3ms in automatic, and 2.6ms in interactive mode, making it 10x and 100x faster than Polygon-RNN++.