Formal theories of arithmetic have traditionally been based on either classical or intuitionistic logic, leading to the development of Peano and Heyting arithmetic, respectively. We propose a use $\mu$MALL as a formal theory of arithmetic based on linear logic. This formal system is presented as a sequent calculus proof system that extends the standard proof system for multiplicative-additive linear logic (MALL) with the addition of the logical connectives universal and existential quantifiers (first-order quantifiers), term equality and non-equality, and the least and greatest fixed point operators. We first demonstrate how functions defined using $\mu$MALL relational specifications can be computed using a simple proof search algorithm. By incorporating weakening and contraction into $\mu$MALL, we obtain $\mu$LK+, a natural candidate for a classical sequent calculus for arithmetic. While important proof theory results are still lacking for $\mu$LK+ (including cut-elimination and the completeness of focusing), we prove that $\mu$LK+ is consistent and that it contains Peano arithmetic. We also prove two conservativity results regarding $\mu$LK+ over $\mu$MALL.
We study the problem of learning to predict the next state of a dynamical system when the underlying evolution function is unknown. Unlike previous work, we place no parametric assumptions on the dynamical system, and study the problem from a learning theory perspective. We define new combinatorial measures and dimensions and show that they quantify the optimal mistake and regret bounds in the realizable and agnostic setting respectively.
The research presents the closed-form outage analysis of the newly presented $\alpha$-modification of the shadowed Beaulieu-Xie fading model for wireless communications. For the considered channel, the closed-form analytical expressions for the outage probability (including its upper and lower bounds), raw moments, amount of fading, and channel quality estimation indicator are derived. The carried out thorough numerical simulation and analysis demonstrates strong agreement with the presented closed-form solutions and illustrates the relationship between the outage probability and channel parameters.
We develop a general theory to optimize the frequentist regret for sequential learning problems, where efficient bandit and reinforcement learning algorithms can be derived from unified Bayesian principles. We propose a novel optimization approach to generate "algorithmic beliefs" at each round, and use Bayesian posteriors to make decisions. The optimization objective to create "algorithmic beliefs," which we term "Algorithmic Information Ratio," represents an intrinsic complexity measure that effectively characterizes the frequentist regret of any algorithm. To the best of our knowledge, this is the first systematical approach to make Bayesian-type algorithms prior-free and applicable to adversarial settings, in a generic and optimal manner. Moreover, the algorithms are simple and often efficient to implement. As a major application, we present a novel algorithm for multi-armed bandits that achieves the "best-of-all-worlds" empirical performance in the stochastic, adversarial, and non-stationary environments. And we illustrate how these principles can be used in linear bandits, bandit convex optimization, and reinforcement learning.
The study of nonplanar drawings of graphs with restricted crossing configurations is a well-established topic in graph drawing, often referred to as beyond-planar graph drawing. One of the most studied types of drawings in this area are the $k$-planar drawings $(k \geq 1)$, where each edge cannot cross more than $k$ times. We generalize $k$-planar drawings, by introducing the new family of min-$k$-planar drawings. In a min-$k$-planar drawing edges can cross an arbitrary number of times, but for any two crossing edges, one of the two must have no more than $k$ crossings. We prove a general upper bound on the number of edges of min-$k$-planar drawings, a finer upper bound for $k=3$, and tight upper bounds for $k=1,2$. Also, we study the inclusion relations between min-$k$-planar graphs (i.e., graphs admitting min-$k$-planar drawings) and $k$-planar graphs. In our setting we only allow simple drawings, that is, any two edges cross at most once, no two adjacent edges cross, and no three edges intersect at a common crossing point.
We introduce off-policy distributional Q($\lambda$), a new addition to the family of off-policy distributional evaluation algorithms. Off-policy distributional Q($\lambda$) does not apply importance sampling for off-policy learning, which introduces intriguing interactions with signed measures. Such unique properties distributional Q($\lambda$) from other existing alternatives such as distributional Retrace. We characterize the algorithmic properties of distributional Q($\lambda$) and validate theoretical insights with tabular experiments. We show how distributional Q($\lambda$)-C51, a combination of Q($\lambda$) with the C51 agent, exhibits promising results on deep RL benchmarks.
We provide a theoretical framework for Reinforcement Learning with Human Feedback (RLHF). Our analysis shows that when the true reward function is linear, the widely used maximum likelihood estimator (MLE) converges under both the Bradley-Terry-Luce (BTL) model and the Plackett-Luce (PL) model. However, we show that when training a policy based on the learned reward model, MLE fails while a pessimistic MLE provides policies with improved performance under certain coverage assumptions. Additionally, we demonstrate that under the PL model, the true MLE and an alternative MLE that splits the $K$-wise comparison into pairwise comparisons both converge. Moreover, the true MLE is asymptotically more efficient. Our results validate the empirical success of existing RLHF algorithms in InstructGPT and provide new insights for algorithm design. Furthermore, our results unify the problem of RLHF and max-entropy Inverse Reinforcement Learning (IRL), and provide the first sample complexity bound for max-entropy IRL.
An essential requirement of spanners in many applications is to be fault-tolerant: a $(1+\epsilon)$-spanner of a metric space is called (vertex) $f$-fault-tolerant ($f$-FT) if it remains a $(1+\epsilon)$-spanner (for the non-faulty points) when up to $f$ faulty points are removed from the spanner. Fault-tolerant (FT) spanners for Euclidean and doubling metrics have been extensively studied since the 90s. For low-dimensional Euclidean metrics, Czumaj and Zhao in SoCG'03 [CZ03] showed that the optimal guarantees $O(f n)$, $O(f)$ and $O(f^2)$ on the size, degree and lightness of $f$-FT spanners can be achieved via a greedy algorithm, which na\"{\i}vely runs in $O(n^3) \cdot 2^{O(f)}$ time. The question of whether the optimal bounds of [CZ03] can be achieved via a fast construction has remained elusive, with the lightness parameter being the bottleneck. Moreover, in the wider family of doubling metrics, it is not even clear whether there exists an $f$-FT spanner with lightness that depends solely on $f$ (even exponentially): all existing constructions have lightness $\Omega(\log n)$ since they are built on the net-tree spanner, which is induced by a hierarchical net-tree of lightness $\Omega(\log n)$. In this paper we settle in the affirmative these longstanding open questions. Specifically, we design a construction of $f$-FT spanners that is optimal with respect to all the involved parameters (size, degree, lightness and running time): For any $n$-point doubling metric, any $\epsilon > 0$, and any integer $1 \le f \le n-2$, our construction provides, within time $O(n \log n + f n)$, an $f$-FT $(1+\epsilon)$-spanner with size $O(f n)$, degree $O(f)$ and lightness $O(f^2)$.
The development of cubical type theory inspired the idea of "extension types" which has been found to have applications in other type theories that are unrelated to homotopy type theory or cubical type theory. This article describes these applications, including on records, metaprogramming, controlling unfolding, and some more exotic ones.
In the current work, we study the eigenvalue distribution results of a class of non-normal matrix-sequences which may be viewed as a low rank perturbation, depending on a parameter $\beta>1$, of the basic Toeplitz matrix-sequence $\{T_n(e^{\mathbf{i}\theta})\}_{n\in\mathbb{N}}$, $\mathbf{i}^2=-1$. The latter of which has obviously all eigenvalues equal to zero for any matrix order $n$, while for the matrix-sequence under consideration we will show a strong clustering on the complex unit circle. A detailed discussion on the outliers is also provided. The problem appears mathematically innocent, but it is indeed quite challenging since all the classical machinery for deducing the eigenvalue clustering does not cover the considered case. In the derivations, we resort to a trick used for the spectral analysis of the Google matrix plus several tools from complex analysis. We only mention that the problem is not an academic curiosity and in fact stems from problems in dynamical systems and number theory. Additionally, we also provide numerical experiments in high precision, a distribution analysis in the Weyl sense concerning both eigenvalues and singular values is given, and more results are sketched for the limit case of $\beta=1$
Resilience against stragglers is a critical element of prediction serving systems, tasked with executing inferences on input data for a pre-trained machine-learning model. In this paper, we propose NeRCC, as a general straggler-resistant framework for approximate coded computing. NeRCC includes three layers: (1) encoding regression and sampling, which generates coded data points, as a combination of original data points, (2) computing, in which a cluster of workers run inference on the coded data points, (3) decoding regression and sampling, which approximately recovers the predictions of the original data points from the available predictions on the coded data points. We argue that the overall objective of the framework reveals an underlying interconnection between two regression models in the encoding and decoding layers. We propose a solution to the nested regressions problem by summarizing their dependence on two regularization terms that are jointly optimized. Our extensive experiments on different datasets and various machine learning models, including LeNet5, RepVGG, and Vision Transformer (ViT), demonstrate that NeRCC accurately approximates the original predictions in a wide range of stragglers, outperforming the state-of-the-art by up to 23%.