We introduce an extension of first-order logic that comes equipped with additional predicates for reasoning about an abstract state. Sequents in the logic comprise a main formula together with pre- and postconditions in the style of Hoare logic, and the axioms and rules of the logic ensure that the assertions about the state compose in the correct way. The main result of the paper is a realizability interpretation of our logic that extracts programs into a mixed functional/imperative language. All programs expressible in this language act on the state in a sequential manner, and we make this intuition precise by interpreting them in a semantic metatheory using the state monad. Our basic framework is very general, and our intention is that it can be instantiated and extended in a variety of different ways. We describe one such extension in detail, a monadic version of Heyting arithmetic with a well-founded while rule, and conclude by outlining several other directions for future work.
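To make the state-monad interpretation concrete, here is a minimal sketch in Python (the paper itself works in a semantic metatheory, not in code) of how stateful computations of type state -> (result, state) compose sequentially; the `State` class and the `tick` example are purely illustrative.

```python
# A stateful computation is a function state -> (result, new_state);
# `bind` threads the state through sequentially, which is exactly the
# sequential-access discipline such programs obey.

class State:
    def __init__(self, run):
        self.run = run  # run : s -> (a, s)

    def bind(self, f):
        # Run self, feed its result to f, thread the updated state on.
        def run(s):
            a, s1 = self.run(s)
            return f(a).run(s1)
        return State(run)

    @staticmethod
    def unit(a):
        return State(lambda s: (a, s))

# Example: `tick` returns the current counter and increments it.
tick = State(lambda s: (s, s + 1))

prog = tick.bind(lambda x: tick.bind(lambda y: State.unit(x + y)))
print(prog.run(0))  # (1, 2): results 0 + 1, final state 2
```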
We propose a secure compilation chain for statically verified partial programs with input-output (IO). The source language is an F* subset in which a verified IO-performing program interacts with its IO-performing context via a higher-order interface that includes refinement types as well as pre- and post-conditions about past IO events. The target language is a smaller F* subset in which the compiled program is linked with an adversarial context via an interface without refinement types or pre- and post-conditions. To bridge this interface gap and make compilation and linking secure, we propose a novel combination of higher-order contracts and reference monitoring for recording and controlling IO operations. During compilation we use contracts to convert the logical assumptions the program makes about the context into dynamic checks on each context-program boundary crossing. These boundary checks can depend on information about past IO events stored in the monitor's state, yet they cannot stop the adversarial target context before it performs dangerous IO operations. So, additionally, our linking forces the context to perform all IO via a secure IO library that uses reference monitoring to dynamically enforce an access control policy before each IO operation. We propose a novel way to model in F* that the context cannot directly access the IO operations and the monitor's internal state, based on F*'s recent support for flag-based effect polymorphism. We prove in F* that enforcing the access control policy on the context in combination with static verification of the program soundly enforces a global trace property. Moreover, we prove in F* that our secure compilation chain satisfies, by construction, Robust Relational Hyperproperty Preservation, a very strong secure compilation criterion. Finally, we illustrate our secure compilation chain at work on a simple web server example.
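As an illustration of the two dynamic mechanisms combined here, the following Python sketch (hypothetical, and far removed from the paper's F* development) shows an IO wrapper that consults monitor state before each operation and a contract check at a boundary crossing; the policy and the precondition are made-up examples.

```python
# Hypothetical sketch of the two mechanisms: (1) reference monitoring,
# where an IO wrapper records events and enforces an access-control
# policy *before* each operation, and (2) a higher-order contract that
# turns a static assumption about the context into a dynamic check.

trace = []  # monitor state: the history of past IO events

def monitored_open(path):
    # Illustrative policy: the context may not touch /etc/.
    if path.startswith("/etc/"):
        raise PermissionError(f"policy forbids opening {path}")
    trace.append(("open", path))  # record the event in the monitor
    return open(path)

def with_contract(ctx_callback):
    # Wrap a context-supplied callback so every program/context
    # boundary crossing re-checks the program's assumption about it.
    def checked(arg):
        if not (isinstance(arg, int) and arg >= 0):
            raise ValueError("contract violated at boundary crossing")
        return ctx_callback(arg)
    return checked
```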
We study the secretary problem in which rank-ordered lists are generated by the Mallows model and the goal is to identify the highest-ranked candidate through a sequential interview process that does not allow rejected candidates to be revisited. The main difference between our formulation and existing models is that, during the selection process, we are given a fixed number of opportunities to query an infallible expert about whether the current candidate is the highest-ranked or not. If the response is positive, the selection process terminates; otherwise, the search continues until a new potentially optimal candidate is identified. Our optimal interview strategy, as well as the expected number of candidates interviewed and the expected number of queries used, can be determined through the evaluation of well-defined recurrence relations. Specifically, if we are allowed to query $s-1$ times and to make a final selection without querying (thus making $s$ selections in total), then the optimum scheme is characterized by $s$ thresholds that depend on the parameter $\theta$ of the Mallows distribution but are independent of the maximum number of queries.
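The recurrence relations themselves are in the paper; as a rough illustration of the threshold structure, here is a Monte Carlo sketch in Python of an $s$-threshold strategy under a uniform random permutation (the Mallows model at $\theta = 1$); the threshold values passed in are experimental inputs, not the paper's optimal ones.

```python
import random

def simulate(n, thresholds, trials=20_000):
    """Monte Carlo sketch: after passing threshold i, we present the
    next left-to-right maximum; the first len(thresholds)-1 of these
    are expert queries, the last is the final, unqueried selection."""
    wins = 0
    for _ in range(trials):
        perm = random.sample(range(n), n)          # uniform permutation
        best = max(range(n), key=lambda i: perm[i])
        q, best_so_far = 0, -1
        for i in range(n):
            if perm[i] <= best_so_far:
                continue                           # not potentially optimal
            best_so_far = perm[i]
            if q < len(thresholds) and i >= thresholds[q]:
                q += 1
                if i == best:                      # expert confirms (or
                    wins += 1                      # final pick is right)
                    break
                if q == len(thresholds):
                    break                          # final pick was wrong
    return wins / trials

# e.g. one expert query plus a final selection (s = 2), n = 100:
print(simulate(100, [25, 60]))
```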
Abstraction is a commonly used process for representing a low-level system by a coarser specification, with the goal of omitting unnecessary details while preserving important aspects. While recent work on abstraction in the situation calculus has focused on non-probabilistic domains, we describe an approach to the abstraction of probabilistic and dynamic systems. Based on a variant of the situation calculus with probabilistic belief, we define a notion of bisimulation that allows a detailed probabilistic basic action theory with noisy actuators and sensors to be abstracted by a possibly non-stochastic basic action theory. By doing so, we obtain abstract Golog programs that omit unnecessary details and that can be translated back to a detailed program for actual execution. This simplifies the implementation of noisy robot programs, opens up the possibility of using non-stochastic reasoning methods (e.g., planning) on probabilistic problems, and provides domain descriptions that are more easily understandable and explainable.
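Very roughly, the translation back from an abstract program to a detailed one can be pictured as a refinement mapping from abstract actions to noisy low-level programs; the Python sketch below is a toy stand-in for this idea, with entirely hypothetical action names (the paper works in the situation calculus, not in code).

```python
# Toy stand-in for the abstraction idea: each abstract, noise-free
# action refines into a detailed program over noisy actuators and
# sensors, so an abstract Golog-style program can be translated back
# into an executable detailed one.  All names are hypothetical.

refinement = {
    "goto(kitchen)": ["start_motors", "noisy_move", "sense_position",
                      "noisy_move", "stop_motors"],
    "pickup(cup)":   ["noisy_grasp", "sense_grasp"],
}

def translate(abstract_program):
    """Flatten an abstract program into its detailed counterpart."""
    return [step for action in abstract_program
                 for step in refinement[action]]

print(translate(["goto(kitchen)", "pickup(cup)"]))
```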
Machine Learning (ML) has been widely used for modeling and predicting physical systems. These techniques offer high expressive power and good generalizability for interpolation within observed data sets. However, the disadvantage of black-box models is that they underperform under blind conditions, since no physical knowledge is incorporated. Physics-based ML aims to address this problem by retaining the mathematical flexibility of ML techniques while incorporating physics. Accordingly, this paper proposes to embed mechanics-based models into the mean function of a Gaussian Process (GP) model and to characterize potential discrepancies through kernel machines. A specific class of kernel functions is promoted, one connected to the gradient of the physics-based model with respect to the input and parameters, and similar to the exact autocovariance function of linear dynamical systems. The spectral properties of the kernel function make it possible to consider dominant periodic processes originating from physics misspecification. However, the stationarity of the kernel function is a difficult hurdle in the sequential processing of long data sets, which we resolve through hierarchical Bayesian techniques. This implementation also mitigates computational costs, alleviating the poor scalability of GPs when dealing with sequential data. Using numerical and experimental examples, potential applications of the proposed method to structural dynamics inverse problems are demonstrated.
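A minimal numerical sketch of the central construction: GP regression whose prior mean is a physics-based model $m(x)$ and whose kernel absorbs the discrepancy. The placeholder "physics" model, the quasi-periodic kernel, and all hyperparameters below are illustrative, not the paper's choices.

```python
import numpy as np

def physics_mean(x, k=1.0):
    # Placeholder mechanics-based model m(x); a real use would plug in
    # the mechanistic model of the structure here.
    return np.sin(k * x)

def kernel(x1, x2, ell=0.5, period=2.0):
    d = x1[:, None] - x2[None, :]
    # Quasi-periodic kernel: captures dominant periodic discrepancy.
    return np.exp(-2 * np.sin(np.pi * d / period) ** 2 / ell ** 2)

def gp_posterior_mean(X, y, Xs, noise=1e-2):
    K = kernel(X, X) + noise * np.eye(len(X))
    Ks = kernel(Xs, X)
    # GP regression on the residual y - m(X); add the mean back at Xs.
    alpha = np.linalg.solve(K, y - physics_mean(X))
    return physics_mean(Xs) + Ks @ alpha

X = np.linspace(0, 10, 50)
y = np.sin(1.1 * X) + 0.05 * np.random.randn(50)  # true system != model
print(gp_posterior_mean(X, y, np.array([2.5, 7.5])))
```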
Recently, Graph Neural Networks (GNNs) have significantly advanced the performance of machine learning tasks on graphs. However, this technological breakthrough makes people wonder: how does a GNN make such decisions, and can we trust its predictions with high confidence? When it comes to critical fields such as biomedicine, where wrong decisions can have severe consequences, it is crucial to interpret the inner working mechanisms of GNNs before applying them. In this paper, we propose GNNInterpreter, a model-agnostic, model-level explanation method for GNNs that follow the message-passing scheme, to explain the high-level decision-making process of a GNN model. More specifically, by optimizing a novel objective function designed specifically for model-level explanation of GNNs, GNNInterpreter learns a probabilistic generative graph distribution that produces the most discriminative graph pattern the GNN tries to detect when making a certain prediction. Compared to existing works, GNNInterpreter is more flexible and computationally efficient in generating explanation graphs with different types of node and edge features, without introducing another black box or requiring manually specified domain-specific rules. In addition, experimental studies conducted on four different datasets demonstrate that the explanation graphs generated by GNNInterpreter match the desired graph pattern if the model is ideal; otherwise, potential model pitfalls can be revealed by the explanation.
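In spirit, the method optimizes the parameters of a generative graph distribution so that sampled graphs maximize the frozen GNN's score for a target class. The PyTorch sketch below uses a binary-concrete edge relaxation and a toy differentiable stand-in for the GNN; the `gnn` function and penalty weight are placeholders, not the paper's objective.

```python
import torch

n = 6
theta = torch.zeros(n, n, requires_grad=True)    # edge logits to learn

def gnn(adj):
    # Placeholder for a frozen, differentiable message-passing model's
    # class logit; a real use would plug in the trained GNN here.
    return (adj * torch.eye(n)).sum()

opt = torch.optim.Adam([theta], lr=0.1)
for step in range(200):
    u = torch.rand(n, n).clamp(1e-6, 1 - 1e-6)
    # Binary-concrete relaxation: a differentiable Bernoulli sample.
    adj = torch.sigmoid((theta + torch.log(u) - torch.log(1 - u)) / 0.5)
    size_penalty = 0.01 * torch.sigmoid(theta).sum()
    loss = -gnn(adj) + size_penalty              # maximize class score
    opt.zero_grad(); loss.backward(); opt.step()

explanation = (torch.sigmoid(theta) > 0.5).int() # discriminative pattern
print(explanation)
```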
We study the computational expressivity of proof systems with fixed point operators, within the `proofs-as-programs' paradigm. We start with a calculus $\mu\mathsf{LJ}$ (due to Clairambault) that extends intuitionistic logic by least and greatest positive fixed points. Formulated in the sequent calculus, $\mu\mathsf{LJ}$ admits a standard extension to a `circular' calculus $\mathsf{C}\mu\mathsf{LJ}$. Our main result is that, perhaps surprisingly, both $\mu\mathsf{LJ}$ and $\mathsf{C}\mu\mathsf{LJ}$ represent the same first-order functions: those provably total in $\Pi^1_2$-$\mathsf{CA}_0$, a subsystem of second-order arithmetic beyond the `big five' of reverse mathematics and one of the strongest theories for which we have an ordinal analysis (due to Rathjen). This settles several questions in the literature on the computational strength of (circular) proof systems with fixed points. For the lower bound we give a realisability interpretation from an extension of Peano Arithmetic by fixed points that has been shown to be arithmetically equivalent to $\Pi^1_2$-$\mathsf{CA}_0$ (due to M\"ollerfeld). For the upper bound we construct a novel computability model in order to give a totality argument for circular proofs with fixed points. In fact, we formalise this argument itself within $\Pi^1_2$-$\mathsf{CA}_0$ in order to obtain the tight bounds we are after. Along the way we develop some novel reverse mathematics for the Knaster-Tarski fixed point theorem.
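For the Knaster-Tarski theme in the final sentence: on a finite lattice, the least fixed point of a monotone operator is reached by iterating from the bottom element, as in this small Python sketch (the operator, a one-step reachability closure over a toy edge relation, is just an illustrative monotone map).

```python
# Knaster-Tarski on a finite powerset lattice: the least fixed point
# of a monotone operator F is the limit of iterating F from bottom.

edges = {(0, 1), (1, 2), (2, 3)}

def F(S):
    # Monotone operator: one step of reachability closure.
    return edges | {(a, c) for (a, b) in S for (b2, c) in edges if b == b2}

S = frozenset()               # bottom of the lattice
while True:
    T = frozenset(F(S))
    if T == S:                # fixed point reached: F(S) = S
        break
    S = T

print(sorted(S))              # least fixed point: the transitive closure
```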
Choreographic programming is a paradigm where developers write the global specification (called a choreography) of a communicating system, and then a correct-by-construction distributed implementation is compiled automatically. Unfortunately, it is possible to write choreographies that cannot be compiled, because of issues related to an agreement property known as knowledge of choice. This forces programmers to reason manually about implementation details that may be orthogonal to the protocol that they are writing. Amendment is an automatic procedure for repairing uncompilable choreographies. We present a formalisation of amendment from the literature, built upon an existing formalisation of choreographic programming. However, in the process of formalising the expected properties of this procedure, we discovered a subtle counterexample that invalidates the original published and peer-reviewed pen-and-paper theory. We discuss how using a theorem prover led us both to find the issue and to state and prove a correct formulation of the properties of amendment.
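To convey what amendment does (not the paper's formalisation), here is a toy Python sketch on a hypothetical choreography AST: at each conditional, every process occurring in a branch, and hence possibly behaving differently across the two branches, receives a selection message communicating the choice.

```python
# Toy sketch of amendment on a hypothetical choreography AST with four
# forms: "end", ("com", p, q, cont), ("sel", p, q, label, cont), and
# ("cond", p, then_c, else_c).  Amendment inserts selections from the
# deciding process p to every other process occurring in a branch, so
# that knowledge of choice holds (an over-approximating repair).

def processes(c):
    if c == "end":
        return set()
    if c[0] in ("com", "sel"):
        return {c[1], c[2]} | processes(c[-1])
    _, p, t, e = c
    return {p} | processes(t) | processes(e)

def amend(c):
    if c == "end":
        return c
    if c[0] == "com":
        _, p, q, k = c
        return ("com", p, q, amend(k))
    if c[0] == "sel":
        _, p, q, l, k = c
        return ("sel", p, q, l, amend(k))
    _, p, t, e = c
    t, e = amend(t), amend(e)
    for q in (processes(t) | processes(e)) - {p}:
        t = ("sel", p, q, "L", t)   # notify q: then-branch taken
        e = ("sel", p, q, "R", e)   # notify q: else-branch taken
    return ("cond", p, t, e)

c = ("cond", "buyer", ("com", "seller", "buyer", "end"), "end")
print(amend(c))
```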
We adopt an information-theoretic framework to analyze the generalization behavior of the class of iterative, noisy learning algorithms. This class is particularly suitable for study under information-theoretic metrics as the algorithms are inherently randomized, and it includes commonly used algorithms such as Stochastic Gradient Langevin Dynamics (SGLD). Herein, we use the maximal leakage (equivalently, the Sibson mutual information of order infinity) metric, as it is simple to analyze and implies bounds both on the probability of a large generalization error and on its expected value. We show that, if the update function (e.g., gradient) is bounded in $L_2$-norm, then adding isotropic Gaussian noise leads to optimal generalization bounds: indeed, the input and output of the learning algorithm in this case are asymptotically statistically independent. Furthermore, we demonstrate how the assumptions on the update function affect the optimal (in the sense of minimizing the induced maximal leakage) choice of the noise. Finally, we compute explicit tight upper bounds on the induced maximal leakage for several scenarios of interest.
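For concreteness, the class of algorithms in question looks like the following Python sketch: an iterative learner whose update is clipped to an $L_2$ ball of radius $L$ and perturbed with isotropic Gaussian noise, as in SGLD. The loss, clipping radius, and noise scale are illustrative choices, not the paper's optimal ones.

```python
import numpy as np

def noisy_iterative_learner(grad, w0, steps, lr=0.1, L=1.0, sigma=0.5):
    w = w0.copy()
    for t in range(steps):
        g = grad(w)
        norm = np.linalg.norm(g)
        if norm > L:                       # enforce the bounded-update
            g = g * (L / norm)             # assumption ||g||_2 <= L
        # SGLD-style step: gradient descent plus isotropic Gaussian noise
        w = w - lr * g + sigma * np.sqrt(lr) * np.random.randn(*w.shape)
    return w

# Quadratic toy objective; its gradient is 2(w - 1).
w = noisy_iterative_learner(lambda w: 2 * (w - 1.0), np.zeros(3), steps=200)
print(w)
```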
Machine-learning models have demonstrated great success in learning complex patterns that enable them to make predictions about unobserved data. In addition to using models for prediction, the ability to interpret what a model has learned is receiving an increasing amount of attention. However, this increased focus has led to considerable confusion about the notion of interpretability. In particular, it is unclear how the wide array of proposed interpretation methods are related, and what common concepts can be used to evaluate them. We aim to address these concerns by defining interpretability in the context of machine learning and introducing the Predictive, Descriptive, Relevant (PDR) framework for discussing interpretations. The PDR framework provides three overarching desiderata for evaluation: predictive accuracy, descriptive accuracy and relevancy, with relevancy judged relative to a human audience. Moreover, to help manage the deluge of interpretation methods, we introduce a categorization of existing techniques into model-based and post-hoc categories, with sub-groups including sparsity, modularity and simulatability. To demonstrate how practitioners can use the PDR framework to evaluate and understand interpretations, we provide numerous real-world examples. These examples highlight the often under-appreciated role played by human audiences in discussions of interpretability. Finally, based on our framework, we discuss limitations of existing methods and directions for future work. We hope that this work will provide a common vocabulary that will make it easier for both practitioners and researchers to discuss and choose from the full range of interpretation methods.
Multi-relation Question Answering is a challenging task, due to the need for elaborate analysis of questions and reasoning over multiple fact triples in a knowledge base. In this paper, we present a novel model called the Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis, thereby allowing manual manipulation in predicting the final answer.
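A schematic PyTorch sketch of hop-by-hop reasoning in this spirit: at each hop, attend to part of the question, predict a relation over the KB's relation vocabulary, and fold the predicted relation back into the question state. All dimensions, modules, and the subtraction-style update are placeholders, not the paper's architecture.

```python
import torch

d, n_rel, hops = 64, 20, 3
rel_emb = torch.nn.Embedding(n_rel, d)
attn = torch.nn.Linear(d, 1)
rel_scorer = torch.nn.Linear(d, n_rel)

def reason(question_tokens):                 # (seq_len, d) token vectors
    state = question_tokens.mean(dim=0)      # initial question state
    predicted = []
    for hop in range(hops):
        # Attend to the part of the question relevant at this hop.
        a = torch.softmax(attn(question_tokens + state).squeeze(-1), dim=0)
        focus = (a[:, None] * question_tokens).sum(dim=0)
        # Predict a relation from the current parse, record it, and
        # subtract its embedding so the next hop analyzes what remains.
        r = rel_scorer(focus).argmax()
        predicted.append(int(r))
        state = state - rel_emb(r)
    return predicted                         # traceable intermediate steps

print(reason(torch.randn(8, d)))
```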