Conjugate gradient is an efficient algorithm for solving large sparse linear systems. It has been used to accelerate computation in Bayesian analysis for many large-scale problems. This article discusses the applications of conjugate gradient in Bayesian computation, with a focus on sparse regression and spatial analysis. A self-contained introduction to conjugate gradient is provided to facilitate potential applications in a broader range of problems.
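As a minimal sketch of the method summarized above, the following Python/NumPy function implements the textbook conjugate gradient iteration for a symmetric positive definite system Ax = b. The function name, the default tolerance, and the ridge-type example matrix are illustrative assumptions and are not taken from the article.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive definite A (textbook CG sketch)."""
    n = b.shape[0]
    if max_iter is None:
        max_iter = 10 * n
    x = np.zeros(n)
    b_norm = np.linalg.norm(b)
    if b_norm == 0.0:
        return x                      # trivial system: x = 0
    r = b - A @ x                     # initial residual
    p = r.copy()                      # initial search direction
    rs_old = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs_old) <= tol * b_norm:
            break                     # relative residual is small enough
        Ap = A @ p
        alpha = rs_old / (p @ Ap)     # exact line search along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p # next A-conjugate direction
        rs_old = rs_new
    return x

# Hypothetical usage: a ridge-type matrix X^T X + lam * I, the kind of
# SPD system that appears in Gaussian conditionals of regression models.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
lam = 1.0
A = X.T @ X + lam * np.eye(10)
b = X.T @ rng.standard_normal(50)
x = conjugate_gradient(A, b)
```

The routine only touches A through matrix-vector products, which is why it suits large sparse systems: `A @ p` can be replaced by any function that applies the operator without forming it.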
There are many evaluation strategies for term rewrite systems, but proving termination automatically is usually easiest for innermost rewriting. Several syntactic criteria exist for when innermost termination implies full termination. We adapt these criteria to the probabilistic setting, e.g., we show when it suffices to analyze almost-sure termination (AST) w.r.t. innermost rewriting to prove full AST of probabilistic term rewrite systems (PTRSs). These criteria also apply to other notions of termination like positive AST. We implemented and evaluated our new contributions in the tool AProVE.
Artificial intelligence holds promise to improve materials discovery. GFlowNets are an emerging deep learning algorithm with many applications in AI-assisted discovery. By using GFlowNets, we generate porous reticular materials, such as metal-organic frameworks and covalent organic frameworks, for applications in carbon dioxide capture. We introduce a new Python package (matgfn) to train and sample GFlowNets. We use matgfn to generate the matgfn-rm dataset of novel and diverse reticular materials with gravimetric surface area above 5000 m$^2$/g. We calculate single- and two-component gas adsorption isotherms for the top-100 candidates in matgfn-rm. These candidates are novel compared to the state-of-the-art ARC-MOF dataset and rank in the 90th percentile in terms of working capacity compared to the CoRE2019 dataset. We discover 15 materials outperforming all materials in CoRE2019.
We consider a new splitting based on the Sherman-Morrison-Woodbury formula, which is particularly effective with iterative methods for the numerical solution of large linear systems. These systems involve matrices that are perturbations of circulant or block circulant matrices, which commonly arise in the discretization of differential equations using finite element or finite difference methods. We prove the convergence of the new iteration without making any assumptions regarding the symmetry or diagonal dominance of the matrix. To illustrate the efficacy of the new iteration, we present various applications. These include extensions of the new iteration to block matrices that arise in certain saddle point problems as well as two-dimensional finite difference discretizations. The new method exhibits fast convergence in all of the test cases we used. It has minimal storage requirements, a straightforward implementation, and compatibility with nearly circulant matrices via the Fast Fourier Transform. For these reasons it can be a valuable tool for the solution of various finite element and finite difference discretizations of differential equations.
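To make the two ingredients named above concrete, the following Python/NumPy sketch solves (C + U V^T) x = b, where C is circulant and U V^T is a low-rank perturbation, by combining FFT-based circulant solves with the Sherman-Morrison-Woodbury formula. This is the classical direct use of the formula, not the new splitting iteration proposed in the abstract; the function names and the small test problem are hypothetical.

```python
import numpy as np

def circulant_solve(c, Y):
    """Solve C X = Y, where C is the circulant matrix with first column c.

    C is diagonalized by the discrete Fourier transform, so the solve is a
    pointwise division by the eigenvalues fft(c). Assumes real data and
    that no eigenvalue of C is zero.
    """
    Y = np.asarray(Y, dtype=float)
    eig = np.fft.fft(c)
    if Y.ndim == 2:
        eig = eig[:, None]            # broadcast over the columns of Y
    return np.fft.ifft(np.fft.fft(Y, axis=0) / eig, axis=0).real

def smw_solve(c, U, V, b):
    """Solve (C + U V^T) x = b via Sherman-Morrison-Woodbury."""
    Cinv_b = circulant_solve(c, b)
    Cinv_U = circulant_solve(c, U)
    k = U.shape[1]
    S = np.eye(k) + V.T @ Cinv_U      # small k-by-k capacitance matrix
    return Cinv_b - Cinv_U @ np.linalg.solve(S, V.T @ Cinv_b)

# Hypothetical usage: a shifted periodic second-difference stencil
# perturbed by a rank-2 term.
n = 8
c = np.zeros(n)
c[0], c[1], c[-1] = 4.0, -1.0, -1.0
rng = np.random.default_rng(0)
U = 0.1 * rng.standard_normal((n, 2))
V = 0.1 * rng.standard_normal((n, 2))
b = rng.standard_normal(n)
x = smw_solve(c, U, V, b)
```

Only the first column of C and the factors U and V are stored, and each circulant solve costs O(n log n), which illustrates the low storage and FFT compatibility the abstract refers to.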
Anomaly detection is an important field that aims to identify unexpected patterns or data points, and it is closely related to many real-world problems, particularly to applications in finance, manufacturing, and cyber security. While anomaly detection has been studied extensively in various fields, detecting future anomalies before they occur remains unexplored territory. In this paper, we present a novel type of anomaly detection, called Precursor-of-Anomaly (PoA) detection. Unlike conventional anomaly detection, which focuses on determining whether a given time series observation is an anomaly or not, PoA detection aims to detect future anomalies before they happen. To solve both problems at the same time, we present a neural controlled differential equation-based neural network and its multi-task learning algorithm. We conduct experiments using 17 baselines and 3 datasets, including regular and irregular time series, and demonstrate that our presented method outperforms the baselines in almost all cases. Our ablation studies also indicate that the multitasking training method significantly enhances the overall performance for both anomaly and PoA detection.
In the literature on Kleene algebra, a number of variants have been proposed which impose additional structure specified by a theory, such as Kleene algebra with tests (KAT) and the recent Kleene algebra with observations (KAO), or make specific assumptions about certain constants, as for instance in NetKAT. Many of these variants fit within the unifying perspective offered by Kleene algebra with hypotheses, which comes with a canonical language model constructed from a given set of hypotheses. For the case of KAT, this model corresponds to the familiar interpretation of expressions as languages of guarded strings. A relevant question therefore is whether Kleene algebra together with a given set of hypotheses is complete with respect to its canonical language model. In this paper, we revisit, combine and extend existing results on this question to obtain tools for proving completeness in a modular way. We showcase these tools by giving new and modular proofs of completeness for KAT, KAO and NetKAT, and we prove completeness for new variants of KAT: KAT extended with a constant for the full relation, KAT extended with a converse operation, and a version of KAT where the collection of tests only forms a distributive lattice.
Although it has been demonstrated that Natural Language Processing (NLP) algorithms are vulnerable to deliberate attacks, the question of whether such weaknesses can lead to software security threats is under-explored. To bridge this gap, we conducted vulnerability tests on Text-to-SQL systems that are commonly used to create natural language interfaces to databases. We showed that the Text-to-SQL modules within six commercial applications can be manipulated to produce malicious code, potentially leading to data breaches and Denial of Service attacks. This is the first demonstration that NLP models can be exploited as attack vectors in the wild. In addition, experiments using four open-source language models verified that straightforward backdoor attacks on Text-to-SQL systems achieve a 100% success rate without affecting their performance. The aim of this work is to draw the community's attention to potential software security issues associated with NLP algorithms and encourage exploration of methods to mitigate them.
Data reduction is a fundamental challenge of modern technology, where classical statistical methods are not applicable because of computational limitations. We consider linear regression for an extraordinarily large number of observations, but only a few covariates. Subsampling aims at the selection of a given percentage of the existing original data. Under distributional assumptions on the covariates, we derive D-optimal subsampling designs and study their theoretical properties. We make use of fundamental concepts of optimal design theory and an equivalence theorem from constrained convex optimization. The resulting subsampling designs provide simple rules for whether to accept or reject a data point, allowing for an easy algorithmic implementation. In addition, we propose a simplified subsampling method that differs from the D-optimal design but requires less computing time. We present a simulation study comparing both subsampling schemes with the IBOSS method.
This work aims to numerically construct exactly commuting matrices close to given almost commuting ones, which is equivalent to the joint approximate diagonalization problem. We first prove that almost commuting matrices generically have approximate common eigenvectors that are almost orthogonal to each other. Based on this key observation, we propose a fast and robust vector-wise joint diagonalization (VJD) algorithm, which constructs the orthogonal similarity transform by sequentially finding these approximate common eigenvectors. In doing so, we consider sub-optimization problems over the unit sphere, for which we present a Riemannian quasi-Newton method with rigorous convergence analysis. We also discuss the numerical stability of the proposed VJD algorithm. Numerical examples with applications in independent component analysis are provided to reveal the relation with Huaxin Lin's theorem and to demonstrate that our method compares favorably with the state-of-the-art Jacobi-type joint diagonalization algorithm.
As artificial intelligence (AI) models continue to scale up, they are becoming more capable and integrated into various forms of decision-making systems. For models involved in moral decision-making, also known as artificial moral agents (AMA), interpretability provides a way to trust and understand the agent's internal reasoning mechanisms for effective use and error correction. In this paper, we provide an overview of this rapidly-evolving sub-field of AI interpretability, introduce the concept of the Minimum Level of Interpretability (MLI) and recommend an MLI for various types of agents, to aid their safe deployment in real-world settings.
Although measuring held-out accuracy has been the primary approach to evaluate generalization, it often overestimates the performance of NLP models, while alternative approaches for evaluating models either focus on individual tasks or on specific behaviors. Inspired by principles of behavioral testing in software engineering, we introduce CheckList, a task-agnostic methodology for testing NLP models. CheckList includes a matrix of general linguistic capabilities and test types that facilitate comprehensive test ideation, as well as a software tool to generate a large and diverse number of test cases quickly. We illustrate the utility of CheckList with tests for three tasks, identifying critical failures in both commercial and state-of-the-art models. In a user study, a team responsible for a commercial sentiment analysis model found new and actionable bugs in an extensively tested model. In another user study, NLP practitioners with CheckList created twice as many tests, and found almost three times as many bugs as users without it.