We propose reinforcement learning to control the dynamical self-assembly of the dodecagonal quasicrystal (DDQC) from patchy particles. The patchy particles have anisotropic interactions with other particles and form DDQC. However, their structures at steady states are significantly influenced by the kinetic pathways of their structural formation. We estimate the best policy of temperature control trained by the Q-learning method and demonstrate that we can generate DDQC with few defects using the estimated policy. The temperature schedule obtained by reinforcement learning can reproduce the desired structure more efficiently than the conventional pre-fixed temperature schedule, such as annealing. To clarify the success of the learning, we also analyse a simple model describing the kinetics of structural changes through the motion in a triple-well potential. We have found that reinforcement learning autonomously discovers the critical temperature at which structural fluctuations enhance the chance of forming a globally stable state. The estimated policy guides the system toward the critical temperature to assist the formation of DDQC.
Conformal inference is a fundamental and versatile tool that provides distribution-free guarantees for many machine learning tasks. We consider the transductive setting, where decisions are made on a test sample of $m$ new points, giving rise to $m$ conformal $p$-values. {While classical results only concern their marginal distribution, we show that their joint distribution follows a P\'olya urn model, and establish a concentration inequality for their empirical distribution function.} The results hold for arbitrary exchangeable scores, including {\it adaptive} ones that can use the covariates of the test+calibration samples at training stage for increased accuracy. We demonstrate the usefulness of these theoretical results through uniform, in-probability guarantees for two machine learning tasks of current interest: interval prediction for transductive transfer learning and novelty detection based on two-class classification.
The processing and analysis of computed tomography (CT) imaging is important for both basic scientific development and clinical applications. In AutoCT, we provide a comprehensive pipeline that integrates an end-to-end automatic preprocessing, registration, segmentation, and quantitative analysis of 3D CT scans. The engineered pipeline enables atlas-based CT segmentation and quantification leveraging diffeomorphic transformations through efficient forward and inverse mappings. The extracted localized features from the deformation field allow for downstream statistical learning that may facilitate medical diagnostics. On a lightweight and portable software platform, AutoCT provides a new toolkit for the CT imaging community to underpin the deployment of artificial intelligence-driven applications.
Bayesian optimal design of experiments is a well-established approach to planning experiments. Briefly, a probability distribution, known as a statistical model, for the responses is assumed which is dependent on a vector of unknown parameters. A utility function is then specified which gives the gain in information for estimating the true value of the parameters using the Bayesian posterior distribution. A Bayesian optimal design is given by maximising the expectation of the utility with respect to the joint distribution given by the statistical model and prior distribution for the true parameter values. The approach takes account of the experimental aim via specification of the utility and of all assumed sources of uncertainty via the expected utility. However, it is predicated on the specification of the statistical model. Recently, a new type of statistical inference, known as Gibbs (or General Bayesian) inference, has been advanced. This is Bayesian-like, in that uncertainty on unknown quantities is represented by a posterior distribution, but does not necessarily rely on specification of a statistical model. Thus the resulting inference should be less sensitive to misspecification of the statistical model. The purpose of this paper is to propose Gibbs optimal design: a framework for optimal design of experiments for Gibbs inference. The concept behind the framework is introduced along with a computational approach to find Gibbs optimal designs in practice. The framework is demonstrated on exemplars including linear models, and experiments with count and time-to-event responses.
In models of opinion dynamics, many parameters -- either in the form of constants or in the form of functions -- play a critical role in describing, calibrating, and forecasting how opinions change with time. When examining a model of opinion dynamics, it is beneficial to infer its parameters using empirical data. In this paper, we study an example of such an inference problem. We consider a mean-field bounded-confidence model with an unknown interaction kernel between individuals. This interaction kernel encodes how individuals with different opinions interact and affect each other's opinions. Because it is often difficult to quantitatively measure opinions as empirical data from observations or experiments, we assume that the available data takes the form of partial observations of a cumulative distribution function of opinions. We prove that certain measurements guarantee a precise and unique inference of the interaction kernel and propose a numerical method to reconstruct an interaction kernel from a limited number of data points. Our numerical results suggest that the error of the inferred interaction kernel decays exponentially as we strategically enlarge the data set.
Reinforcement learning suffers from limitations in real practices primarily due to the numbers of required interactions with virtual environments. It results in a challenging problem that we are implausible to obtain an optimal strategy only with a few attempts for many learning method. Hereby, we design an improved reinforcement learning method based on model predictive control that models the environment through a data-driven approach. Based on learned environmental model, it performs multi-step prediction to estimate the value function and optimize the policy. The method demonstrates higher learning efficiency, faster convergent speed of strategies tending to the optimal value, and fewer sample capacity space required by experience replay buffers. Experimental results, both in classic databases and in a dynamic obstacle avoidance scenario for unmanned aerial vehicle, validate the proposed approaches.
Structured reinforcement learning leverages policies with advantageous properties to reach better performance, particularly in scenarios where exploration poses challenges. We explore this field through the concept of orchestration, where a (small) set of expert policies guides decision-making; the modeling thereof constitutes our first contribution. We then establish value-functions regret bounds for orchestration in the tabular setting by transferring regret-bound results from adversarial settings. We generalize and extend the analysis of natural policy gradient in Agarwal et al. [2021, Section 5.3] to arbitrary adversarial aggregation strategies. We also extend it to the case of estimated advantage functions, providing insights into sample complexity both in expectation and high probability. A key point of our approach lies in its arguably more transparent proofs compared to existing methods. Finally, we present simulations for a stochastic matching toy model.
Feature attribution is a fundamental task in both machine learning and data analysis, which involves determining the contribution of individual features or variables to a model's output. This process helps identify the most important features for predicting an outcome. The history of feature attribution methods can be traced back to General Additive Models (GAMs), which extend linear regression models by incorporating non-linear relationships between dependent and independent variables. In recent years, gradient-based methods and surrogate models have been applied to unravel complex Artificial Intelligence (AI) systems, but these methods have limitations. GAMs tend to achieve lower accuracy, gradient-based methods can be difficult to interpret, and surrogate models often suffer from stability and fidelity issues. Furthermore, most existing methods do not consider users' contexts, which can significantly influence their preferences. To address these limitations and advance the current state-of-the-art, we define a novel feature attribution framework called Context-Aware Feature Attribution Through Argumentation (CA-FATA). Our framework harnesses the power of argumentation by treating each feature as an argument that can either support, attack or neutralize a prediction. Additionally, CA-FATA formulates feature attribution as an argumentation procedure, and each computation has explicit semantics, which makes it inherently interpretable. CA-FATA also easily integrates side information, such as users' contexts, resulting in more accurate predictions.
Most state-of-the-art machine learning techniques revolve around the optimisation of loss functions. Defining appropriate loss functions is therefore critical to successfully solving problems in this field. We present a survey of the most commonly used loss functions for a wide range of different applications, divided into classification, regression, ranking, sample generation and energy based modelling. Overall, we introduce 33 different loss functions and we organise them into an intuitive taxonomy. Each loss function is given a theoretical backing and we describe where it is best used. This survey aims to provide a reference of the most essential loss functions for both beginner and advanced machine learning practitioners.
Graph-centric artificial intelligence (graph AI) has achieved remarkable success in modeling interacting systems prevalent in nature, from dynamical systems in biology to particle physics. The increasing heterogeneity of data calls for graph neural architectures that can combine multiple inductive biases. However, combining data from various sources is challenging because appropriate inductive bias may vary by data modality. Multimodal learning methods fuse multiple data modalities while leveraging cross-modal dependencies to address this challenge. Here, we survey 140 studies in graph-centric AI and realize that diverse data types are increasingly brought together using graphs and fed into sophisticated multimodal models. These models stratify into image-, language-, and knowledge-grounded multimodal learning. We put forward an algorithmic blueprint for multimodal graph learning based on this categorization. The blueprint serves as a way to group state-of-the-art architectures that treat multimodal data by choosing appropriately four different components. This effort can pave the way for standardizing the design of sophisticated multimodal architectures for highly complex real-world problems.
Deep learning is usually described as an experiment-driven field under continuous criticizes of lacking theoretical foundations. This problem has been partially fixed by a large volume of literature which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized in six groups: (1) complexity and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamic systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drives the trajectories of the dynamic systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns in ethics and security and their relationships with generalizability.