Successfully addressing a wide variety of tasks is a core ability of autonomous agents, requiring flexibly adapting the underlying decision-making strategies and, as we argue in this work, also adapting the perception modules. An analogous mechanism is found in the human visual system, which uses top-down signals to focus attention as determined by the current task. Similarly, we adapt pre-trained large vision models conditioned on specific downstream tasks in the context of multi-task policy learning. We introduce task-conditioned adapters that do not require finetuning any pre-trained weights, combined with a single policy trained with behavior cloning and capable of addressing multiple tasks. We condition the visual adapters on task embeddings, which can be selected at inference if the task is known, or alternatively inferred from a set of example demonstrations. To this end, we propose a new optimization-based estimator. We evaluate the method on a wide variety of tasks from the CortexBench benchmark and show that, in contrast to existing work, they can all be addressed with a single policy. In particular, we demonstrate that adapting visual features is a key design choice and that the method generalizes to unseen tasks given a few demonstrations.
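To make the adapter idea concrete, here is a minimal PyTorch sketch of a bottleneck adapter whose hidden activations are modulated by a task embedding via FiLM-style conditioning. The abstract does not specify the actual adapter architecture, dimensions, or conditioning scheme, so all of those are assumptions here; only the general pattern (frozen backbone, small task-conditioned residual module) matches the described approach.

```python
import torch
import torch.nn as nn

class TaskConditionedAdapter(nn.Module):
    """Hypothetical bottleneck adapter conditioned on a task embedding.

    FiLM-style conditioning and all dimensions are illustrative assumptions;
    the pre-trained backbone stays frozen, only the adapter is trained.
    """

    def __init__(self, feat_dim: int = 768, bottleneck: int = 64, task_dim: int = 32):
        super().__init__()
        self.down = nn.Linear(feat_dim, bottleneck)
        self.up = nn.Linear(bottleneck, feat_dim)
        # The task embedding predicts a per-channel scale and shift.
        self.film = nn.Linear(task_dim, 2 * bottleneck)

    def forward(self, feats: torch.Tensor, task_emb: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.down(feats))
        scale, shift = self.film(task_emb).chunk(2, dim=-1)
        h = h * (1 + scale) + shift          # condition on the task
        return feats + self.up(h)            # residual around frozen features

# Frozen pre-trained features plus a trainable adapter only.
backbone_feats = torch.randn(8, 768)         # e.g. from a frozen vision model
task_emb = torch.randn(8, 32)                # selected or inferred per task
adapter = TaskConditionedAdapter()
adapted = adapter(backbone_feats, task_emb)  # shape (8, 768)
```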
The calibration of constitutive models from full-field data has recently gained increasing interest due to improvements in full-field measurement capabilities. In addition to the experimental characterization of novel materials, continuous structural health monitoring is another application of great interest. However, monitoring is usually subject to severe time constraints that are difficult to meet with standard numerical approaches. Therefore, we investigate parametric physics-informed neural networks (PINNs) for constitutive model calibration from full-field displacement data. In an offline stage, a parametric PINN is trained to learn a parameterized solution of the underlying partial differential equation. In the subsequent online stage, the parametric PINN then acts as a surrogate for the parameters-to-state map in calibration. We test the proposed approach on the deterministic least-squares calibration of a linear elastic as well as a hyperelastic constitutive model from noisy synthetic displacement data. We further carry out Markov chain Monte Carlo-based Bayesian inference to quantify the uncertainty. A proper statistical evaluation of the results underlines the high accuracy of the deterministic calibration and the validity of the estimated uncertainty. Finally, we consider experimental data and show that the results are in good agreement with a finite element method-based calibration. Due to the fast evaluation of PINNs, calibration can be performed in near real-time. This advantage is particularly evident in many-query applications such as Markov chain Monte Carlo-based Bayesian inference.
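A minimal sketch of the online calibration stage may help fix ideas: a trained parametric PINN maps material parameters and sensor coordinates to displacements, and deterministic calibration reduces to a least-squares fit against measured data. The function `pinn_displacement` below is a toy closed-form stand-in for such a surrogate, and all parameter values are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares

# Toy stand-in for a trained parametric PINN surrogate: it maps material
# parameters theta and sensor coordinates x to displacements. The closed
# form below is purely illustrative, not a real constitutive model.
def pinn_displacement(theta, x):
    stiffness, poisson = theta               # dimensionless toy parameters
    return x / stiffness + poisson * np.sin(np.pi * x)

x_sensors = np.linspace(0.1, 1.0, 50)
theta_true = np.array([2.1, 0.3])
u_obs = pinn_displacement(theta_true, x_sensors)
u_obs += np.random.default_rng(0).normal(0, 1e-3, u_obs.shape)  # noisy data

# Online stage: deterministic least-squares calibration against the surrogate.
def residuals(theta):
    return pinn_displacement(theta, x_sensors) - u_obs

fit = least_squares(residuals, x0=np.array([1.0, 0.1]))
print("estimated parameters:", fit.x)        # should be close to theta_true
```

Because each surrogate evaluation is just a forward pass, the same residual function can be called thousands of times inside an MCMC sampler, which is where the near real-time advantage noted above becomes most visible.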
In class-incremental learning, an agent with limited resources needs to learn a sequence of classification tasks, forming an ever-growing classification problem, under the constraint of not being able to access data from previous tasks. The main difference from task-incremental learning, where a task ID is available at inference time, is that the learner also needs to perform cross-task discrimination, i.e., distinguish between classes that have not been seen together. Approaches to tackle this problem are numerous and mostly make use of an external memory (buffer) of non-negligible size. In this paper, we ablate the learning of cross-task features and study its influence on the performance of basic replay strategies used for class-IL. We also define a new forgetting measure for class-incremental learning, and find that forgetting is not the principal cause of low performance. Our experimental results show that future algorithms for class-incremental learning should not only prevent forgetting, but also aim to improve the quality of the cross-task features and the knowledge transfer between tasks. This is especially important when tasks contain a limited amount of data.
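The new class-IL forgetting measure is defined in the paper itself, not in this abstract. Purely for orientation, the sketch below computes the common baseline notion of forgetting (the drop from each task's best observed accuracy to its final accuracy); it is the kind of quantity the discussion above is concerned with, not the paper's new measure.

```python
import numpy as np

# acc[i, j]: accuracy on task j after training on task i (lower triangle filled).
def average_forgetting(acc: np.ndarray) -> float:
    """Standard baseline forgetting: mean drop from each task's best accuracy."""
    n = acc.shape[0]
    drops = [acc[:n - 1, j].max() - acc[n - 1, j] for j in range(n - 1)]
    return float(np.mean(drops))

acc = np.array([
    [0.90, 0.00, 0.00],
    [0.70, 0.88, 0.00],
    [0.65, 0.75, 0.85],
])
print(average_forgetting(acc))  # mean accuracy drop over the first two tasks
```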
The main objective of this work is to explore the possibility of incorporating radiomic information from multiple lesions into survival models. We hypothesise that when more lesions are present, their inclusion can improve model performance, and we aim to find an optimal strategy for using multiple distinct regions in modelling. The idea of using multiple regions of interest (ROIs) to extract radiomic features for predictive models has been implemented in many recent works. However, in almost all studies, analogous regions were segmented according to particular criteria for all patients -- for example, the primary tumour and the peritumoral area, or subregions of the primary tumour. Such regions can be included in a model in a straightforward way as additional features. A more interesting scenario occurs when multiple distinct ROIs are present, such as multiple lesions in a regionally disseminated cancer. Since the number of such regions may differ between patients, their inclusion in a model is non-trivial and requires additional processing steps. We proposed several methods of handling multiple ROIs, each representing either an ROI-aggregation or a risk-aggregation strategy, compared them with a previously published method, and evaluated their performance across different classes of survival models in a Monte Carlo cross-validation scheme. We demonstrated the effectiveness of the methods using a cohort of 115 non-small cell lung cancer patients, for whom we predicted the risk of metastasis based on features extracted from PET images at the original resolution or interpolated to the CT image resolution. For both feature sets, incorporating all available lesions, as opposed to a single ROI representing the primary tumour, considerably improved predictive ability regardless of the model.
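The core processing step is turning a variable number of per-lesion feature vectors into a fixed-length input that any survival model can accept. The abstract does not list the exact aggregation strategies, so the pooling choices in the sketch below (mean, max, volume-weighted) are illustrative assumptions of what an ROI-aggregation strategy can look like.

```python
import numpy as np

# Hypothetical ROI-aggregation step: each patient has a varying number of
# lesions, each described by the same radiomic feature vector; pooling yields
# a fixed-length representation per patient.
def aggregate_lesions(features, volumes, how="volume_weighted"):
    features = np.asarray(features)          # shape (n_lesions, n_features)
    volumes = np.asarray(volumes, dtype=float)
    if how == "mean":
        return features.mean(axis=0)
    if how == "max":
        return features.max(axis=0)
    if how == "volume_weighted":
        weights = volumes / volumes.sum()    # larger lesions count more
        return weights @ features
    raise ValueError(how)

patient_lesions = np.array([[1.2, 0.4], [0.8, 0.9], [2.0, 0.1]])  # 3 lesions
lesion_volumes = [10.0, 4.0, 1.5]                                  # e.g. in mL
print(aggregate_lesions(patient_lesions, lesion_volumes))
```

A risk-aggregation strategy would instead fit the model per lesion and combine the predicted risks, but the pooling-then-modelling pattern above is the simpler of the two to implement.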
The vulnerability of neural network classifiers to adversarial attacks is a major obstacle to their deployment in safety-critical applications. Regularization of network parameters during training can be used to improve adversarial robustness and generalization performance. Usually, the network is regularized end-to-end, with parameters at all layers affected by the regularization. However, in settings where learning representations is key, such as self-supervised learning (SSL), the layers after the feature representation are discarded when performing inference. For such models, regularizing up to the feature space is more suitable. To this end, we propose a new spectral regularizer for representation learning that encourages black-box adversarial robustness in downstream classification tasks. In supervised classification settings, we show empirically that this method is more effective at boosting test accuracy and robustness than previously proposed methods that regularize all layers of the network. We then show that this method improves the adversarial robustness of classifiers using representations learned with self-supervised training or transferred from another classification task. In all, our work begins to unveil how representational structure affects adversarial robustness.
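As a sketch of what "regularizing up to the feature space" can look like, the PyTorch snippet below penalizes an estimate of the largest singular value of each encoder weight matrix (via power iteration) while leaving the classification head unregularized. The paper's specific spectral regularizer is not given in the abstract, so this generic spectral-norm penalty is only an assumed illustration of the pattern.

```python
import torch
import torch.nn as nn

def spectral_norm_penalty(weight: torch.Tensor, iters: int = 5) -> torch.Tensor:
    """Estimate the largest singular value of `weight` by power iteration."""
    w = weight.reshape(weight.shape[0], -1)
    v = torch.randn(w.shape[1], device=w.device)
    for _ in range(iters):
        u = torch.nn.functional.normalize(w @ v, dim=0)
        v = torch.nn.functional.normalize(w.t() @ u, dim=0)
    return u @ w @ v                         # approx. spectral norm of w

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
head = nn.Linear(16, 10)  # discarded at inference in SSL-style pipelines

x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
task_loss = nn.functional.cross_entropy(head(encoder(x)), y)
# Regularize only the encoder (up to the feature space), not the head.
reg = sum(spectral_norm_penalty(p) for p in encoder.parameters() if p.dim() > 1)
loss = task_loss + 1e-3 * reg
loss.backward()
```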
Traditional low-rank approximation is a powerful tool for compressing the huge data matrices that arise in simulations of partial differential equations (PDEs), but it suffers from high computational cost and requires several passes over the PDE data. The compressed data may also lack interpretability, making it difficult to identify feature patterns in the original data. To address these issues, we present an online randomized algorithm that computes the interpolative decomposition (ID) of large-scale data matrices in situ. Compared to previous randomized IDs that used the QR decomposition to determine the column basis, we adopt a streaming ridge leverage score-based column subset selection algorithm that dynamically selects proper basis columns from the data and thus avoids an extra pass over the data to compute the coefficient matrix of the ID. In particular, we adopt a single-pass error estimator based on the non-adaptive Hutch++ algorithm to provide real-time error approximation for determining the best coefficients. As a result, our approach needs only a single pass over the original data and is thus suitable for large and high-dimensional matrices stored outside of core memory or generated in PDE simulations. We also provide numerical experiments on turbulent channel flow and ignition simulations, and on the NSTX Gas Puff Image dataset, comparing our algorithm with the offline ID algorithm to demonstrate its utility in real-world applications.
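For orientation, the sketch below shows a basic offline randomized ID: basis columns are chosen by column-pivoted QR on a Gaussian sketch of the data, and the coefficient matrix is a least-squares fit so that A ≈ A[:, cols] @ T. The streaming, single-pass variant described above (ridge leverage scores, Hutch++ error estimation) is considerably more involved; this simplification is an assumption made to keep the example short.

```python
import numpy as np
from scipy.linalg import qr

def randomized_id(A: np.ndarray, k: int, oversample: int = 10, seed: int = 0):
    """Offline randomized interpolative decomposition, A ~= A[:, cols] @ T."""
    rng = np.random.default_rng(seed)
    # Sketch the rows down so pivoted QR is cheap.
    sketch = rng.standard_normal((k + oversample, A.shape[0])) @ A
    # Column-pivoted QR on the sketch picks well-conditioned basis columns.
    _, _, piv = qr(sketch, pivoting=True, mode="economic")
    cols = np.sort(piv[:k])
    # Coefficients expressing every column in the selected basis.
    T, *_ = np.linalg.lstsq(A[:, cols], A, rcond=None)
    return cols, T

A = np.random.default_rng(1).standard_normal((500, 200))
A = A @ np.diag(1.0 / (1 + np.arange(200)))     # induce a decaying spectrum
cols, T = randomized_id(A, k=20)
print("relative error:", np.linalg.norm(A - A[:, cols] @ T) / np.linalg.norm(A))
```

Because the ID keeps actual data columns as the basis, the compressed factors remain interpretable in terms of the original data, which is the property the offline QR-based approach and the streaming method above both exploit.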
We present a new technique for visualizing high-dimensional data called cluster MDS (cl-MDS), which addresses a common difficulty of dimensionality reduction methods: preserving both local and global structures of the original sample in a single 2-dimensional visualization. Its algorithm combines the well-known multidimensional scaling (MDS) tool with the $k$-medoids data clustering technique, and enables hierarchical embedding, sparsification, and estimation of 2-dimensional coordinates for additional points. While cl-MDS is a generally applicable tool, we also include specific recipes for atomic structure applications. We apply this method to non-linear data of increasing complexity where different layers of locality are relevant, showing a clear improvement in retrieval and visualization quality.
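The sketch below illustrates the general flavour of combining MDS with $k$-medoids in a two-level embedding: cluster with a naive $k$-medoids, place the medoids globally with MDS, then embed each cluster locally and recenter it at its medoid. This is an assumed toy construction, not the published cl-MDS algorithm, whose hierarchical embedding and sparsification steps are more sophisticated.

```python
import numpy as np
from sklearn.manifold import MDS

def kmedoids(D, k, iters=20, seed=0):
    """Naive k-medoids on a precomputed distance matrix D (toy version)."""
    rng = np.random.default_rng(seed)
    medoids = rng.choice(len(D), k, replace=False)
    for _ in range(iters):
        labels = np.argmin(D[:, medoids], axis=1)
        medoids = np.array([np.flatnonzero(labels == c)[
            D[np.ix_(labels == c, labels == c)].sum(axis=1).argmin()]
            for c in range(k)])
    return medoids, labels

X = np.random.default_rng(1).normal(size=(200, 10))
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)     # pairwise distances

medoids, labels = kmedoids(D, k=4)
# Global layout: MDS on the medoid-to-medoid distances only.
global_xy = MDS(dissimilarity="precomputed", random_state=0).fit_transform(
    D[np.ix_(medoids, medoids)])
# Local layouts: MDS within each cluster, recentered at its medoid.
embedding = np.zeros((len(X), 2))
for c in range(4):
    idx = np.flatnonzero(labels == c)
    local = MDS(dissimilarity="precomputed", random_state=0).fit_transform(
        D[np.ix_(idx, idx)])
    embedding[idx] = local - local.mean(axis=0) + global_xy[c]
```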
Observational epidemiological studies commonly seek to estimate the causal effect of an exposure on an outcome. Adjustment for potential confounding bias in modern studies is challenging due to the presence of high-dimensional confounding, which arises when there are many confounders relative to the sample size or when relationships between continuous confounders and the exposure and outcome are complex. As a promising avenue to overcome this challenge, doubly robust methods (Augmented Inverse Probability Weighting (AIPW) and Targeted Maximum Likelihood Estimation (TMLE)) enable the use of data-adaptive approaches to fit the two models they involve. However, biased standard errors may result when the data-adaptive approaches used are very complex, and coupling doubly robust methods with cross-fitting has been proposed to tackle this. Despite these advances, limited evaluation, comparison, and guidance are available on the implementation of AIPW and TMLE with data-adaptive approaches and cross-fitting in realistic settings where high-dimensional confounding is present. We conducted an extensive simulation study to compare the relative performance of AIPW and TMLE using data-adaptive approaches in estimating the average causal effect (ACE) and evaluated the benefits of using cross-fitting with a varying number of folds, as well as the impact of using a reduced versus full (larger, more diverse) library in the Super Learner (SL) ensemble learning approach used for the data-adaptive models. A range of data-generation scenarios and sample sizes was considered. We found that AIPW and TMLE performed similarly in most cases for estimating the ACE, but TMLE was more stable. Cross-fitting improved the performance of both methods, with the number of folds a less important consideration. Using a full SL library was important to reduce bias and variance in the complex scenarios typical of modern health research studies.
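A minimal sketch of cross-fitted AIPW may clarify what is being compared. Random forests stand in for the Super Learner ensemble used in the study (an assumption made to keep the sketch self-contained); nuisance models are fit on each training fold and evaluated on the held-out fold, so the influence-function values are never computed on data the models were fit to.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import KFold

def aipw_cross_fit(X, A, Y, n_folds=5, seed=0):
    """Cross-fitted AIPW estimate of the ACE with an influence-function SE."""
    XA = np.column_stack([X, A])
    psi = np.zeros(len(Y))
    for train, test in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        # Propensity model, clipped away from 0 and 1 for stability.
        ps = RandomForestClassifier(random_state=seed).fit(X[train], A[train])
        e = np.clip(ps.predict_proba(X[test])[:, 1], 0.01, 0.99)
        # Outcome model with treatment as a feature, predicted at A=1 and A=0.
        om = RandomForestRegressor(random_state=seed).fit(XA[train], Y[train])
        m1 = om.predict(np.column_stack([X[test], np.ones(len(test))]))
        m0 = om.predict(np.column_stack([X[test], np.zeros(len(test))]))
        a, y = A[test], Y[test]
        psi[test] = m1 - m0 + a * (y - m1) / e - (1 - a) * (y - m0) / (1 - e)
    return psi.mean(), psi.std() / np.sqrt(len(Y))

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
A = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
Y = 2.0 * A + X[:, 0] + rng.normal(size=2000)             # true ACE = 2
ace, se = aipw_cross_fit(X, A, Y)
print(f"ACE estimate: {ace:.2f} (SE {se:.2f})")
```

TMLE differs in how it combines the same two nuisance models (a targeted fluctuation of the outcome model rather than the additive correction above), which is where the stability differences reported in the study come from.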
Inference for functional linear models in the presence of heteroscedastic errors has received insufficient attention given its practical importance; in fact, even a central limit theorem has not been studied in this case. The difficulty is that conditional mean estimates have complicated sampling distributions due to the infinite-dimensional regressors, where truncation bias and scaling issues are compounded by non-constant variance under heteroscedasticity. As a foundation for distributional inference, we establish a central limit theorem for the estimated conditional mean under general dependent errors, and we subsequently develop a paired bootstrap method to provide better approximations of sampling distributions. The proposed paired bootstrap does not follow the standard bootstrap algorithm for finite-dimensional regressors, as that version fails outside of a narrow window of implementations with functional regressors; the reason is a bias that arises in the naive bootstrap construction with functional regressors. Our bootstrap proposal incorporates debiasing and thereby attains much broader validity and flexibility with respect to truncation parameters for inference under heteroscedasticity; even when the naive approach is valid, the proposed bootstrap method performs better numerically. The bootstrap is applied to construct confidence intervals for centered projections and to conduct hypothesis tests for multiple conditional means. Our theoretical results on bootstrap consistency are demonstrated through simulation studies and illustrated with a real data example.
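The resampling skeleton of a paired bootstrap for a truncated functional linear model is sketched below: regress Y on the first m functional principal component scores, then resample (X_i, Y_i) pairs and refit. The paper's debiasing correction, which is the substantive contribution, is not reproduced here; this naive version is exactly the construction the abstract says can fail, shown only to make the setup concrete.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, m = 300, 50, 3                        # curves on p grid points, truncation m
X = rng.normal(size=(n, p)).cumsum(axis=1)  # rough random-walk "curves"
beta = np.sin(np.linspace(0, np.pi, p)) / p
Y = X @ beta + rng.normal(0, 0.5 + 0.5 * np.abs(X[:, 0] / 5))  # heteroscedastic

def fit_mean(Xs, Ys, x0):
    """Truncated FPC regression; returns the estimated conditional mean at x0."""
    Xc = Xs - Xs.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[:m].T                  # first m principal component scores
    coef, *_ = np.linalg.lstsq(scores, Ys - Ys.mean(), rcond=None)
    return Ys.mean() + (x0 - Xs.mean(axis=0)) @ Vt[:m].T @ coef

x0 = X[0]
# Paired bootstrap: resample (X_i, Y_i) pairs jointly and refit each time.
boot = np.array([fit_mean(X[idx], Y[idx], x0)
                 for idx in rng.integers(0, n, size=(500, n))])
lo, hi = np.percentile(boot, [2.5, 97.5])   # percentile interval for E[Y | x0]
print(f"95% CI for the conditional mean at x0: [{lo:.2f}, {hi:.2f}]")
```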
Graph-centric artificial intelligence (graph AI) has achieved remarkable success in modeling the interacting systems prevalent in nature, from dynamical systems in biology to particle physics. The increasing heterogeneity of data calls for graph neural architectures that can combine multiple inductive biases. However, combining data from various sources is challenging because the appropriate inductive bias may vary by data modality. Multimodal learning methods address this challenge by fusing multiple data modalities while leveraging cross-modal dependencies. Here, we survey 140 studies in graph-centric AI and find that diverse data types are increasingly brought together using graphs and fed into sophisticated multimodal models. These models stratify into image-, language-, and knowledge-grounded multimodal learning. Based on this categorization, we put forward an algorithmic blueprint for multimodal graph learning. The blueprint serves as a way to group state-of-the-art architectures that treat multimodal data according to how they choose four key components. This effort can pave the way for standardizing the design of sophisticated multimodal architectures for highly complex real-world problems.
In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation: the number of interactions is limited by resource constraints such as computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communication and synchronisation, and such systems remain difficult to scale. In this paper we present four algorithms to address these problems. In combination, these algorithms enable each agent to improve its task-allocation strategy through reinforcement learning, while varying how much it explores the system in response to how optimal it believes its current strategy to be, given its past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource usage limits, restricting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, which then carry out those tasks. We also simulate real-world system effects such as network instability. Our solution is shown to solve the task allocation problem to within 6.7% of the theoretical optimum within the system configurations considered. It provides 5x better performance recovery than no-knowledge-retention approaches when system connectivity is impacted, and is tested on systems of up to 100 agents with less than a 9% impact on the algorithms' performance.
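An illustrative sketch of the core mechanism (not the paper's four algorithms, whose details the abstract does not give) is an allocator that learns peer quality by reinforcement and scales its exploration rate by how unsettled its recent reward errors are, exploring more when its current strategy appears far from optimal.

```python
import numpy as np

class AdaptiveAllocator:
    """Toy agent: Q-learning over peers, with confidence-scaled exploration."""

    def __init__(self, n_peers, lr=0.1, seed=0):
        self.q = np.zeros(n_peers)           # estimated quality of each peer
        self.recent = []                     # sliding window of reward errors
        self.lr = lr
        self.rng = np.random.default_rng(seed)

    def epsilon(self):
        if len(self.recent) < 5:
            return 1.0                       # explore fully at the start
        # High variance in recent errors -> strategy looks suboptimal -> explore.
        return float(np.clip(np.std(self.recent[-20:]), 0.05, 1.0))

    def choose_peer(self):
        if self.rng.random() < self.epsilon():
            return int(self.rng.integers(len(self.q)))
        return int(self.q.argmax())

    def update(self, peer, reward):
        self.recent.append(reward - self.q[peer])
        self.q[peer] += self.lr * (reward - self.q[peer])

agent = AdaptiveAllocator(n_peers=10)
true_quality = np.linspace(0.1, 0.9, 10)     # hidden peer capabilities
rng = np.random.default_rng(1)
for _ in range(500):
    peer = agent.choose_peer()
    agent.update(peer, rng.normal(true_quality[peer], 0.05))
print("preferred peer:", agent.q.argmax())   # should approach peer 9
```

Retaining `q` across connectivity disruptions is what a knowledge-retention approach amounts to in this toy setting, which is the property behind the recovery comparison reported above.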