Uncertainty quantification (UQ) is an active area of research, and an essential technique used in all fields of science and engineering. The most common methods for UQ are Monte Carlo and surrogate-modelling. The former method is dimensionality independent but has slow convergence, while the latter method has been shown to yield large computational speedups with respect to Monte Carlo. However, surrogate models suffer from the so-called curse of dimensionality, and become costly to train for high-dimensional problems, where UQ might become computationally prohibitive. In this paper we present a new technique, Lasso Monte Carlo (LMC), which combines a Lasso surrogate model with the multifidelity Monte Carlo technique, in order to perform UQ in high-dimensional settings, at a reduced computational cost. We provide mathematical guarantees for the unbiasedness of the method, and show that LMC can be more accurate than simple Monte Carlo. The theory is numerically tested with benchmarks on toy problems, as well as on a real example of UQ from the field of nuclear engineering. In all presented examples LMC is more accurate than simple Monte Carlo and other multifidelity methods. Thanks to LMC, computational costs are reduced by more than a factor of 5 with respect to simple MC, in relevant cases.
There is a wide availability of methods for testing normality under the assumption of independent and identically distributed data. When data are dependent in space and/or time, however, assessing and testing the marginal behavior is considerably more challenging, as the marginal behavior is impacted by the degree of dependence. We propose a new approach to assess normality for dependent data by non-linearly incorporating existing statistics from normality tests as well as sample moments such as skewness and kurtosis through a neural network. We calibrate (deep) neural networks by simulated normal and non-normal data with a wide range of dependence structures and we determine the probability of rejecting the null hypothesis. We compare several approaches for normality tests and demonstrate the superiority of our method in terms of statistical power through an extensive simulation study. A real world application to global temperature data further demonstrates how the degree of spatio-temporal aggregation affects the marginal normality in the data.
We develop a Bayesian inference method for discretely-observed stochastic differential equations (SDEs). Inference is challenging for most SDEs, due to the analytical intractability of the likelihood function. Nevertheless, forward simulation via numerical methods is straightforward, motivating the use of approximate Bayesian computation (ABC). We propose a conditional simulation scheme for SDEs that is based on lookahead strategies for sequential Monte Carlo (SMC) and particle smoothing using backward simulation. This leads to the simulation of trajectories that are consistent with the observed trajectory, thereby increasing the ABC acceptance rate. We additionally employ an invariant neural network, previously developed for Markov processes, to learn the summary statistics function required in ABC. The neural network is incrementally retrained by exploiting an ABC-SMC sampler, which provides new training data at each round. Since the SDE simulation scheme differs from standard forward simulation, we propose a suitable importance sampling correction, which has the added advantage of guiding the parameters towards regions of high posterior density, especially in the first ABC-SMC round. Our approach achieves accurate inference and is about three times faster than standard (forward-only) ABC-SMC. We illustrate our method in four simulation studies, including three examples from the Chan-Karaolyi-Longstaff-Sanders SDE family.
The dynamic complexity of robots and mechatronic systems often pertains to the hybrid nature of dynamics, where governing equations consist of heterogenous equations that are switched depending on the state of the system. Legged robots and manipulator robots experience contact-noncontact discrete transitions, causing switching of governing equations. Analysis of these systems have been a challenge due to the lack of a global, unified model that is amenable to analysis of the global behaviors. Composition operator theory has the potential to provide a global, unified representation by converting them to linear dynamical systems in a lifted space. The current work presents a method for encoding nonlinear heterogenous dynamics into a high dimensional space of observables in the form of Koopman operator. First, a new formula is established for representing the Koopman operator in a Hilbert space by using inner products of observable functions and their composition with the governing state transition function. This formula, called Direct Encoding, allows for converting a class of heterogenous systems directly to a global, unified linear model. Unlike prevalent data-driven methods, where results can vary depending on numerical data, the proposed method is globally valid, not requiring numerical simulation of the original dynamics. A simple example validates the theoretical results, and the method is applied to a multi-cable suspension system.
Large Language Models (LLMs) have shown promise in the autonomous driving sector, particularly in generalization and interpretability. We introduce a unique object-level multimodal LLM architecture that merges vectorized numeric modalities with a pre-trained LLM to improve context understanding in driving situations. We also present a new dataset of 160k QA pairs derived from 10k driving scenarios, paired with high quality control commands collected with RL agent and question answer pairs generated by teacher LLM (GPT-3.5). A distinct pretraining strategy is devised to align numeric vector modalities with static LLM representations using vector captioning language data. We also introduce an evaluation metric for Driving QA and demonstrate our LLM-driver's proficiency in interpreting driving scenarios, answering questions, and decision-making. Our findings highlight the potential of LLM-based driving action generation in comparison to traditional behavioral cloning. We make our benchmark, datasets, and model available for further exploration.
The globally convergent convexification numerical method is constructed for a Coefficient Inverse Problem for the Mean Field Games System. A coefficient characterizing the global interaction term is recovered from the single measurement data. In particular, a new Carleman estimate for the Volterra integral operator is proven, and it stronger than the previously known one. Numerical results demonstrate accurate reconstructions from noisy data.
Assessing the quality and impact of individual data points is critical for improving model performance and mitigating undesirable biases within the training dataset. Several data valuation algorithms have been proposed to quantify data quality, however, there lacks a systemic and standardized benchmarking system for data valuation. In this paper, we introduce OpenDataVal, an easy-to-use and unified benchmark framework that empowers researchers and practitioners to apply and compare various data valuation algorithms. OpenDataVal provides an integrated environment that includes (i) a diverse collection of image, natural language, and tabular datasets, (ii) implementations of eleven different state-of-the-art data valuation algorithms, and (iii) a prediction model API that can import any models in scikit-learn. Furthermore, we propose four downstream machine learning tasks for evaluating the quality of data values. We perform benchmarking analysis using OpenDataVal, quantifying and comparing the efficacy of state-of-the-art data valuation approaches. We find that no single algorithm performs uniformly best across all tasks, and an appropriate algorithm should be employed for a user's downstream task. OpenDataVal is publicly available at //opendataval.github.io with comprehensive documentation. Furthermore, we provide a leaderboard where researchers can evaluate the effectiveness of their own data valuation algorithms.
Case-based reasoning (CBR) as a methodology for problem-solving can use any appropriate computational technique. This position paper argues that CBR researchers have somewhat overlooked recent developments in deep learning and large language models (LLMs). The underlying technical developments that have enabled the recent breakthroughs in AI have strong synergies with CBR and could be used to provide a persistent memory for LLMs to make progress towards Artificial General Intelligence.
In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.
The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches, and eventually to increase the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.
Detecting carried objects is one of the requirements for developing systems to reason about activities involving people and objects. We present an approach to detect carried objects from a single video frame with a novel method that incorporates features from multiple scales. Initially, a foreground mask in a video frame is segmented into multi-scale superpixels. Then the human-like regions in the segmented area are identified by matching a set of extracted features from superpixels against learned features in a codebook. A carried object probability map is generated using the complement of the matching probabilities of superpixels to human-like regions and background information. A group of superpixels with high carried object probability and strong edge support is then merged to obtain the shape of the carried object. We applied our method to two challenging datasets, and results show that our method is competitive with or better than the state-of-the-art.