Out-of-distribution (OOD) detection is critical for preventing deep learning models from making incorrect predictions and thus for ensuring the safety of artificial intelligence systems. In safety-critical applications such as medical diagnosis and autonomous driving in particular, the cost of incorrect decisions is usually unbearable. However, neural networks often suffer from overconfidence: they assign high confidence to OOD data, which are never seen during training and may be unrelated to the training data, namely the in-distribution (ID) data. Determining the reliability of a prediction remains a challenging task. In this work, we propose Uncertainty-Estimation with Normalized Logits (UE-NL), a robust learning method for OOD detection, which has three main benefits. (1) Neural networks with UE-NL treat every ID sample equally by predicting an uncertainty score for the input data; this uncertainty is added into the softmax function to adjust the learning strength of easy and hard samples during training, making the model learn robustly and accurately. (2) UE-NL enforces a constant vector norm on the logits to decouple the optimization process from the effect of the increasing output norm, which causes the overconfidence issue to some extent. (3) UE-NL provides a new metric, the magnitude of the uncertainty score, to detect OOD data. Experiments demonstrate that UE-NL achieves top performance on common OOD benchmarks and is more robust to noisy ID data that may be misjudged as OOD data by other methods.
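The effect of a constant logit norm, benefit (2) above, can be illustrated with a minimal sketch. It covers only the normalization step and not the uncertainty score, whose exact form the abstract does not give. The helper names, the target norm of 1.0, and the example logits are assumptions made for illustration, not details of UE-NL.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def normalize_logits(logits, norm=1.0, eps=1e-12):
    """Rescale each logit vector to a fixed L2 norm (hypothetical helper)."""
    lengths = np.linalg.norm(logits, axis=-1, keepdims=True)
    return norm * logits / (lengths + eps)

# Two logit vectors with the same direction but different magnitudes.
logits = np.array([[2.0, 1.0, 0.5]])
scaled = 10.0 * logits

# Without normalization, a larger logit norm inflates the top softmax probability.
raw_conf = (softmax(logits).max(), softmax(scaled).max())

# With a constant norm, the confidence depends only on the logit direction.
norm_conf = (softmax(normalize_logits(logits)).max(),
             softmax(normalize_logits(scaled)).max())

print("raw confidences:       ", raw_conf)
print("normalized confidences:", norm_conf)
```

Scaling the logits leaves the predicted class unchanged but raises the raw confidence, whereas the normalized confidence is the same for both inputs.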
Problems whose solutions are represented by permutations are very prominent in combinatorial optimization. Thus, in recent decades, a number of evolutionary algorithms have been proposed to solve them, and among them, those based on probability models have received much attention. Most of these efforts have focused on algorithms suited to problems of an ordering/ranking nature. However, when it comes to probability-based evolutionary algorithms for assignment problems, existing work has not gone beyond simple and, in most cases, univariate models. In this paper, we explore the use of Doubly Stochastic Matrices (DSMs) for optimizing permutation problems of a matching and assignment nature. To that end, we explore some learning and sampling methods to efficiently incorporate DSMs into evolutionary algorithms. Specifically, we adopt the framework of estimation of distribution algorithms and compare DSMs to some existing proposals for permutation problems. Preliminary experiments on instances of the quadratic assignment problem validate this line of research and show that DSMs may obtain very competitive results, while computational cost issues still need to be further investigated.
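To make the role of a DSM as a probability model over assignments concrete, the sketch below builds a doubly stochastic matrix with Sinkhorn-Knopp scaling and draws a permutation from it row by row. The abstract does not say which learning and sampling methods the paper uses, so both routines are standard stand-ins chosen for illustration, and all function names are hypothetical.

```python
import numpy as np

def sinkhorn(weights, n_iters=200):
    """Sinkhorn-Knopp scaling: alternately normalize rows and columns of a
    strictly positive matrix until it is (approximately) doubly stochastic."""
    m = np.asarray(weights, dtype=float).copy()
    for _ in range(n_iters):
        m /= m.sum(axis=1, keepdims=True)  # make every row sum to 1
        m /= m.sum(axis=0, keepdims=True)  # make every column sum to 1
    return m

def sample_permutation(dsm, rng):
    """Assign items to positions one row at a time, renormalizing each row
    over the columns that are still free (an assumed, simple sampler)."""
    n = dsm.shape[0]
    available = np.ones(n, dtype=bool)
    perm = np.empty(n, dtype=int)
    for i in range(n):
        p = np.where(available, dsm[i], 0.0)
        p = p / p.sum()
        j = rng.choice(n, p=p)
        perm[i] = j
        available[j] = False
    return perm

rng = np.random.default_rng(0)
dsm = sinkhorn(rng.random((5, 5)) + 0.1)  # strictly positive starting weights
perm = sample_permutation(dsm, rng)
print(perm)
```

In an estimation of distribution algorithm, the starting weights would be estimated from the selected individuals rather than drawn at random as they are here.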
Out-of-distribution detection is one of the most critical issues in the deployment of machine learning. The data analyst must ensure that data in operation are consistent with the data seen in the training phase, and must understand whether the environment has changed in a way that makes autonomous decisions no longer safe. The method of the paper is based on eXplainable Artificial Intelligence (XAI); it takes into account different metrics to identify any resemblance between in-distribution and out-of-distribution data, as seen by the XAI model. The approach is non-parametric and free of distributional assumptions. The validation over complex scenarios (predictive maintenance, vehicle platooning, covert channels in cybersecurity) corroborates both the precision of detection and the evaluation of the proximity between training and operation conditions. Results are available via open source and open data at the following link: //github.com/giacomo97cnr/Rule-based-ODD.
Cyber-physical systems (CPS) that use learning components, such as autonomous vehicles, are often sensitive to noise and out-of-distribution (OOD) instances encountered at runtime. As such, safety-critical tasks depend on OOD detection subsystems to restore the CPS to a known state or to interrupt execution before safety is compromised. However, it is difficult to guarantee the performance of OOD detectors because it is difficult to characterize the OOD aspect of an instance, especially in high-dimensional unstructured data. To distinguish between OOD data and data known to the learning component through the training process, an emerging technique is to incorporate variational autoencoders (VAEs) within systems and apply classification or anomaly detection techniques on their latent spaces. The rationale for doing so is the reduction of the data domain size through the encoding process, which benefits real-time systems through decreased processing requirements, facilitates feature analysis for unstructured data, and allows more explainable techniques to be implemented. This study places probably approximately correct (PAC) based guarantees on OOD detection, using the encoding process within VAEs to quantify image features and apply conformal constraints over them. This is used to bound the detection error on unfamiliar instances with user-defined confidence. The approach used in this study is to empirically establish these bounds by sampling the latent probability distribution and evaluating the error with respect to the constraint violations that are encountered. The guarantee is then verified using data generated from CARLA, an open-source driving simulator.
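As a rough illustration of how a sampled violation rate can be turned into a PAC-style statement, the sketch below applies the one-sided Hoeffding inequality to independent samples. The abstract does not state which bound the study uses, so this is a stand-in rather than its method; the function name and the example counts are hypothetical.

```python
import math

def pac_error_bound(n_samples, n_violations, delta):
    """Upper bound on the true violation probability that holds with
    probability at least 1 - delta, via the one-sided Hoeffding inequality
    (assumes independent, identically distributed samples)."""
    empirical = n_violations / n_samples
    slack = math.sqrt(math.log(1.0 / delta) / (2.0 * n_samples))
    return min(1.0, empirical + slack)

# Hypothetical numbers: 120 constraint violations in 10,000 latent samples,
# with a user-defined confidence of 99% (delta = 0.01).
bound = pac_error_bound(n_samples=10_000, n_violations=120, delta=0.01)
print(f"error <= {bound:.4f} with confidence 0.99")
```

The slack term shrinks as the number of samples grows, so a tighter guarantee at a fixed confidence requires drawing more samples from the latent distribution.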
Neural processes (NPs) have brought together the representation power of parametric deep neural networks and the reliable uncertainty estimation of non-parametric Gaussian processes. Although recent developments in NPs have shown success in both regression and classification, how to adapt NPs to multimodal data has not been carefully studied. We propose the first member of the NP family for multimodal uncertainty estimation, namely Multimodal Neural Processes. In a holistic and principled way, we develop a dynamic context memory updated by the classification error, a multimodal Bayesian aggregation mechanism to aggregate multimodal representations, and a new attention mechanism for calibrated predictions. In extensive empirical evaluation, our method achieves state-of-the-art multimodal uncertainty estimation performance, showing that it is robust against noisy samples and reliable in out-of-domain detection.
This study addresses a class of linear mixed-integer programming (MIP) problems that involve uncertainty in the objective function coefficients. The coefficients are assumed to form a random vector whose probability distribution can only be observed through a finite training data set. Unlike most related studies in the literature, we also consider uncertainty in the underlying data set. The data uncertainty is described by a set of linear constraints for each random sample, and the uncertainty in the distribution (for a fixed realization of data) is defined using a type-1 Wasserstein ball centered at the empirical distribution of the data. The overall problem is formulated as a three-level distributionally robust optimization (DRO) problem. We prove that for a class of bi-affine loss functions the three-level problem admits a linear MIP reformulation. Furthermore, it turns out that in several important particular cases the three-level problem can be solved reasonably fast by leveraging the nominal MIP problem. Finally, we conduct a computational study in which the out-of-sample performance of our model and the computational complexity of the proposed MIP reformulation are explored numerically for several application domains.
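For reference, a type-1 Wasserstein ball centered at an empirical distribution is commonly written as follows. The symbols $\hat{P}_N$, $\hat{\xi}_i$, $\varepsilon$, and the norm $\|\cdot\|$ are notation assumed here for illustration; the study may use different symbols or a specific choice of norm.

```latex
\[
\hat{P}_N = \frac{1}{N} \sum_{i=1}^{N} \delta_{\hat{\xi}_i},
\qquad
\mathcal{B}_{\varepsilon}(\hat{P}_N)
  = \bigl\{\, Q : W_1(Q, \hat{P}_N) \le \varepsilon \,\bigr\},
\]
\[
W_1(Q, P) = \inf_{\pi \in \Pi(Q, P)}
  \int \|\xi - \xi'\| \, \pi(\mathrm{d}\xi, \mathrm{d}\xi').
\]
```

Here $\delta_{\hat{\xi}_i}$ is the Dirac measure at the $i$-th training sample and $\Pi(Q, P)$ is the set of joint distributions with marginals $Q$ and $P$. The radius $\varepsilon$ controls how far a candidate distribution may deviate from the empirical one.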
Analyzing the distribution shift of data is a growing research direction in today's Machine Learning (ML), leading to new benchmarks that focus on providing a suitable scenario for studying the generalization properties of ML models. The existing benchmarks are focused on supervised learning, and to the best of our knowledge, there is none for unsupervised learning. Therefore, we introduce an unsupervised anomaly detection benchmark with data that shifts over time, built over Kyoto-2006+, a traffic dataset for network intrusion detection. This type of data meets the premise of a shifting input distribution: it covers a large time span ($10$ years), with naturally occurring changes over time (e.g., users modifying their behavior patterns, and software updates). We first highlight the non-stationary nature of the data, using a basic per-feature analysis, t-SNE, and an Optimal Transport approach for measuring the overall distribution distances between years. Next, we propose AnoShift, a protocol that splits the data into IID, NEAR, and FAR testing splits. We validate the performance degradation over time with diverse models, ranging from classical approaches to deep learning. Finally, we show that by acknowledging the distribution shift problem and properly addressing it, the performance can be improved compared to classical training, which assumes independent and identically distributed data (on average, by up to $3\%$ for our approach). Dataset and code are available at //github.com/bit-ml/AnoShift/.
Exploring data is crucial in data analysis, as it helps users understand and interpret the data more effectively. However, performing effective data exploration requires in-depth knowledge of the dataset and expertise in data analysis techniques. Not being familiar with either can create obstacles that make the process time-consuming and overwhelming for data analysts. To address this issue, we introduce InsightPilot, an LLM (Large Language Model)-based, automated data exploration system designed to simplify the data exploration process. InsightPilot automatically selects appropriate analysis intents, such as understanding, summarizing, and explaining. Then, these analysis intents are concretized by issuing corresponding intentional queries (IQueries) to create a meaningful and coherent exploration sequence. In brief, an IQuery is an abstraction and automation of data analysis operations, which mimics the approach of data analysts and simplifies the exploration process for users. By employing an LLM to iteratively collaborate with a state-of-the-art insight engine via IQueries, InsightPilot is effective in analyzing real-world datasets, enabling users to gain valuable insights through natural language inquiries. We demonstrate the effectiveness of InsightPilot in a case study, showing how it can help users gain valuable insights from their datasets.
We present prompt distribution learning for effectively adapting a pre-trained vision-language model to downstream recognition tasks. Our method not only learns low-bias prompts from a few samples but also captures the distribution of diverse prompts to handle varying visual representations. In this way, we provide high-quality task-related content for facilitating recognition. This prompt distribution learning is realized by an efficient approach that learns the output embeddings of prompts instead of the input embeddings. Thus, we can employ a Gaussian distribution to model them effectively and derive a surrogate loss for efficient training. Extensive experiments on 12 datasets demonstrate that our method consistently and significantly outperforms existing methods. For example, with 1 sample per category, it achieves a 9.1% relative improvement in the average result over human-crafted prompts.
Out-of-distribution (OOD) detection is critical to ensuring the reliability and safety of machine learning systems. For instance, in autonomous driving, we would like the driving system to issue an alert and hand over the control to humans when it detects unusual scenes or objects that it has never seen before and cannot make a safe decision. This problem first emerged in 2017 and since then has received increasing attention from the research community, leading to a plethora of methods developed, ranging from classification-based to density-based to distance-based ones. Meanwhile, several other problems are closely related to OOD detection in terms of motivation and methodology. These include anomaly detection (AD), novelty detection (ND), open set recognition (OSR), and outlier detection (OD). Despite having different definitions and problem settings, these problems often confuse readers and practitioners, and as a result, some existing studies misuse terms. In this survey, we first present a generic framework called generalized OOD detection, which encompasses the five aforementioned problems, i.e., AD, ND, OSR, OOD detection, and OD. Under our framework, these five problems can be seen as special cases or sub-tasks, and are easier to distinguish. Then, we conduct a thorough review of each of the five areas by summarizing their recent technical developments. We conclude this survey with open challenges and potential research directions.
User engagement is a critical metric for evaluating the quality of open-domain dialogue systems. Prior work has focused on conversation-level engagement by using heuristically constructed features such as the number of turns and the total time of the conversation. In this paper, we investigate the possibility and efficacy of estimating utterance-level engagement and define a novel metric, {\em predictive engagement}, for automatic evaluation of open-domain dialogue systems. Our experiments demonstrate that (1) human annotators have high agreement on assessing utterance-level engagement scores; and (2) conversation-level engagement scores can be predicted from properly aggregated utterance-level engagement scores. Furthermore, we show that the utterance-level engagement scores can be learned from data. These scores can improve automatic evaluation metrics for open-domain dialogue systems, as shown by correlation with human judgements. This suggests that predictive engagement can be used as real-time feedback for training better dialogue models.