Cooperative driving is an emerging paradigm to enhance the safety and efficiency of autonomous vehicles. To ensure successful cooperation, road users must reach a consensus for making collective decisions, while recording vehicular data to analyze and address failures related to such agreements. This data has the potential to provide valuable insights into various vehicular events, while also potentially improving accountability measures. Furthermore, vehicles may benefit from the ability to negotiate and trade services among themselves, adding value to the cooperative driving framework. However, the majority of proposed systems aiming to ensure data security, consensus, or service trading, lack efficient and thoroughly validated mechanisms that consider the distinctive characteristics of vehicular networks. These limitations are amplified by a dependency on the centralized support provided by the infrastructure. Furthermore, corresponding mechanisms must diligently address security concerns, especially regarding potential malicious or misbehaving nodes, while also considering inherent constraints of the wireless medium. We introduce the Verifiable Event Extension (VEE), an applicational extension designed for Intelligent Transportation System (ITS) messages. The VEE operates seamlessly with any existing standardized vehicular communications protocol, addressing crucial aspects of data security, consensus, and trading with minimal overhead. To achieve this, we employ blockchain techniques, Byzantine fault tolerance (BFT) consensus protocols, and cryptocurrency-based mechanics. To assess our proposal's feasibility and lightweight nature, we employed a hardware-in-the-loop setup for analysis. Experimental results demonstrate the viability and efficiency of the VEE extension in overcoming the challenges posed by the distributed and opportunistic nature of wireless vehicular communications.
Neural networks are successful in various applications but are also susceptible to adversarial attacks. To show the safety of network classifiers, many verifiers have been introduced to reason about the local robustness of a given input to a given perturbation. While successful, local robustness cannot generalize to unseen inputs. Several works analyze global robustness properties, however, neither can provide a precise guarantee about the cases where a network classifier does not change its classification. In this work, we propose a new global robustness property for classifiers aiming at finding the minimal globally robust bound, which naturally extends the popular local robustness property for classifiers. We introduce VHAGaR, an anytime verifier for computing this bound. VHAGaR relies on three main ideas: encoding the problem as a mixed-integer programming and pruning the search space by identifying dependencies stemming from the perturbation or the network's computation and generalizing adversarial attacks to unknown inputs. We evaluate VHAGaR on several datasets and classifiers and show that, given a three hour timeout, the average gap between the lower and upper bound on the minimal globally robust bound computed by VHAGaR is 1.9, while the gap of an existing global robustness verifier is 154.7. Moreover, VHAGaR is 130.6x faster than this verifier. Our results further indicate that leveraging dependencies and adversarial attacks makes VHAGaR 78.6x faster.
The safety of autonomous vehicles (AV) has been a long-standing top concern, stemming from the absence of rare and safety-critical scenarios in the long-tail naturalistic driving distribution. To tackle this challenge, a surge of research in scenario-based autonomous driving has emerged, with a focus on generating high-risk driving scenarios and applying them to conduct safety-critical testing of AV models. However, limited work has been explored on the reuse of these extensive scenarios to iteratively improve AV models. Moreover, it remains intractable and challenging to filter through gigantic scenario libraries collected from other AV models with distinct behaviors, attempting to extract transferable information for current AV improvement. Therefore, we develop a continual driving policy optimization framework featuring Closed-Loop Individualized Curricula (CLIC), which we factorize into a set of standardized sub-modules for flexible implementation choices: AV Evaluation, Scenario Selection, and AV Training. CLIC frames AV Evaluation as a collision prediction task, where it estimates the chance of AV failures in these scenarios at each iteration. Subsequently, by re-sampling from historical scenarios based on these failure probabilities, CLIC tailors individualized curricula for downstream training, aligning them with the evaluated capability of AV. Accordingly, CLIC not only maximizes the utilization of the vast pre-collected scenario library for closed-loop driving policy optimization but also facilitates AV improvement by individualizing its training with more challenging cases out of those poorly organized scenarios. Experimental results clearly indicate that CLIC surpasses other curriculum-based training strategies, showing substantial improvement in managing risky scenarios, while still maintaining proficiency in handling simpler cases.
Timed automata are the formal model for real-time systems. Extensions with discrete probabilistic branching have been considered in the literature and successfully applied. Probabilistic timed automata (PTA) do require all branching probabilities and clock constraints to be constants. This report investigates PTA in which this constraint is relaxed: both branching probabilities and clock constraints can be parametric. We formally define this PTA variant and define its semantics by an uncountable parametric Markov Decision Process (pMDP). We show that reachability probabilities in parametric L/U-PTA can be reduced to considering PTA with only parametric branching probabilities. This enables the usage of existing techniques from the literature. Finally, we generalize the symbolic backward and digital clock semantics of PTA to the setting with parametric probabilities and constraints.
An assurance case has become an integral component for the certification of safety-critical systems. While manually defining assurance case patterns can be not avoided, system-specific instantiations of assurance case patterns are both costly and time-consuming. It becomes especially complex to maintain an assurance case for a system when the requirements of the System-Under-Assurance change, or an assurance claim becomes invalid due to, e.g., degradation of a systems component, as common when deploying learning-enabled components. In this paper, we report on our preliminary experience leveraging the tool integration framework Evidential Tool Bus (ETB) for the construction and continuous maintenance of an assurance case from a predefined assurance case pattern. Specifically, we demonstrate the assurance process on an industrial Automated Valet Parking system from the automotive domain. We present the formalization of the provided assurance case pattern in the ETB processable logical specification language of workflows. Our findings show that ETB is able to create and maintain evidence required for the construction of an assurance case.
In the rapidly evolving landscape of Large Language Models (LLMs), ensuring robust safety measures is paramount. To meet this crucial need, we propose \emph{SALAD-Bench}, a safety benchmark specifically designed for evaluating LLMs, attack, and defense methods. Distinguished by its breadth, SALAD-Bench transcends conventional benchmarks through its large scale, rich diversity, intricate taxonomy spanning three levels, and versatile functionalities.SALAD-Bench is crafted with a meticulous array of questions, from standard queries to complex ones enriched with attack, defense modifications and multiple-choice. To effectively manage the inherent complexity, we introduce an innovative evaluators: the LLM-based MD-Judge for QA pairs with a particular focus on attack-enhanced queries, ensuring a seamless, and reliable evaluation. Above components extend SALAD-Bench from standard LLM safety evaluation to both LLM attack and defense methods evaluation, ensuring the joint-purpose utility. Our extensive experiments shed light on the resilience of LLMs against emerging threats and the efficacy of contemporary defense tactics. Data and evaluator are released under //github.com/OpenSafetyLab/SALAD-BENCH.
Cooperative perception offers several benefits for enhancing the capabilities of autonomous vehicles and improving road safety. Using roadside sensors in addition to onboard sensors increases reliability and extends the sensor range. External sensors offer higher situational awareness for automated vehicles and prevent occlusions. We propose CoopDet3D, a cooperative multi-modal fusion model, and TUMTraf-V2X, a perception dataset, for the cooperative 3D object detection and tracking task. Our dataset contains 2,000 labeled point clouds and 5,000 labeled images from five roadside and four onboard sensors. It includes 30k 3D boxes with track IDs and precise GPS and IMU data. We labeled eight categories and covered occlusion scenarios with challenging driving maneuvers, like traffic violations, near-miss events, overtaking, and U-turns. Through multiple experiments, we show that our CoopDet3D camera-LiDAR fusion model achieves an increase of +14.36 3D mAP compared to a vehicle camera-LiDAR fusion model. Finally, we make our dataset, model, labeling tool, and dev-kit publicly available on our website: //tum-traffic-dataset.github.io/tumtraf-v2x.
RGBT multispectral pedestrian detection has emerged as a promising solution for safety-critical applications that require day/night operations. However, the modality bias problem remains unsolved as multispectral pedestrian detectors learn the statistical bias in datasets. Specifically, datasets in multispectral pedestrian detection mainly distribute between ROTO (day) and RXTO (night) data; the majority of the pedestrian labels statistically co-occur with their thermal features. As a result, multispectral pedestrian detectors show poor generalization ability on examples beyond this statistical correlation, such as ROTX data. To address this problem, we propose a novel Causal Mode Multiplexer (CMM) framework that effectively learns the causalities between multispectral inputs and predictions. Moreover, we construct a new dataset (ROTX-MP) to evaluate modality bias in multispectral pedestrian detection. ROTX-MP mainly includes ROTX examples not presented in previous datasets. Extensive experiments demonstrate that our proposed CMM framework generalizes well on existing datasets (KAIST, CVC-14, FLIR) and the new ROTX-MP. We will release our new dataset to the public for future research.
Autonomous vehicles have been actively investigated over the past few decades. Several recent works show the potential of autonomous driving transportation services in urban environments with impressive experimental results. However, these works note that autonomous vehicles are still occasionally inferior to expert drivers in complex scenarios. Furthermore, they do not focus on the possibilities of autonomous driving transportation services in other areas beyond urban environments. This paper presents the research results and lessons learned from autonomous driving transportation services in airfield, crowded indoor, and urban environments. We discuss how we address several unique challenges in these diverse environments. We also offer an overview of remaining challenges that have not received much attention but must be addressed. This paper aims to share our unique experience to support researchers who are interested in realizing the potential of autonomous vehicles in various real-world environments.
Unmanned aerial vehicles are becoming common and have many productive uses. However, their increased prevalence raises safety concerns -- how can we protect restricted airspace? Knowing the type of unmanned aerial vehicle can go a long way in determining any potential risks it carries. For instance, fixed-wing craft can carry more weight over longer distances, thus potentially posing a more significant threat. This paper presents a machine learning model for classifying unmanned aerial vehicles as quadrotor, hexarotor, or fixed-wing. Our approach effectively applies a Long-Short Term Memory (LSTM) neural network for the purpose of time series classification. We performed experiments to test the effects of changing the timestamp sampling method and addressing the imbalance in the class distribution. Through these experiments, we identified the top-performing sampling and class imbalance fixing methods. Averaging the macro f-scores across 10 folds of data, we found that the majority quadrotor class was predicted well (98.16%), and, despite an extreme class imbalance, the model could also predicted a majority of fixed-wing flights correctly (73.15%). Hexarotor instances were often misclassified as quadrotors due to the similarity of multirotors in general (42.15%). However, results remained relatively stable across certain methods, which prompted us to analyze and report on their tradeoffs. The supplemental material for this paper, including the code and data for running all the experiments and generating the results tables, is available at //osf.io/mnsgk/.
As highly automated vehicles reach higher deployment rates, they find themselves in increasingly dangerous situations. Knowing that the consequence of a crash is significant for the health of occupants, bystanders, and properties, as well as to the viability of autonomy and adjacent businesses, we must search for more efficacious ways to comprehensively and reliably train autonomous vehicles to better navigate the complex scenarios with which they struggle. We therefore introduce a taxonomy of potentially adversarial elements that may contribute to poor performance or system failures as a means of identifying and elucidating lesser-seen risks. This taxonomy may be used to characterize failures of automation, as well as to support simulation and real-world training efforts by providing a more comprehensive classification system for events resulting in disengagement, collision, or other negative consequences. This taxonomy is created from and tested against real collision events to ensure comprehensive coverage with minimal class overlap and few omissions. It is intended to be used both for the identification of harm-contributing adversarial events and in the generation thereof (to create extreme edge- and corner-case scenarios) in training procedures.