Runtime verification or runtime monitoring equips safety-critical cyber-physical systems to augment design assurance measures and ensure operational safety and security. Cyber-physical systems have interaction failures, attack surfaces, and attack vectors resulting in unanticipated hazards and loss scenarios. These interaction failures pose challenges to runtime verification regarding monitoring specifications and monitoring placements for in-time detection of hazards. We develop a well-formed workflow model that connects system theoretic process analysis, commonly referred to as STPA, hazard causation information to lower-level runtime monitoring to detect hazards at the operational phase. Specifically, our model follows the DepDevOps paradigm to provide evidence and insights to runtime monitoring on what to monitor, where to monitor, and the monitoring context. We demonstrate and evaluate the value of multilevel monitors by injecting hazards on an autonomous emergency braking system model.
Context: A tertiary study can be performed to identify related reviews on a topic of interest. However, the elaboration of an appropriate and effective search string to detect secondary studies is challenging for Software Engineering (SE) researchers. Objective: The main goal of this study is to propose a suitable search string to detect secondary studies in SE, addressing issues such as the quantity of applied terms, relevance, recall and precision. Method: We analyzed seven tertiary studies under two perspectives: (1) structure -- strings' terms to detect secondary studies; and (2) field: where searching -- titles alone or abstracts alone or titles and abstracts together, among others. We validate our string by performing a two-step validation process. Firstly, we evaluated the capability to retrieve secondary studies over a set of 1537 secondary studies included in 24 tertiary studies in SE. Secondly, we evaluated the general capacity of retrieving secondary studies over an automated search using the Scopus digital library. Results: Our string was capable to retrieve an optimum value of over 90\% of the included secondary studies (recall) with a high general precision of almost 60\%. Conclusion: The suitable search string for finding secondary studies in SE contains the terms "systematic review", "literature review", "systematic mapping", "mapping study" and "systematic map".
Chest X-ray (CXR) is the most typical radiological exam for diagnosis of various diseases. Due to the expensive and time-consuming annotations, detecting anomalies in CXRs in an unsupervised fashion is very promising. However, almost all of the existing methods consider anomaly detection as a One-Class Classification (OCC) problem. They model the distribution of only known normal images during training and identify the samples not conforming to normal profile as anomalies in the testing phase. A large number of unlabeled images containing anomalies are thus ignored in the training phase, although they are easy to obtain in clinical practice. In this paper, we propose a novel strategy, Dual-distribution Discrepancy for Anomaly Detection (DDAD), utilizing both known normal images and unlabeled images. The proposed method consists of two modules, denoted as A and B. During training, module A takes both known normal and unlabeled images as inputs, capturing anomalous features from unlabeled images in some way, while module B models the distribution of only known normal images. Subsequently, the inter-discrepancy between modules A and B, and intra-discrepancy inside module B are designed as anomaly scores to indicate anomalies. Experiments on three CXR datasets demonstrate that the proposed DDAD achieves consistent, significant gains and outperforms state-of-the-art methods. Code is available at //github.com/caiyu6666/DDAD.
The globalization of the electronics supply chain requires effective methods to thwart reverse engineering and IP theft. Logic locking is a promising solution, but there are many open concerns. First, even when applied at a higher level of abstraction, locking may result in significant overhead without improving the security metric. Second, optimizing a security metric is application-dependent and designers must evaluate and compare alternative solutions. We propose a meta-framework to optimize the use of behavioral locking during the high-level synthesis (HLS) of IP cores. Our method operates on chip's specification (before HLS) and it is compatible with all HLS tools, complementing industrial EDA flows. Our meta-framework supports different strategies to explore the design space and to select points to be locked automatically. We evaluated our method on the optimization of differential entropy, achieving better results than random or topological locking: 1) we always identify a valid solution that optimizes the security metric, while topological and random locking can generate unfeasible solutions; 2) we minimize the number of bits used for locking up to more than 90% (requiring smaller tamper-proof memories); 3) we make better use of hardware resources since we obtain similar overheads but with higher security metric.
Modern vehicles rely on a fleet of electronic control units (ECUs) connected through controller area network (CAN) buses for critical vehicular control. However, with the expansion of advanced connectivity features in automobiles and the elevated risks of internal system exposure, the CAN bus is increasingly prone to intrusions and injection attacks. The ordinary injection attacks disrupt the typical timing properties of the CAN data stream, and the rule-based intrusion detection systems (IDS) can easily detect them. However, advanced attackers can inject false data to the time series sensory data (signal), while looking innocuous by the pattern/frequency of the CAN messages. Such attacks can bypass the rule-based IDS or any anomaly-based IDS built on binary payload data. To make the vehicles robust against such intelligent attacks, we propose CANShield, a signal-based intrusion detection framework for the CAN bus. CANShield consists of three modules: a data preprocessing module that handles the high-dimensional CAN data stream at the signal level and makes them suitable for a deep learning model; a data analyzer module consisting of multiple deep autoencoder (AE) networks, each analyzing the time-series data from a different temporal perspective; and finally an attack detection module that uses an ensemble method to make the final decision. Evaluation results on two high-fidelity signal-based CAN attack datasets show the high accuracy and responsiveness of CANShield in detecting wide-range of advanced intrusion attacks.
Weakly-Supervised Object Detection (WSOD) and Localization (WSOL), i.e., detecting multiple and single instances with bounding boxes in an image using image-level labels, are long-standing and challenging tasks in the CV community. With the success of deep neural networks in object detection, both WSOD and WSOL have received unprecedented attention. Hundreds of WSOD and WSOL methods and numerous techniques have been proposed in the deep learning era. To this end, in this paper, we consider WSOL is a sub-task of WSOD and provide a comprehensive survey of the recent achievements of WSOD. Specifically, we firstly describe the formulation and setting of the WSOD, including the background, challenges, basic framework. Meanwhile, we summarize and analyze all advanced techniques and training tricks for improving detection performance. Then, we introduce the widely-used datasets and evaluation metrics of WSOD. Lastly, we discuss the future directions of WSOD. We believe that these summaries can help pave a way for future research on WSOD and WSOL.
Applying artificial intelligence techniques in medical imaging is one of the most promising areas in medicine. However, most of the recent success in this area highly relies on large amounts of carefully annotated data, whereas annotating medical images is a costly process. In this paper, we propose a novel method, called FocalMix, which, to the best of our knowledge, is the first to leverage recent advances in semi-supervised learning (SSL) for 3D medical image detection. We conducted extensive experiments on two widely used datasets for lung nodule detection, LUNA16 and NLST. Results show that our proposed SSL methods can achieve a substantial improvement of up to 17.3% over state-of-the-art supervised learning approaches with 400 unlabeled CT scans.
It is a common paradigm in object detection frameworks to treat all samples equally and target at maximizing the performance on average. In this work, we revisit this paradigm through a careful study on how different samples contribute to the overall performance measured in terms of mAP. Our study suggests that the samples in each mini-batch are neither independent nor equally important, and therefore a better classifier on average does not necessarily mean higher mAP. Motivated by this study, we propose the notion of Prime Samples, those that play a key role in driving the detection performance. We further develop a simple yet effective sampling and learning strategy called PrIme Sample Attention (PISA) that directs the focus of the training process towards such samples. Our experiments demonstrate that it is often more effective to focus on prime samples than hard samples when training a detector. Particularly, On the MSCOCO dataset, PISA outperforms the random sampling baseline and hard mining schemes, e.g. OHEM and Focal Loss, consistently by more than 1% on both single-stage and two-stage detectors, with a strong backbone ResNeXt-101.
The prevalence of networked sensors and actuators in many real-world systems such as smart buildings, factories, power plants, and data centers generate substantial amounts of multivariate time series data for these systems. The rich sensor data can be continuously monitored for intrusion events through anomaly detection. However, conventional threshold-based anomaly detection methods are inadequate due to the dynamic complexities of these systems, while supervised machine learning methods are unable to exploit the large amounts of data due to the lack of labeled data. On the other hand, current unsupervised machine learning approaches have not fully exploited the spatial-temporal correlation and other dependencies amongst the multiple variables (sensors/actuators) in the system for detecting anomalies. In this work, we propose an unsupervised multivariate anomaly detection method based on Generative Adversarial Networks (GANs). Instead of treating each data stream independently, our proposed MAD-GAN framework considers the entire variable set concurrently to capture the latent interactions amongst the variables. We also fully exploit both the generator and discriminator produced by the GAN, using a novel anomaly score called DR-score to detect anomalies by discrimination and reconstruction. We have tested our proposed MAD-GAN using two recent datasets collected from real-world CPS: the Secure Water Treatment (SWaT) and the Water Distribution (WADI) datasets. Our experimental results showed that the proposed MAD-GAN is effective in reporting anomalies caused by various cyber-intrusions compared in these complex real-world systems.
Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc, and 2) the instance-level shift, such as object appearance, size, etc. We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.
Most of the internet today is composed of digital media that includes videos and images. With pixels becoming the currency in which most transactions happen on the internet, it is becoming increasingly important to have a way of browsing through this ocean of information with relative ease. YouTube has 400 hours of video uploaded every minute and many million images are browsed on Instagram, Facebook, etc. Inspired by recent advances in the field of deep learning and success that it has gained on various problems like image captioning and, machine translation , word2vec , skip thoughts, etc, we present DeepSeek a natural language processing based deep learning model that allows users to enter a description of the kind of images that they want to search, and in response the system retrieves all the images that semantically and contextually relate to the query. Two approaches are described in the following sections.