Software Vulnerability Prediction (SVP) is a data-driven technique for software quality assurance that has recently gained considerable attention in the Software Engineering research community. However, the difficulty of preparing Software Vulnerability (SV) related data remains the main barrier to industrial adoption. Despite this problem, there have been no systematic efforts to analyse the existing SV data preparation techniques and challenges. Without such insights, we are unable to overcome the challenges and advance this research domain. Hence, we conduct a Systematic Literature Review (SLR) of SVP research to synthesize and understand the data considerations, challenges and solutions that SVP researchers provide. From our set of primary studies, we identify the main practices for each data preparation step. We then present a taxonomy of 16 key data challenges relating to six themes, which we further map to six categories of solutions. However, solutions are far from complete, and there are several ill-considered issues. We also provide recommendations for future areas of SV data research. Our findings help illuminate the key SV data practices and considerations for SVP researchers and practitioners, as well as inform the validity of the current SVP approaches.
Detecting, predicting, and alleviating traffic congestion all aim to improve the level of service of the transportation network. With increasing access to larger datasets of higher resolution, the relevance of deep learning for such tasks is increasing. Several comprehensive survey papers in recent years have summarised the deep learning applications in the transportation domain. However, the system dynamics of the transportation network vary greatly between the non-congested state and the congested state, which calls for a clear understanding of the challenges specific to congestion prediction. In this survey, we present the current state of deep learning applications in the tasks related to detection, prediction, and alleviation of congestion. Recurring and non-recurring congestion are discussed separately. Our survey uncovers inherent challenges and gaps in the current state of research. Finally, we suggest future research directions that address the identified challenges.
Literature reviews have long played a fundamental role in synthesizing the current state of a research field. However, in recent years, certain fields have evolved at such a rapid rate that literature reviews quickly lose their relevance as new work is published that renders them outdated. We should therefore rethink how to structure and publish such literature reviews with their highly valuable synthesized content. Here, we aim to determine if existing Linked Data technologies can be harnessed to prolong the relevance of literature reviews and whether researchers are comfortable with working with such a solution. We present here our approach of ``living literature reviews'' where the core information is represented as Linked Data which can be amended with new findings after the publication of the literature review. We present a prototype implementation, which we use for a case study where we expose potential users to a concrete literature review modeled with our approach. We observe that our model is technically feasible and is received well by researchers, with our ``living'' versions scoring higher than their traditional counterparts in our user study. In conclusion, we find that there are strong benefits to using a Linked Data solution to extend the effective lifetime of a literature review.
The replication crisis is real, and awareness of its existence is growing across disciplines. We argue that research in human-computer interaction (HCI), and especially virtual reality (VR), is vulnerable to similar challenges due to many shared methodologies, theories, and incentive structures. For this reason, in this work, we transfer established solutions from other fields to address the lack of replicability and reproducibility in HCI and VR. We focus on reducing errors resulting from the so-called human factor and adapt established solutions to the specific needs of VR research. In addition, we present a toolkit to support the setup, execution, and evaluation of VR research. Some of its features aim to reduce human errors and thus improve replicability and reproducibility. Finally, the identified opportunities are applied to a typical scientific process in VR.
Predictive monitoring of business processes is concerned with making predictions about ongoing cases of a business process. Lately, the popularity of deep learning techniques has fostered an ever-growing set of predictive monitoring approaches based on these techniques. However, the high disparity of process logs and experimental setups used to evaluate these approaches makes it especially difficult to make a fair comparison. Furthermore, it also complicates the selection of the most suitable approach to solve a specific problem. In this paper, we provide a systematic literature review of approaches that use deep learning to tackle predictive monitoring tasks. In addition, we perform an exhaustive experimental evaluation of 10 different approaches over 12 publicly available process logs.
Protecting the privacy of cyber-physical systems (CPS) data during sharing, aggregating, and publishing is a challenging problem. Several privacy protection mechanisms have been developed in the literature to protect sensitive data from adversarial analysis and eliminate the risk of re-identifying the original properties of shared data. However, most of the existing solutions have drawbacks, such as (i) lack of a proper vulnerability characterization model to accurately identify where privacy is needed, (ii) ignoring data providers' privacy preferences, (iii) using uniform privacy protection, which may create inadequate privacy for some providers while overprotecting others, and (iv) lack of a comprehensive privacy quantification model assuring data privacy-preservation. To address these issues, we propose a personalized privacy preference framework by characterizing and quantifying the CPS vulnerabilities as well as ensuring privacy. First, we introduce a standard vulnerability profiling library (SVPL) by arranging the nodes of an energy-CPS from most to least vulnerable based on their privacy loss. Based on this model, we present our personalized privacy framework (PDP) in which Laplace noise is added based on the individual node's selected privacy preferences. Finally, combining these two proposed methods, we demonstrate that our privacy characterization and quantification model can attain better privacy preservation by eliminating the trade-off between privacy, utility, and risk of losing information.
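The idea of adding Laplace noise according to each node's own privacy preference can be illustrated with a minimal sketch. The abstract does not specify how a preference is mapped to a noise level, so this example assumes the standard Laplace mechanism from differential privacy, where each node picks a privacy budget epsilon and the noise scale is sensitivity / epsilon. The function names, the node identifiers, and the default sensitivity of 1.0 are hypothetical and chosen only for illustration; this is not the authors' PDP implementation.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution.

    The difference of two independent Exp(1) draws follows Laplace(0, 1),
    so multiplying by `scale` gives Laplace(0, scale).
    """
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def add_personalized_noise(readings, epsilons, sensitivity=1.0, seed=None):
    """Perturb each node's reading with noise sized by that node's epsilon.

    readings:    dict mapping node id -> numeric value to protect
    epsilons:    dict mapping node id -> privacy budget chosen by that node
                 (assumed convention: smaller epsilon means more noise)
    sensitivity: assumed bound on how much one record can change a reading
    """
    rng = random.Random(seed)
    noisy = {}
    for node, value in readings.items():
        eps = epsilons[node]
        if eps <= 0:
            raise ValueError(f"epsilon for node {node!r} must be positive")
        noisy[node] = value + laplace_noise(sensitivity / eps, rng)
    return noisy


if __name__ == "__main__":
    # Hypothetical energy-CPS nodes with different privacy preferences.
    readings = {"meter_a": 42.0, "meter_b": 42.0}
    epsilons = {"meter_a": 0.1, "meter_b": 5.0}  # meter_a asks for more privacy
    print(add_personalized_noise(readings, epsilons, seed=1))
```

Under these assumptions, a node that selects a small epsilon receives noise with a large scale, while a node that selects a large epsilon keeps its reading close to the true value, which is the contrast with uniform protection that the abstract draws.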
In light of the emergence of deep reinforcement learning (DRL) in recommender systems research and several fruitful results in recent years, this survey aims to provide a timely and comprehensive overview of the recent trends of deep reinforcement learning in recommender systems. We start with the motivation of applying DRL in recommender systems. Then, we provide a taxonomy of current DRL-based recommender systems and a summary of existing methods. We discuss emerging topics and open issues, and provide our perspective on advancing the domain. This survey serves as introductory material for readers from academia and industry into the topic and identifies notable opportunities for further research.
Deep Learning (DL) is the most widely used tool in the contemporary field of computer vision. Its ability to accurately solve complex problems is employed in vision research to learn deep neural models for a variety of tasks, including security-critical applications. However, it is now known that DL is vulnerable to adversarial attacks that can manipulate its predictions by introducing visually imperceptible perturbations in images and videos. Since the discovery of this phenomenon in 2013~[1], it has attracted significant attention from researchers in multiple sub-fields of machine intelligence. In [2], we reviewed the contributions made by the computer vision community in adversarial attacks on deep learning (and their defenses) until the advent of 2018. Many of those contributions have inspired new directions in this area, which has matured significantly since the first-generation methods. Hence, as a legacy sequel of [2], this literature review focuses on the advances in this area since 2018. To ensure authenticity, we mainly consider peer-reviewed contributions published in the prestigious sources of computer vision and machine learning research. Besides a comprehensive literature review, the article also provides concise definitions of technical terms for non-experts in this domain. Finally, this article discusses challenges and the future outlook of this direction based on the literature reviewed herein and [2].
Conversational Machine Comprehension (CMC) is a research track in conversational AI which expects the machine to understand an open-domain text and thereafter engage in a multi-turn conversation to answer questions related to the text. While most of the research in Machine Reading Comprehension (MRC) revolves around single-turn question answering, multi-turn CMC has recently gained prominence, thanks to the advancement in natural language understanding via neural language models like BERT and the introduction of large-scale conversational datasets like CoQA and QuAC. The rise in interest has, however, led to a flurry of concurrent publications, each with a different yet structurally similar modeling approach and an inconsistent view of the surrounding literature. With the volume of model submissions to conversational datasets increasing every year, there exists a need to consolidate the scattered knowledge in this domain to streamline future research. This literature review, therefore, is a first-of-its-kind attempt at providing a holistic overview of CMC, with an emphasis on the common trends across recently published models, specifically in their approach to tackling conversational history. It focuses on synthesizing a generic framework for CMC models, rather than describing the models individually. The review is intended to serve as a compendium for future researchers in this domain.
The concept of the smart grid has been introduced as a new vision of the conventional power grid that efficiently integrates green and renewable energy technologies. Accordingly, the Internet-connected smart grid, also called the energy Internet, is emerging as an innovative approach to ensuring energy access from anywhere at any time. The ultimate goal of these developments is to build a sustainable society. However, integrating and coordinating a large and growing number of connections can be challenging for the traditional centralized grid system. Consequently, the smart grid is undergoing a transformation from its centralized form to a decentralized topology. Meanwhile, blockchain has several features that make it a promising technology for the smart grid paradigm. In this paper, we aim to provide a comprehensive survey of blockchain applications in the smart grid. To that end, we identify the significant security challenges of smart grid scenarios that can be addressed by blockchain. Then, we present a number of recent blockchain-based research works from the literature that address security issues in the smart grid. We also summarize several related practical projects, trials, and products that have emerged recently. Finally, we discuss essential research challenges and future directions of applying blockchain to smart grid security issues.
Deep neural networks (DNNs) have achieved unprecedented success in numerous machine learning tasks in various domains. However, the existence of adversarial examples has raised concerns about applying deep learning to safety-critical applications. As a result, we have witnessed increasing interest in studying attack and defense mechanisms for DNN models on different data types, such as images, graphs and text. Thus, it is necessary to provide a systematic and comprehensive overview of the main threats of attacks and the success of corresponding countermeasures. In this survey, we review the state-of-the-art algorithms for generating adversarial examples and the countermeasures against adversarial examples, for the three popular data types, i.e., images, graphs and text.