Code comments are important artifacts in software systems and play a paramount role in many software engineering (SE) tasks related to maintenance and program comprehension. However, while it is widely accepted that high quality matters in code comments just as it matters in source code, assessing comment quality in practice is still an open problem. First and foremost, there is no unique definition of quality when it comes to evaluating code comments. The few existing studies on this topic rather focus on specific attributes of quality that can be easily quantified and measured. Existing techniques and corresponding tools may also focus on comments bound to a specific programming language, and may only deal with comments with specific scopes and clear goals (e.g., Javadoc comments at the method level, or in-body comments describing TODOs to be addressed). In this paper, we present a Systematic Literature Review (SLR) of the last decade of research in SE to answer the following research questions: (i) What types of comments do researchers focus on when assessing comment quality? (ii) What quality attributes (QAs) do they consider? (iii) Which tools and techniques do they use to assess comment quality?, and (iv) How do they evaluate their studies on comment quality assessment in general? Our evaluation, based on the analysis of 2353 papers and the actual review of 47 relevant ones, shows that (i) most studies and techniques focus on comments in Java code, thus may not be generalizable to other languages, and (ii) the analyzed studies focus on four main QAs of a total of 21 QAs identified in the literature, with a clear predominance of checking consistency between comments and the code. We observe that researchers rely on manual assessment and specific heuristics rather than the automated assessment of the comment quality attributes.
Ensuring safety of the products offered to the customers is of paramount importance to any e- commerce platform. Despite stringent quality and safety checking of products listed on these platforms, occasionally customers might receive a product that can pose a safety issue arising out of its use. In this paper, we present an innovative mechanism of how a large scale multinational e-commerce platform, Zalando, uses Natural Language Processing techniques to assist timely investigation of the potentially unsafe products mined directly from customer written claims in unstructured plain text. We systematically describe the types of safety issues that concern Zalando customers. We demonstrate how we map this core business problem into a supervised text classification problem with highly imbalanced, noisy, multilingual data in a AI-in-the-loop setup with a focus on Key Performance Indicator (KPI) driven evaluation. Finally, we present detailed ablation studies to show a comprehensive comparison between different classification techniques. We conclude the work with how this NLP model was deployed.
Automotive user interfaces constantly change due to increasing automation, novel features, additional applications, and user demands. While in-vehicle interaction can utilize numerous promising modalities, no existing overview includes an extensive set of human sensors and actuators and interaction locations throughout the vehicle interior. We conducted a systematic literature review of 327 publications leading to a design space for in-vehicle interaction that outlines existing and lack of work regarding input and output modalities, locations, and multimodal interaction. To investigate user acceptance of possible modalities and locations inferred from existing work and gaps unveiled in our design space, we conducted an online study (N=48). The study revealed users' general acceptance of novel modalities (e.g., brain or thermal activity) and interaction with locations other than the front (e.g., seat or table). Our work helps practitioners evaluate key design decisions, exploit trends, and explore new areas in the domain of in-vehicle interaction.
Context: Developing software-intensive products or services usually involves a plethora of software artefacts. Assets are artefacts intended to be used more than once and have value for organisations; examples include test cases, code, requirements, and documentation. During the development process, assets might degrade, affecting the effectiveness and efficiency of the development process. Therefore, assets are an investment that requires continuous management. Identifying assets is the first step for their effective management. However, there is a lack of awareness of what assets and types of assets are common in software-developing organisations. Most types of assets are understudied, and their state of quality and how they degrade over time have not been well-understood. Method: We perform a systematic literature review and a field study at five companies to study and identify assets to fill the gap in research. The results were analysed qualitatively and summarised in a taxonomy. Results: We create the first comprehensive, structured, yet extendable taxonomy of assets, containing 57 types of assets. Conclusions: The taxonomy serves as a foundation for identifying assets that are relevant for an organisation and enables the study of asset management and asset degradation concepts.
There is a growing interest in understanding the energy and environmental footprint of digital currencies, specifically in cryptocurrencies such as Bitcoin and Ethereum. These cryptocurrencies are operated by a geographically distributed network of computing nodes, making it hard to accurately estimate their energy consumption. Existing studies, both in academia and industry, attempt to model the cryptocurrencies energy consumption often based on a number of assumptions for instance about the hardware in use or geographic distribution of the computing nodes. A number of these studies has already been widely criticized for their design choices and subsequent over or under-estimation of the energy use. In this study, we evaluate the reliability of prior models and estimates by leveraging existing scientific literature from fields cognizant of blockchain such as social energy sciences and information systems. We first design a quality assessment framework based on existing research, we then conduct a systematic literature review examining scientific and non-academic literature demonstrating common issues and potential avenues of addressing these issues. Our goal with this article is to to advance the field by promoting scientific rigor in studies focusing on Blockchain's energy footprint. To that end, we provide a novel set of codes of conduct for the five most widely used research methodologies: quantitative energy modeling, literature reviews, data analysis \& statistics, case studies, and experiments. We envision that these codes of conduct would assist in standardizing the design and assessment of studies focusing on blockchain-based systems' energy and environmental footprint.
In practically every industry today, artificial intelligence is one of the most effective ways for machines to assist humans. Since its inception, a large number of researchers throughout the globe have been pioneering the application of artificial intelligence in medicine. Although artificial intelligence may seem to be a 21st-century concept, Alan Turing pioneered the first foundation concept in the 1940s. Artificial intelligence in medicine has a huge variety of applications that researchers are continually exploring. The tremendous increase in computer and human resources has hastened progress in the 21st century, and it will continue to do so for many years to come. This review of the literature will highlight the emerging field of artificial intelligence in medicine and its current level of development.
Visual recognition is currently one of the most important and active research areas in computer vision, pattern recognition, and even the general field of artificial intelligence. It has great fundamental importance and strong industrial needs. Deep neural networks (DNNs) have largely boosted their performances on many concrete tasks, with the help of large amounts of training data and new powerful computation resources. Though recognition accuracy is usually the first concern for new progresses, efficiency is actually rather important and sometimes critical for both academic research and industrial applications. Moreover, insightful views on the opportunities and challenges of efficiency are also highly required for the entire community. While general surveys on the efficiency issue of DNNs have been done from various perspectives, as far as we are aware, scarcely any of them focused on visual recognition systematically, and thus it is unclear which progresses are applicable to it and what else should be concerned. In this paper, we present the review of the recent advances with our suggestions on the new possible directions towards improving the efficiency of DNN-related visual recognition approaches. We investigate not only from the model but also the data point of view (which is not the case in existing surveys), and focus on three most studied data types (images, videos and points). This paper attempts to provide a systematic summary via a comprehensive survey which can serve as a valuable reference and inspire both researchers and practitioners who work on visual recognition problems.
Deep Learning has implemented a wide range of applications and has become increasingly popular in recent years. The goal of multimodal deep learning is to create models that can process and link information using various modalities. Despite the extensive development made for unimodal learning, it still cannot cover all the aspects of human learning. Multimodal learning helps to understand and analyze better when various senses are engaged in the processing of information. This paper focuses on multiple types of modalities, i.e., image, video, text, audio, body gestures, facial expressions, and physiological signals. Detailed analysis of past and current baseline approaches and an in-depth study of recent advancements in multimodal deep learning applications has been provided. A fine-grained taxonomy of various multimodal deep learning applications is proposed, elaborating on different applications in more depth. Architectures and datasets used in these applications are also discussed, along with their evaluation metrics. Last, main issues are highlighted separately for each domain along with their possible future research directions.
Machine learning plays a role in many deployed decision systems, often in ways that are difficult or impossible to understand by human stakeholders. Explaining, in a human-understandable way, the relationship between the input and output of machine learning models is essential to the development of trustworthy machine-learning-based systems. A burgeoning body of research seeks to define the goals and methods of explainability in machine learning. In this paper, we seek to review and categorize research on counterfactual explanations, a specific class of explanation that provides a link between what could have happened had input to a model been changed in a particular way. Modern approaches to counterfactual explainability in machine learning draw connections to the established legal doctrine in many countries, making them appealing to fielded systems in high-impact areas such as finance and healthcare. Thus, we design a rubric with desirable properties of counterfactual explanation algorithms and comprehensively evaluate all currently-proposed algorithms against that rubric. Our rubric provides easy comparison and comprehension of the advantages and disadvantages of different approaches and serves as an introduction to major research themes in this field. We also identify gaps and discuss promising research directions in the space of counterfactual explainability.
Deep learning has penetrated all aspects of our lives and brought us great convenience. However, the process of building a high-quality deep learning system for a specific task is not only time-consuming but also requires lots of resources and relies on human expertise, which hinders the development of deep learning in both industry and academia. To alleviate this problem, a growing number of research projects focus on automated machine learning (AutoML). In this paper, we provide a comprehensive and up-to-date study on the state-of-the-art AutoML. First, we introduce the AutoML techniques in details according to the machine learning pipeline. Then we summarize existing Neural Architecture Search (NAS) research, which is one of the most popular topics in AutoML. We also compare the models generated by NAS algorithms with those human-designed models. Finally, we present several open problems for future research.
Time Series Classification (TSC) is an important and challenging problem in data mining. With the increase of time series data availability, hundreds of TSC algorithms have been proposed. Among these methods, only a few have considered Deep Neural Networks (DNNs) to perform this task. This is surprising as deep learning has seen very successful applications in the last years. DNNs have indeed revolutionized the field of computer vision especially with the advent of novel deeper architectures such as Residual and Convolutional Neural Networks. Apart from images, sequential data such as text and audio can also be processed with DNNs to reach state-of-the-art performance for document classification and speech recognition. In this article, we study the current state-of-the-art performance of deep learning algorithms for TSC by presenting an empirical study of the most recent DNN architectures for TSC. We give an overview of the most successful deep learning applications in various time series domains under a unified taxonomy of DNNs for TSC. We also provide an open source deep learning framework to the TSC community where we implemented each of the compared approaches and evaluated them on a univariate TSC benchmark (the UCR/UEA archive) and 12 multivariate time series datasets. By training 8,730 deep learning models on 97 time series datasets, we propose the most exhaustive study of DNNs for TSC to date.