东京热加勒比中文无码,甜味弥漫一区二区在线观看,无码人妻综合精品一区色欲AV

Software vulnerability detection is critical in software security because it identifies potential bugs in software systems, enabling immediate remediation and mitigation measures to be implemented before they may be exploited. Automatic vulnerability identification is important because it can evaluate large codebases more efficiently than manual code auditing. Many Machine Learning (ML) and Deep Learning (DL) based models for detecting vulnerabilities in source code have been presented in recent years. However, a survey that summarises, classifies, and analyses the application of ML/DL models for vulnerability detection is missing. It may be difficult to discover gaps in existing research and potential for future improvement without a comprehensive survey. This could result in essential areas of research being overlooked or under-represented, leading to a skewed understanding of the state of the art in vulnerability detection. This work address that gap by presenting a systematic survey to characterize various features of ML/DL-based source code level software vulnerability detection approaches via five primary research questions (RQs). Specifically, our RQ1 examines the trend of publications that leverage ML/DL for vulnerability detection, including the evolution of research and the distribution of publication venues. RQ2 describes vulnerability datasets used by existing ML/DL-based models, including their sources, types, and representations, as well as analyses of the embedding techniques used by these approaches. RQ3 explores the model architectures and design assumptions of ML/DL-based vulnerability detection approaches. RQ4 summarises the type and frequency of vulnerabilities that are covered by existing studies. Lastly, RQ5 presents a list of current challenges to be researched and an outline of a potential research roadmap that highlights crucial opportunities for future work.

相關內容

Learning

關注 12

CASES · 可理解性 · Analysis · 動量 · Engineering ·

2023 年 8 月 13 日

MASC: A Tool for Mutation-Based Evaluation of Static Crypto-API Misuse Detectors

Amit Seal Ami,Syed Yusuf Ahmed,Radowan Mahmud Redoy,Nathan Cooper,Kaushal Kafle,Kevin Moran,Denys Poshyvanyk,Adwait Nadkarni

from arxiv, To be published in Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering

While software engineers are optimistically adopting crypto-API misuse detectors (or crypto-detectors) in their software development cycles, this momentum must be accompanied by a rigorous understanding of crypto-detectors' effectiveness at finding crypto-API misuses in practice. This demo paper presents the technical details and usage scenarios of our tool, namely Mutation Analysis for evaluating Static Crypto-API misuse detectors (MASC). We developed $12$ generalizable, usage based mutation operators and three mutation scopes, namely Main Scope, Similarity Scope, and Exhaustive Scope, which can be used to expressively instantiate compilable variants of the crypto-API misuse cases. Using MASC, we evaluated nine major crypto-detectors, and discovered $19$ unique, undocumented flaws. We designed MASC to be configurable and user-friendly; a user can configure the parameters to change the nature of generated mutations. Furthermore, MASC comes with both Command Line Interface and Web-based front-end, making it practical for users of different levels of expertise.

Analysis · INFORMS · 可辨認的 · TOOLS · 代碼 ·

2023 年 8 月 11 日

A Uniform Representation of Classical and Quantum Source Code for Static Code Analysis

Maximilian Kaul,Alexander Küchler,Christian Banse

from arxiv, 2023 IEEE International Conference on Quantum Computing and Engineering (QCE)

The emergence of quantum computing raises the question of how to identify (security-relevant) programming errors during development. However, current static code analysis tools fail to model information specific to quantum computing. In this paper, we identify this information and propose to extend classical code analysis tools accordingly. Among such tools, we identify the Code Property Graph to be very well suited for this task as it can be easily extended with quantum computing specific information. For our proof of concept, we implemented a tool which includes information from the quantum world in the graph and demonstrate its ability to analyze source code written in Qiskit and OpenQASM. Our tool brings together the information from the classical and quantum world, enabling analysis across both domains. By combining all relevant information into a single detailed analysis, this powerful tool can facilitate tackling future quantum source code analysis challenges.

Automator · Learning · 自適應學習 · MoDELS · 泛函 ·

2023 年 8 月 10 日

Testing Updated Apps by Adapting Learned Models

Chanh-Duc Ngo,Fabrizio Pastore,Lionel Briand

Although App updates are frequent and software engineers would like to verify updated features only, automated testing techniques verify entire Apps and are thus wasting resources. We present Continuous Adaptation of Learned Models (CALM), an automated App testing approach that efficiently tests App updates by adapting App models learned when automatically testing previous App versions. CALM focuses on functional testing. Since functional correctness can be mainly verified through the visual inspection of App screens, CALM minimizes the number of App screens to be visualized by software testers while maximizing the percentage of updated methods and instructions exercised. Our empirical evaluation shows that CALM exercises a significantly higher proportion of updated methods and instructions than six state-of-the-art approaches, for the same maximum number of App screens to be visually inspected. Further, in common update scenarios, where only a small fraction of methods are updated, CALM is even quicker to outperform all competing approaches in a more significant way.

Processing（編程語言） · MoDELS · Engineering · ML · SPIRAL ·

2023 年 8 月 10 日

Application of Systems Engineering Process in Building ML-Enabled Systems

Jie JW Wu

Machine learning (ML) components are being added to more and more critical and impactful software systems, but the software development process of real-world production systems from prototyped ML models remains challenging with additional complexity and interdisciplinary collaboration challenges. This poses difficulties in using traditional software lifecycle models such as waterfall, spiral or agile model when building ML-enabled systems. By interviewing with practitioners from multiple companies, we investigated the application of using systems engineering process in ML-enabled systems. We developed a set of propositions and proposed V4ML process model for building products with ML components. We found that V4ML process model requires more efforts on documentation, system decomposition and V&V, but it addressed the interdisciplinary collaboration challenges and additional complexity introduced by ML components.

知識 (knowledge) · Performer · 圖 · 結點 · 過濾式方法 ·

2023 年 8 月 9 日

A Bipartite Graph is All We Need for Enhancing Emotional Reasoning with Commonsense Knowledge

Kailai Yang,Tianlin Zhang,Shaoxiong Ji,Sophia Ananiadou

from arxiv, Accepted by CIKM 2023 as a long paper

The context-aware emotional reasoning ability of AI systems, especially in conversations, is of vital importance in applications such as online opinion mining from social media and empathetic dialogue systems. Due to the implicit nature of conveying emotions in many scenarios, commonsense knowledge is widely utilized to enrich utterance semantics and enhance conversation modeling. However, most previous knowledge infusion methods perform empirical knowledge filtering and design highly customized architectures for knowledge interaction with the utterances, which can discard useful knowledge aspects and limit their generalizability to different knowledge sources. Based on these observations, we propose a Bipartite Heterogeneous Graph (BHG) method for enhancing emotional reasoning with commonsense knowledge. In BHG, the extracted context-aware utterance representations and knowledge representations are modeled as heterogeneous nodes. Two more knowledge aggregation node types are proposed to perform automatic knowledge filtering and interaction. BHG-based knowledge infusion can be directly generalized to multi-type and multi-grained knowledge sources. In addition, we propose a Multi-dimensional Heterogeneous Graph Transformer (MHGT) to perform graph reasoning, which can retain unchanged feature spaces and unequal dimensions for heterogeneous node types during inference to prevent unnecessary loss of information. Experiments show that BHG-based methods significantly outperform state-of-the-art knowledge infusion methods and show generalized knowledge infusion ability with higher efficiency. Further analysis proves that previous empirical knowledge filtering methods do not guarantee to provide the most useful knowledge information. Our code is available at: //github.com/SteveKGYang/BHG.

數據集 · Learning · 有向 · MoDELS · 泛化理論 ·

2023 年 8 月 9 日

DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection

Yizheng Chen,Zhoujie Ding,Lamya Alowain,Xinyun Chen,David Wagner

from arxiv, Published at RAID 2023

We propose and release a new vulnerable source code dataset. We curate the dataset by crawling security issue websites, extracting vulnerability-fixing commits and source codes from the corresponding projects. Our new dataset contains 18,945 vulnerable functions spanning 150 CWEs and 330,492 non-vulnerable functions extracted from 7,514 commits. Our dataset covers 295 more projects than all previous datasets combined. Combining our new dataset with previous datasets, we present an analysis of the challenges and promising research directions of using deep learning for detecting software vulnerabilities. We study 11 model architectures belonging to 4 families. Our results show that deep learning is still not ready for vulnerability detection, due to high false positive rate, low F1 score, and difficulty of detecting hard CWEs. In particular, we demonstrate an important generalization challenge for the deployment of deep learning-based models. We show that increasing the volume of training data may not further improve the performance of deep learning models for vulnerability detection, but might be useful to improve the generalization ability to unseen projects. We also identify hopeful future research directions. We demonstrate that large language models (LLMs) are a promising research direction for ML-based vulnerability detection, outperforming Graph Neural Networks (GNNs) with code-structure features in our experiments. Moreover, developing source code specific pre-training objectives is a promising research direction to improve the vulnerability detection performance.

Analysis · 代碼 · 可理解性 · Projection · 得分 ·

2023 年 8 月 9 日

A Quantitative Analysis of Open Source Software Code Quality: Insights from Metric Distributions

Siyuan Jin,Mianmian Zhang,Yekai Guo,Yuejiang He,Ziyuan Li,Bichao Chen,Bing Zhu,Yong Xia

Code quality is a crucial construct in open-source software (OSS) with three dimensions: maintainability, reliability, and functionality. To accurately measure them, we divide 20 distinct metrics into two types: 1) threshold-type metrics that influence code quality in a monotonic manner; 2) non-threshold-type metrics that lack a monotonic relationship to evaluate. We propose a distribution-based method to provide scores for metrics, which demonstrates great explainability on OSS adoption. Our empirical analysis includes more than 36,460 OSS projects and their raw metrics from SonarQube and CK. Our work contributes to the understanding of the multi-dimensional construct of code quality and its metric measurements.

Learning · 可辨認的 · bulk · Machine Learning · MoDELS ·

2023 年 8 月 8 日

Different Mechanisms of Machine Learning and Optimization Algorithms Utilized in Intrusion Detection Systems

Mohammad Aziz,Ali Saeed Alfoudi

Malicious software is an integral part of cybercrime defense. Due to the growing number of malicious attacks and their target sources, detecting and preventing the attack becomes more challenging due to the assault's changing behavior. The bulk of classic malware detection systems is based on statistics, analytic techniques, or machine learning. Virus signature methods are widely used to identify malware. The bulk of anti-malware systems categorizes malware using regular expressions and patterns. While antivirus software is less likely to update its databases to identify and block malware, file features must be updated to detect and prevent newly generated malware. Creating attack signatures requires practically all of a human being's work. The purpose of this study is to undertake a review of the current research on intrusion detection models and the datasets that support them. In this article, we discuss the state-of-the-art, focusing on the strategy that was devised and executed, the dataset that was utilized, the findings, and the assessment that was undertaken. Additionally, the surveyed articles undergo critical analysis and statements in order to give a thorough comparative review. Machine learning and deep learning methods, as well as new classification and feature selection methodologies, are studied and researched. Thus far, each technique has proved the capability of constructing very accurate intrusion detection models. The survey findings reveal that Clearly, the MultiTree and adaptive voting algorithms surpassed all other models in terms of persistency and performance, averaging 99.98 percent accuracy on average.

Next · Integration · 有向 · 控制器 · Continuity ·

2022 年 3 月 5 日

AI for Next Generation Computing: Emerging Trends and Future Directions

Sukhpal Singh Gill,Minxian Xu,Carlo Ottaviani,Panos Patros,Rami Bahsoon,Arash Shaghaghi,Muhammed Golec,Vlado Stankovski,Huaming Wu,Ajith Abraham,Manmeet Singh,Harshit Mehta,Soumya K. Ghosh,Thar Baker,Ajith Kumar Parlikad,Hanan Lutfiyya,Salil S. Kanhere,Rizos Sakellariou,Schahram Dustdar,Omer Rana,Ivona Brandic,Steve Uhlig

from arxiv, Accepted for Publication in Elsevier IoT Journal, 2022

Autonomic computing investigates how systems can achieve (user) specified control outcomes on their own, without the intervention of a human operator. Autonomic computing fundamentals have been substantially influenced by those of control theory for closed and open-loop systems. In practice, complex systems may exhibit a number of concurrent and inter-dependent control loops. Despite research into autonomic models for managing computer resources, ranging from individual resources (e.g., web servers) to a resource ensemble (e.g., multiple resources within a data center), research into integrating Artificial Intelligence (AI) and Machine Learning (ML) to improve resource autonomy and performance at scale continues to be a fundamental challenge. The integration of AI/ML to achieve such autonomic and self-management of systems can be achieved at different levels of granularity, from full to human-in-the-loop automation. In this article, leading academics, researchers, practitioners, engineers, and scientists in the fields of cloud computing, AI/ML, and quantum computing join to discuss current research and potential future directions for these fields. Further, we discuss challenges and opportunities for leveraging AI and ML in next generation computing for emerging computing paradigms, including cloud, fog, edge, serverless and quantum computing environments.

學成 · 表示學習 · MoDELS · CASES · contrastive ·

2021 年 6 月 3 日

Improving Event Causality Identification via Self-Supervised Representation Learning on External Causal Statement

Xinyu Zuo,Pengfei Cao,Yubo Chen,Kang Liu,Jun Zhao,Weihua Peng,Yuguang Chen

from arxiv, Accepted to Findings of ACL 2021

Current models for event causality identification (ECI) mainly adopt a supervised framework, which heavily rely on labeled data for training. Unfortunately, the scale of current annotated datasets is relatively limited, which cannot provide sufficient support for models to capture useful indicators from causal statements, especially for handing those new, unseen cases. To alleviate this problem, we propose a novel approach, shortly named CauSeRL, which leverages external causal statements for event causality identification. First of all, we design a self-supervised framework to learn context-specific causal patterns from external causal statements. Then, we adopt a contrastive transfer strategy to incorporate the learned context-specific causal patterns into the target ECI model. Experimental results show that our method significantly outperforms previous methods on EventStoryLine and Causal-TimeBank (+2.0 and +3.4 points on F1 value respectively).