亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

As software becomes increasingly complex and prone to vulnerabilities, automated vulnerability detection is critically important, yet challenging. Given the significant successes of large language models (LLMs) in various tasks, there is growing anticipation of their efficacy in vulnerability detection. However, a quantitative understanding of their potential in vulnerability detection is still missing. To bridge this gap, we introduce a comprehensive vulnerability benchmark VulBench. This benchmark aggregates high-quality data from a wide range of CTF (Capture-the-Flag) challenges and real-world applications, with annotations for each vulnerable function detailing the vulnerability type and its root cause. Through our experiments encompassing 16 LLMs and 6 state-of-the-art (SOTA) deep learning-based models and static analyzers, we find that several LLMs outperform traditional deep learning approaches in vulnerability detection, revealing an untapped potential in LLMs. This work contributes to the understanding and utilization of LLMs for enhanced software security.

相關內容

大語言模型是基于海量文本數據訓練的深度學習模型。它不僅能夠生成自然語言文本,還能夠深入理解文本含義,處理各種自然語言任務,如文本摘要、問答、翻譯等。2023年,大語言模型及其在人工智能領域的應用已成為全球科技研究的熱點,其在規模上的增長尤為引人注目,參數量已從最初的十幾億躍升到如今的一萬億。參數量的提升使得模型能夠更加精細地捕捉人類語言微妙之處,更加深入地理解人類語言的復雜性。在過去的一年里,大語言模型在吸納新知識、分解復雜任務以及圖文對齊等多方面都有顯著提升。隨著技術的不斷成熟,它將不斷拓展其應用范圍,為人類提供更加智能化和個性化的服務,進一步改善人們的生活和生產方式。

Correctness properties are critical to conducting verification and validation on software systems, especially those cyberphysical systems whose functionality changes frequently due to software updates, changes in the operating environment, or newly learned behaviors. We detail a novel method to automatically construct expressive, executable correctness properties in the form of machine-learned correctness properties which can be used to ensure that a system's behavior is correct with respect to its design and operating requirements. We propose a method to bootstrap the creation of these correctness properties using a novel simulation-based generation of training and testing data using multiple extensions to the Cross Entropy algorithm for search-based optimization. Then, we apply this method to a software-in-the-loop evaluation of an autonomous vehicle to demonstrate that such models can assert about important properties of multi-agent cyberphysical systems. We demonstrate that this process brings the task of developing robust correctness properties from the realm of formal methods experts into the domain of system developers and engineers, and that machine-learned correctness properties are expressive enough to capture the correct behavior of cyberphysical systems in their complex environments. This advancement can provide evidence of dependability to system designers and users, enhancing trust in the deployment of autonomous vehicles and other intelligent transportation systems.

In mixed-initiative conversational search systems, clarifying questions are used to help users who struggle to express their intentions in a single query. These questions aim to uncover user's information needs and resolve query ambiguities. We hypothesize that in scenarios where multimodal information is pertinent, the clarification process can be improved by using non-textual information. Therefore, we propose to add images to clarifying questions and formulate the novel task of asking multimodal clarifying questions in open-domain, mixed-initiative conversational search systems. To facilitate research into this task, we collect a dataset named Melon that contains over 4k multimodal clarifying questions, enriched with over 14k images. We also propose a multimodal query clarification model named Marto and adopt a prompt-based, generative fine-tuning strategy to perform the training of different stages with different prompts. Several analyses are conducted to understand the importance of multimodal contents during the query clarification phase. Experimental results indicate that the addition of images leads to significant improvements of up to 90% in retrieval performance when selecting the relevant images. Extensive analyses are also performed to show the superiority of Marto compared with discriminative baselines in terms of effectiveness and efficiency.

Conventional distributed approaches to coverage control may suffer from lack of convergence and poor performance, due to the fact that agents have limited information, especially in non-convex discrete environments. To address this issue, we extend the approach of [Marden 2016] which demonstrates how a limited degree of inter-agent communication can be exploited to overcome such pitfalls in one-dimensional discrete environments. The focus of this paper is on extending such results to general dimensional settings. We show that the extension is convergent and keeps the approximation ratio of 2, meaning that any stable solution is guaranteed to have a performance within 50% of the optimal one. The experimental results exhibit that our algorithm outperforms several state-of-the-art algorithms, and also that the runtime is scalable.

In computer science education, test cases are an integral part of programming assignments since they can be used as assessment items to test students' programming knowledge and provide personalized feedback on student-written code. The goal of our work is to propose a fully automated approach for test case generation that can accurately measure student knowledge, which is important for two reasons. First, manually constructing test cases requires expert knowledge and is a labor-intensive process. Second, developing test cases for students, especially those who are novice programmers, is significantly different from those oriented toward professional-level software developers. Therefore, we need an automated process for test case generation to assess student knowledge and provide feedback. In this work, we propose a large language model-based approach to automatically generate test cases and show that they are good measures of student knowledge, using a publicly available dataset that contains student-written Java code. We also discuss future research directions centered on using test cases to help students.

Writing declarative models has numerous benefits, ranging from automated reasoning and correction of design-level properties before systems are built, to automated testing and debugging of their implementations after they are built. Alloy is a declarative modeling language that is well-suited for verifying system designs. A key strength of Alloy is its scenario-finding toolset, the Analyzer, which allows users to explore all valid scenarios that adhere to the model's constraints up to a user-provided scope. However, even with visualized scenarios, it is difficult to write correct Alloy models. To address this, a growing body of work explores different techniques for debugging Alloy models. In order to develop and evaluate these techniques in an effective manor, this paper presents an empirical study of over 97,000 models written by novice users trying to learn Alloy. We investigate how users write both correct and incorrect models in order to produce a comprehensive benchmark for future use as well as a series of observations to guide debugging and educational efforts for Alloy model development.

Language models (LMs) have already demonstrated remarkable abilities in understanding and generating both natural and formal language. Despite these advances, their integration with real-world environments such as large-scale knowledge bases (KBs) remains an underdeveloped area, affecting applications such as semantic parsing and indulging in "hallucinated" information. This paper is an experimental investigation aimed at uncovering the robustness challenges that LMs encounter when tasked with knowledge base question answering (KBQA). The investigation covers scenarios with inconsistent data distribution between training and inference, such as generalization to unseen domains, adaptation to various language variations, and transferability across different datasets. Our comprehensive experiments reveal that even when employed with our proposed data augmentation techniques, advanced small and large language models exhibit poor performance in various dimensions. While the LM is a promising technology, the robustness of the current form in dealing with complex environments is fragile and of limited practicality because of the data distribution issue. This calls for future research on data collection and LM learning paradims.

Soft robots have immense potential given their inherent safety and adaptability, but challenges in soft actuator forces and design constraints have limited scaling up soft robots to larger sizes. Electrothermal shape memory alloy (SMA) artificial muscles have the potential to create these large forces and high displacements, but consistently using these muscles under a well-defined model, in-situ in a soft robot, remains an open challenge. This article provides a system for maintaining the highest-possible consistent SMA forces, over long lifetimes, by combining a fatigue testing protocol with a supervisory control system for the muscles' internal temperature state. We propose a design of a soft limb with swap-able SMA muscles, and deploy the limb in a blocked-force test to quantify the relationship between the measured maximum force at different temperatures over different lifetimes. Then, by applying an invariance-based control system to maintain temperatures under our long-life limit, we demonstrate consistent high forces in a practical task over hundreds of cycles. The method we developed allows for practical implementation of SMAs in soft robots through characterizing and controlling their behavior in-situ, and provides a method to impose limits that maximize their consistent, repeatable behavior.

The key challenge of image manipulation detection is how to learn generalizable features that are sensitive to manipulations in novel data, whilst specific to prevent false alarms on authentic images. Current research emphasizes the sensitivity, with the specificity overlooked. In this paper we address both aspects by multi-view feature learning and multi-scale supervision. By exploiting noise distribution and boundary artifact surrounding tampered regions, the former aims to learn semantic-agnostic and thus more generalizable features. The latter allows us to learn from authentic images which are nontrivial to be taken into account by current semantic segmentation network based methods. Our thoughts are realized by a new network which we term MVSS-Net. Extensive experiments on five benchmark sets justify the viability of MVSS-Net for both pixel-level and image-level manipulation detection.

Aspect based sentiment analysis (ABSA) can provide more detailed information than general sentiment analysis, because it aims to predict the sentiment polarities of the given aspects or entities in text. We summarize previous approaches into two subtasks: aspect-category sentiment analysis (ACSA) and aspect-term sentiment analysis (ATSA). Most previous approaches employ long short-term memory and attention mechanisms to predict the sentiment polarity of the concerned targets, which are often complicated and need more training time. We propose a model based on convolutional neural networks and gating mechanisms, which is more accurate and efficient. First, the novel Gated Tanh-ReLU Units can selectively output the sentiment features according to the given aspect or entity. The architecture is much simpler than attention layer used in the existing models. Second, the computations of our model could be easily parallelized during training, because convolutional layers do not have time dependency as in LSTM layers, and gating units also work independently. The experiments on SemEval datasets demonstrate the efficiency and effectiveness of our models.

Object detection is considered as one of the most challenging problems in computer vision, since it requires correct prediction of both classes and locations of objects in images. In this study, we define a more difficult scenario, namely zero-shot object detection (ZSD) where no visual training data is available for some of the target object classes. We present a novel approach to tackle this ZSD problem, where a convex combination of embeddings are used in conjunction with a detection framework. For evaluation of ZSD methods, we propose a simple dataset constructed from Fashion-MNIST images and also a custom zero-shot split for the Pascal VOC detection challenge. The experimental results suggest that our method yields promising results for ZSD.

北京阿比特科技有限公司