
Unit testing is critical to the software development process, ensuring the correctness of basic programming units in a program (e.g., a method). Search-based software testing (SBST) is an automated approach to generating test cases. SBST generates test cases with genetic algorithms by specifying a coverage criterion (e.g., branch coverage). However, a good test suite must satisfy multiple properties, which cannot be captured by an individual coverage criterion. Therefore, the state-of-the-art approach combines multiple criteria to generate test cases. Since combining multiple coverage criteria introduces multiple optimization objectives, it lowers the test suites' coverage of certain criteria compared with using a single criterion. To cope with this problem, we propose a novel approach named \textbf{smart selection}. Based on the coverage correlations among criteria and the subsumption relationships among coverage goals, smart selection selects a subset of coverage goals to reduce the number of optimization objectives without discarding the properties captured by any criterion. We conduct experiments to evaluate smart selection on $400$ Java classes with three state-of-the-art genetic algorithms under a $2$-minute budget. On average, smart selection outperforms combining all goals on $65.1\%$ of the classes for which the two approaches differ significantly. We also conduct experiments to verify our assumptions about the relationships among coverage criteria. Furthermore, we experiment with larger budgets of $5$, $8$, and $10$ minutes, confirming the advantage of smart selection over combining all goals.
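
To make the goal-selection idea concrete, here is a minimal sketch in Python, assuming a hypothetical subsumes(a, b) predicate and simple string identifiers for coverage goals; it illustrates pruning subsumed goals before optimization and is not the authors' implementation.

```python
# Illustrative sketch (not the authors' implementation): prune coverage goals
# using an assumed subsumption relation before handing them to a genetic algorithm.

def smart_selection(goals, subsumes):
    """Return a reduced goal set: drop any goal already implied by a kept goal,
    so the optimizer has fewer objectives to balance.

    goals    -- iterable of hashable coverage-goal identifiers (hypothetical).
    subsumes -- subsumes(a, b) is True when covering goal a guarantees
                covering goal b (hypothetical predicate).
    """
    selected = []
    for g in goals:
        if any(subsumes(s, g) for s in selected):
            continue                                            # g is already implied by a kept goal
        selected = [s for s in selected if not subsumes(g, s)]  # drop goals that g now implies
        selected.append(g)
    return selected

# Toy example: branch coverage of a line subsumes line coverage of that line.
goals = ["line:42", "branch:42:true", "branch:42:false", "line:43"]
subsumes = lambda a, b: a.startswith("branch:") and b == "line:" + a.split(":")[1]
print(smart_selection(goals, subsumes))  # ['branch:42:true', 'branch:42:false', 'line:43']
```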

Related content

Model-based reasoning is becoming increasingly common in software engineering. The process of building and analyzing models helps stakeholders to understand the ramifications of their software decisions. But complex models can confuse and overwhelm stakeholders when these models have too many candidate solutions. We argue here that a technique based on partial orderings lets humans find acceptable solutions via a binary chop needing $O(\log(N))$ queries (or fewer). This paper checks the value of this approach via the iSNEAK partial ordering tool. Pre-experimentally, we were concerned that (a)~our automated methods might produce models that were unacceptable to humans; and that (b)~our human-in-the-loop methods might actually overlook significant optimizations. Hence, we checked the acceptability of the solutions found by iSNEAK via a human-in-the-loop double-blind evaluation study of 20 Brazilian programmers. We also checked whether iSNEAK misses significant optimizations (in a corpus of 16 SE models with up to 1000 attributes) by comparing it against two rival technologies: the genetic algorithms preferred by the interactive search-based SE community, and the sequential model optimizers developed by the SE configuration community~\citep{flash_vivek}. iSNEAK's solutions were found to be human acceptable (and those solutions took far less time to generate, with far fewer questions to any stakeholder). Significantly, our methods work well even for multi-objective models with competing goals (in this work we explore models with four to five goals). These results motivate more work on partial ordering for many-goal model-based problems.
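
To illustrate why roughly $O(\log(N))$ queries can suffice, the sketch below assumes hypothetical pick_splitting_attribute and ask_human helpers: each answered question discards about half of the remaining candidates. It is an illustration of the halving idea, not the iSNEAK tool itself.

```python
# Minimal sketch: prune N candidate solutions with about log2(N) yes/no questions
# (hypothetical helpers; not the iSNEAK implementation).

def halve_until_done(candidates, pick_splitting_attribute, ask_human):
    """Repeatedly ask about one attribute that splits the remaining candidates
    roughly in half, then keep only the half consistent with the answer."""
    questions = 0
    while len(candidates) > 1:
        attr = pick_splitting_attribute(candidates)  # most informative remaining attribute
        if attr is None:                             # nothing left to distinguish
            break
        questions += 1
        wanted = ask_human(attr)                     # "do you want <attr>?" -> True/False
        candidates = [c for c in candidates if c.get(attr, False) == wanted]
    return candidates, questions

# Toy usage: candidates are dicts of boolean attributes; the stakeholder wants "gui".
pool = [{"gui": True, "cloud": False}, {"gui": True, "cloud": True},
        {"gui": False, "cloud": True}, {"gui": False, "cloud": False}]
pick = lambda cs: next((a for a in cs[0] if len({c.get(a) for c in cs}) > 1), None)
print(halve_until_done(pool, pick, lambda a: a == "gui"))  # ([{'gui': True, 'cloud': False}], 2)
```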

Gaussian processes are used in many machine learning applications that rely on uncertainty quantification. Recently, computational tools for working with these models in geometric settings, such as when inputs lie on a Riemannian manifold, have been developed. This raises the question: can these intrinsic models be shown theoretically to lead to better performance, compared to simply embedding all relevant quantities into $\mathbb{R}^d$ and using the restriction of an ordinary Euclidean Gaussian process? To study this, we prove optimal contraction rates for intrinsic Mat\'ern Gaussian processes defined on compact Riemannian manifolds. We also prove analogous rates for extrinsic processes using trace and extension theorems between manifold and ambient Sobolev spaces: somewhat surprisingly, the rates obtained turn out to coincide with those of the intrinsic processes, provided that their smoothness parameters are matched appropriately. We illustrate these rates empirically on a number of examples, which, mirroring prior work, show that intrinsic processes can achieve better performance in practice. Therefore, our work shows that finer-grained analyses are needed to distinguish between different levels of data-efficiency of geometric Gaussian processes, particularly in settings which involve small data set sizes and non-asymptotic behavior.
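
For context on the smoothness-matching claim, recall the standard Euclidean Mat\'ern kernel (lengthscale $\kappa$, smoothness $\nu$, variance $\sigma^2$, with $K_\nu$ the modified Bessel function of the second kind) and the classical posterior contraction rate attained when the model smoothness is matched to a $\beta$-smooth ground truth on a $d$-dimensional domain; standard conventions are assumed here, and the manifold-specific constructions in the paper are more involved:
$$k_\nu(x, x') = \sigma^2 \, \frac{2^{1-\nu}}{\Gamma(\nu)} \left( \sqrt{2\nu}\,\frac{\lVert x - x' \rVert}{\kappa} \right)^{\nu} K_\nu\!\left( \sqrt{2\nu}\,\frac{\lVert x - x' \rVert}{\kappa} \right), \qquad \epsilon_n \asymp n^{-\beta/(2\beta + d)}.$$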

Approximate computing is a promising approach to reducing the power, delay, and area of hardware designs for many error-resilient applications such as machine learning (ML) and digital signal processing (DSP) systems, in which multipliers are usually key arithmetic units. Due to the underlying architectural differences between ASICs and FPGAs, existing ASIC-based approximate multipliers do not offer symmetrical gains when they are implemented with FPGA resources. In this paper, we propose AMG, an open-source automated approximate multiplier generator for FPGAs driven by Bayesian optimization (BO) with parallel evaluation. The proposed method simplifies the exact half adders (HAs) used for the initial partial product (PP) compression in a multiplier while preserving coarse-grained additions for the subsequent accumulation. The generated multipliers can be effectively mapped to the lookup tables (LUTs) and carry chains provided by modern FPGAs, reducing hardware costs with acceptable errors. Compared with 1167 multipliers from previous works, our generated multipliers form a Pareto front with 28.70%-38.47% average improvement in terms of the product of hardware cost and error. All source code, reproduced multipliers, and our generated multipliers are available at //github.com/phyzhenli/AMG.
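
As a rough illustration of the BO-driven search (not the actual AMG flow), the sketch below uses scikit-optimize's gp_minimize over a hypothetical per-column "simplification level" encoding, with placeholder cost and error models standing in for FPGA synthesis and simulation; the objective mirrors the paper's product-of-hardware-cost-and-error metric.

```python
# Minimal sketch of a BO-driven design-space search in the spirit of AMG
# (hypothetical encoding and placeholder models; not the actual AMG tool).
from skopt import gp_minimize
from skopt.space import Integer

def evaluate_multiplier(config):
    """Stand-in for synthesising a candidate multiplier and measuring it.
    A real flow would run FPGA synthesis for LUT/carry-chain cost and
    simulation for the error metric; both are placeholders here."""
    hw_cost = 100 - sum(config)              # placeholder: more simplification, lower cost
    error = 0.01 * (1 + sum(config))         # placeholder: more simplification, higher error
    return hw_cost * error                   # product of hardware cost and error

# One integer decision per partial-product column: how aggressively to simplify
# its half adders (a hypothetical encoding of the design space).
space = [Integer(0, 3, name=f"col_{i}") for i in range(8)]

result = gp_minimize(evaluate_multiplier, space, n_calls=30, random_state=0)
print("best configuration:", result.x, "objective:", result.fun)
```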

Generative Artificial Intelligence (GenAI) tools have become increasingly prevalent in software development, offering assistance for various managerial and technical project activities. Notable examples of these tools include OpenAI's ChatGPT, GitHub Copilot, and Amazon CodeWhisperer. Although many recent publications have explored and evaluated the application of GenAI, a comprehensive understanding of the current developments, applications, limitations, and open challenges is still lacking. In particular, we do not have an overall picture of the current state of GenAI technology in practical software engineering usage scenarios. We conducted a literature review and focus groups over a period of five months to develop a research agenda on GenAI for Software Engineering. We identified 78 open Research Questions (RQs) in 11 areas of Software Engineering. Our results show that it is possible to explore the adoption of GenAI for partial automation and to support decision-making in all software development activities. While the current literature is skewed toward software implementation, quality assurance, and software maintenance, other areas, such as requirements engineering, software design, and software engineering education, need further research attention. Common considerations when implementing GenAI include industry-level assessment, dependability and accuracy, data accessibility, transparency, and sustainability aspects associated with the technology. GenAI is bringing significant changes to the field of software engineering. Nevertheless, the state of research on the topic remains immature. We believe that this research agenda holds significance and practical value for informing both researchers and practitioners about current applications and guiding future research.

Analysing malware is important for understanding how malicious software works and for developing appropriate detection and prevention methods. Dynamic analysis can overcome evasion techniques commonly used to bypass static analysis and provide insights into malware runtime activities. Much research on dynamic analysis has focused on investigating machine-level information (e.g., CPU, memory, network usage) to identify whether a machine is running malicious activities. However, a malicious machine does not necessarily mean that all processes running on it are also malicious. If we can isolate the malicious process instead of the whole machine, we can kill that process while the machine keeps doing its job. Another challenge in dynamic malware detection research is that samples are executed on a machine without any background applications running. This is unrealistic, as a computer typically runs many benign (background) applications when a malware incident happens. Our experiment with machine-level data shows that the presence of background applications decreases the previous state-of-the-art accuracy by about 20.12% on average. We also propose a process-level Recurrent Neural Network (RNN)-based detection model. Our proposed model performs better than the machine-level detection model, with a 0.049 increase in detection rate and a false-positive rate below 0.1.
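
A minimal sketch of a process-level RNN detector is shown below, assuming hypothetical per-process feature sequences (e.g., CPU, memory, and I/O metrics sampled over time); it is not the paper's exact architecture or hyper-parameters.

```python
# Minimal sketch of a process-level RNN detector (hypothetical feature layout;
# not the paper's exact architecture).
import torch
import torch.nn as nn

class ProcessRNN(nn.Module):
    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)       # malicious vs. benign

    def forward(self, x):                      # x: (batch, time, n_features)
        _, h = self.rnn(x)                     # h: (1, batch, hidden)
        return self.head(h.squeeze(0))         # logits, shape (batch, 1)

# Toy usage: 4 processes, 20 time steps, 10 per-process metrics each
# (CPU share, memory, I/O counts, etc. -- all hypothetical here).
model = ProcessRNN(n_features=10)
probs = torch.sigmoid(model(torch.randn(4, 20, 10)))  # per-process malicious probability
```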

The use of large language models for code generation is a rapidly growing trend in software development. However, without effective methods for ensuring the correctness of generated code, this trend could lead to any number of undesirable outcomes. In this paper, we lay out a vision for addressing this challenge: the Clover paradigm, short for Closed-Loop Verifiable Code Generation, which reduces correctness checking to the more accessible problem of consistency checking. At the core of Clover lies a checker that performs consistency checks among code, docstrings, and formal annotations. The checker is implemented using a novel integration of formal verification tools and large language models. We provide a theoretical analysis to support our thesis that Clover should be effective at consistency checking. We also empirically investigate its feasibility on a hand-designed dataset (CloverBench) featuring annotated Dafny programs at a textbook level of difficulty. Experimental results show that for this dataset, (i) LLMs are reasonably successful at automatically generating formal specifications; and (ii) our consistency checker achieves a promising acceptance rate (up to 87%) for correct instances while maintaining zero tolerance for incorrect ones (no false positives).
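
The sketch below illustrates the pairwise consistency-checking idea, with hypothetical verify and llm_judge stubs standing in for the formal verifier and the LLM; it is not the actual Clover checker.

```python
# Illustrative sketch of pairwise consistency checks in the spirit of Clover
# (hypothetical stubs for the verifier and LLM calls; not the actual checker).

def clover_check(code: str, docstring: str, annotations: str,
                 verify, llm_judge) -> bool:
    """Accept an instance only if every pairwise check passes.

    verify(code, annotations) -- e.g. run a Dafny-style verifier (stub).
    llm_judge(a, b, question) -- ask an LLM whether a and b agree (stub).
    """
    checks = [
        verify(code, annotations),                                      # code vs. formal spec
        llm_judge(code, docstring, "Does the code do what the docstring says?"),
        llm_judge(docstring, annotations, "Does the docstring match the spec?"),
    ]
    return all(checks)   # zero tolerance: reject if any single check fails
```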

With the increasing complexity and scope of software systems, their dependability is crucial. The analysis of log data recorded during system execution can enable engineers to automatically predict failures at run time. Several Machine Learning (ML) techniques, including traditional ML and Deep Learning (DL), have been proposed to automate such tasks. However, current empirical studies are limited in terms of covering all main DL types -- Recurrent Neural Network (RNN), Convolutional Neural Network (CNN), and Transformer -- as well as examining them on a wide range of diverse datasets. In this paper, we aim to address these issues by systematically investigating the combination of log data embedding strategies and DL types for failure prediction. To that end, we propose a modular architecture to accommodate various configurations of embedding strategies and DL-based encoders. To further investigate how dataset characteristics such as dataset size and failure percentage affect model accuracy, we synthesised 360 datasets with varying characteristics for three distinct system behavioral models, using a systematic and automated generation approach. Using the F1 score metric, our results show that the best overall performing configuration is a CNN-based encoder with Logkey2vec. Additionally, we provide specific dataset conditions, namely a dataset size greater than 350 or a failure percentage greater than 7.5%, under which this configuration demonstrates high accuracy for failure prediction.
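
A minimal sketch of a CNN-based encoder over log-key embeddings is given below, with a plain embedding layer standing in for Logkey2vec and hypothetical sizes; it is not the paper's exact configuration.

```python
# Minimal sketch of a CNN-based failure predictor over log-key embeddings
# (a plain embedding stands in for Logkey2vec; sizes are hypothetical).
import torch
import torch.nn as nn

class CNNFailurePredictor(nn.Module):
    def __init__(self, vocab_size: int, emb_dim: int = 32, n_filters: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)       # stand-in for Logkey2vec
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1)
        self.head = nn.Linear(n_filters, 1)                  # failure vs. normal

    def forward(self, keys):                                 # keys: (batch, seq_len) int ids
        x = self.embed(keys).transpose(1, 2)                 # -> (batch, emb_dim, seq_len)
        x = torch.relu(self.conv(x)).max(dim=2).values       # global max pooling over time
        return self.head(x)                                  # logits, shape (batch, 1)

# Toy usage: 8 log sequences of 50 log keys drawn from a 200-key vocabulary.
model = CNNFailurePredictor(vocab_size=200)
logits = model(torch.randint(0, 200, (8, 50)))
```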

Providing personalized assistance at scale is a long-standing challenge for computing educators, but a new generation of tools powered by large language models (LLMs) offers immense promise. Such tools can, in theory, provide on-demand help in large class settings and be configured with appropriate guardrails to prevent misuse and mitigate common concerns around learner over-reliance. However, the deployment of LLM-powered tools in authentic classroom settings is still rare, and very little is currently known about how students will use them in practice and what type of help they will seek. To address this, we examine students' use of an innovative LLM-powered tool that provides on-demand programming assistance without revealing solutions directly. We deployed the tool for 12 weeks in an introductory computer and data science course ($n = 52$), collecting more than 2,500 queries submitted by students throughout the term. We manually categorized all student queries based on the type of assistance sought, and we automatically analyzed several additional query characteristics. We found that most queries requested immediate help with programming assignments, whereas fewer requests asked for help on related concepts or for deepening conceptual understanding. Furthermore, students often provided minimal information to the tool, suggesting this is an area in which targeted instruction would be beneficial. We also found that students who achieved more success in the course tended to have used the tool more frequently overall. Lessons from this research can be leveraged by programming educators and institutions that plan to augment their teaching with emerging LLM-powered tools.

Object detection is considered one of the most challenging problems in computer vision, since it requires correctly predicting both the classes and the locations of objects in images. In this study, we define a more difficult scenario, namely zero-shot object detection (ZSD), where no visual training data is available for some of the target object classes. We present a novel approach to tackle this ZSD problem, in which a convex combination of embeddings is used in conjunction with a detection framework. For the evaluation of ZSD methods, we propose a simple dataset constructed from Fashion-MNIST images, as well as a custom zero-shot split for the Pascal VOC detection challenge. The experimental results suggest that our method yields promising results for ZSD.
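
The sketch below illustrates the convex-combination idea with hypothetical shapes: each candidate box is mapped into the word-embedding space as a probability-weighted average of seen-class embeddings and then scored against unseen-class embeddings; it is not the paper's full detection pipeline.

```python
# Minimal sketch of scoring unseen classes via a convex combination of
# seen-class embeddings (hypothetical shapes; not the full ZSD pipeline).
import numpy as np

def zsd_scores(seen_probs, seen_emb, unseen_emb):
    """Score unseen classes for each detected box.

    seen_probs -- (n_boxes, n_seen)  softmax scores over seen classes
    seen_emb   -- (n_seen, d)        word embeddings of seen classes
    unseen_emb -- (n_unseen, d)      word embeddings of unseen classes
    """
    box_emb = seen_probs @ seen_emb    # convex combination: (n_boxes, d)
    box_emb /= np.linalg.norm(box_emb, axis=1, keepdims=True) + 1e-9
    unseen = unseen_emb / (np.linalg.norm(unseen_emb, axis=1, keepdims=True) + 1e-9)
    return box_emb @ unseen.T          # cosine similarity, (n_boxes, n_unseen)

# Toy usage with random numbers in place of detector outputs and embeddings.
rng = np.random.default_rng(0)
probs = rng.random((5, 10)); probs /= probs.sum(axis=1, keepdims=True)
scores = zsd_scores(probs, rng.standard_normal((10, 300)), rng.standard_normal((3, 300)))
print(scores.shape)  # (5, 3)
```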

We introduce a generic framework that reduces the computational cost of object detection while retaining accuracy for scenarios where objects of varied sizes appear in high-resolution images. Detection progresses in a coarse-to-fine manner, first on a down-sampled version of the image and then on a sequence of higher-resolution regions identified as likely to improve the detection accuracy. Built upon reinforcement learning, our approach consists of a model (R-net) that uses coarse detection results to predict the potential accuracy gain of analyzing a region at a higher resolution, and another model (Q-net) that sequentially selects regions to zoom in on. Experiments on the Caltech Pedestrians dataset show that our approach reduces the number of processed pixels by over 50% without a drop in detection accuracy. The merits of our approach become more significant on a high-resolution test set collected from the YFCC100M dataset, where our approach maintains high detection performance while reducing the number of processed pixels by about 70% and the detection time by over 50%.
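
The zoom-in loop can be sketched as below, with stubbed detector, r_net, and q_net models and a PIL-style image assumed; it illustrates the coarse-to-fine control flow rather than the paper's training procedure or architectures.

```python
# Illustrative sketch of the coarse-to-fine zoom-in loop (stubbed models and a
# PIL-style image assumed; not the paper's R-net/Q-net training or architecture).

def coarse_to_fine_detect(image, detector, r_net, q_net, pixel_budget):
    """Coarse pass on a down-sampled image, then zoom into selected regions.

    detector(img)         -- detections for an image or crop (stub).
    r_net(coarse_dets)    -- per-region predicted accuracy gain (stub).
    q_net(gains, history) -- next region (left, top, right, bottom) or None (stub).
    """
    coarse_img = image.resize((image.width // 4, image.height // 4))
    coarse = detector(coarse_img)             # cheap full-image pass
    gains = r_net(coarse)                     # where is high resolution worth it?
    detections, history, used = list(coarse), [], 0
    while used < pixel_budget:
        region = q_net(gains, history)        # sequential region selection
        if region is None:
            break
        crop = image.crop(region)             # analyze this region at full resolution
        detections.extend(detector(crop))
        history.append(region)
        used += crop.width * crop.height
    return detections
```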
