
Real-world software applications must constantly evolve to remain relevant. This evolution occurs when developing new applications or adapting existing ones to meet new requirements, make corrections, or incorporate future functionality. Traditional methods of software quality control involve software quality models and continuous code inspection tools, which focus on directly assessing the quality of the software. However, there is a strong correlation, and a causal link, between the quality of the development process and the resulting software product. Therefore, improving the development process indirectly improves the software product, too. Achieving this requires effective learning from past processes, often pursued through post-mortem organizational learning. While qualitative evaluation of large artifacts is common, the smaller quantitative changes captured by application lifecycle management are often overlooked. In addition to software metrics, these smaller changes can reveal complex phenomena related to project culture and management, and leveraging them can help detect and address such issues. Software evolution was previously measured by the size of changes, but the lack of consensus on a reliable and versatile quantification method prevents its use as a dependable metric; existing size classifications fail to reliably describe the nature of evolution. While application lifecycle management data is rich, it remains uncertain which artifacts can model detrimental managerial practices. Approaches such as simulation modeling, discrete-event simulation, or Bayesian networks have only a limited ability to exploit continuous-time process models of such phenomena. Even worse, the accessibility of, and mechanistic insight into, such gray- or black-box models are typically very low. To address these challenges, we suggest leveraging objectively [...]
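
As a concrete, hypothetical illustration of why change size alone is an unreliable metric, the following Python sketch computes a churn-style size for commit records and buckets it with arbitrary thresholds. The `Commit` fields and the bucket boundaries are assumptions made for illustration, not a proposed classification; the abstract's point is precisely that no consensus classification exists.

```python
from dataclasses import dataclass

@dataclass
class Commit:
    added: int    # lines added
    deleted: int  # lines deleted
    files: int    # files touched

def change_size(c: Commit) -> int:
    """A simple churn-style size measure: total lines touched."""
    return c.added + c.deleted

def classify(size: int) -> str:
    """Illustrative size buckets; the thresholds are arbitrary assumptions."""
    if size <= 10:
        return "tiny"
    if size <= 100:
        return "small"
    if size <= 1000:
        return "medium"
    return "large"

history = [Commit(5, 2, 1), Commit(250, 40, 7), Commit(1200, 300, 25)]
for c in history:
    s = change_size(c)
    print(s, classify(s))
```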

Related content

Processing is the name of an open-source programming language and its accompanying integrated development environment (IDE). Processing is used in the electronic arts and visual design communities to teach programming fundamentals, and it is employed in a large number of new-media and interactive-art works.

While software engineers are optimistically adopting crypto-API misuse detectors (or crypto-detectors) in their software development cycles, this momentum must be accompanied by a rigorous understanding of crypto-detectors' effectiveness at finding crypto-API misuses in practice. This demo paper presents the technical details and usage scenarios of our tool, Mutation Analysis for evaluating Static Crypto-API misuse detectors (MASC). We developed 12 generalizable, usage-based mutation operators and three mutation scopes, namely Main Scope, Similarity Scope, and Exhaustive Scope, which can be used to expressively instantiate compilable variants of crypto-API misuse cases. Using MASC, we evaluated nine major crypto-detectors and discovered 19 unique, undocumented flaws. We designed MASC to be configurable and user-friendly; a user can configure its parameters to change the nature of the generated mutations. Furthermore, MASC comes with both a command-line interface and a web-based front end, making it practical for users with different levels of expertise.
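
To make the idea of usage-based mutation operators concrete, here is a small, hypothetical Python sketch in the spirit of MASC (not its actual implementation, which targets Java): each operator rewrites the algorithm string of an insecure `Cipher.getInstance` call so that the statement still compiles to the same misuse but may evade naive string matching. The operator names and the generation strategy are illustrative assumptions.

```python
# Hypothetical "usage-based" mutation operators: each one rewrites the
# insecure algorithm parameter while preserving the misuse itself.

def identity(alg: str) -> str:
    return f'"{alg}"'

def lowercase(alg: str) -> str:
    # Java's Cipher.getInstance treats algorithm names case-insensitively.
    return f'"{alg.lower()}"'

def string_concat(alg: str) -> str:
    # Split the literal so naive string matching no longer sees "DES".
    mid = len(alg) // 2
    return f'"{alg[:mid]}" + "{alg[mid:]}"'

OPERATORS = [identity, lowercase, string_concat]

def mutants(alg: str = "DES"):
    """Yield compilable Java statements, each still an API misuse."""
    for op in OPERATORS:
        yield f"Cipher.getInstance({op(alg)});"

for m in mutants():
    print(m)
```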

Implicit regularization is an important way to interpret neural networks. Recent theory has begun to explain implicit regularization using the model of deep matrix factorization (DMF) and to analyze the trajectory of discrete gradient dynamics during optimization. The step sizes of these discrete dynamics are relatively small but not infinitesimal, which fits the practical implementation of neural networks well. So far, discrete gradient dynamics analysis has been successfully applied to shallow networks but runs into prohibitively complex computations for deep networks. In this work, we introduce another discrete gradient dynamics approach to explain implicit regularization, namely landscape analysis, which focuses on regions of the loss landscape such as saddle points and local minima. We theoretically establish the connection between saddle point escaping (SPE) stages and the matrix rank in DMF. We prove that, for the reconstruction of a rank-R matrix, DMF converges to a second-order critical point after R stages of SPE. This conclusion is further verified experimentally on a low-rank matrix reconstruction problem. This work provides a new theory for analyzing implicit regularization in deep learning.
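
The following NumPy sketch, with illustrative hyperparameters, runs discrete gradient descent (small but finite step size) on a depth-3 DMF for low-rank matrix reconstruction and prints the effective rank along the way; in such runs the rank tends to grow in stages, which is the phenomenon the SPE analysis formalizes. This is a toy setup, not the paper's exact experiment.

```python
import numpy as np

rng = np.random.default_rng(0)
n, R, depth = 20, 3, 3          # matrix size, target rank, factorization depth

# Ground-truth rank-R matrix to reconstruct.
M = rng.standard_normal((n, R)) @ rng.standard_normal((R, n))

# Deep matrix factorization W = W_depth @ ... @ W_1, with small init.
Ws = [0.01 * rng.standard_normal((n, n)) for _ in range(depth)]

def product(factors):
    P = factors[0]
    for W in factors[1:]:
        P = W @ P
    return P

lr = 0.01  # small but finite step: discrete, not infinitesimal, dynamics
for step in range(20001):
    W = product(Ws)
    G = W - M                   # gradient of 0.5 * ||W - M||_F^2 w.r.t. W
    grads = []
    for i in range(depth):
        left = product(Ws[i + 1:]) if i + 1 < depth else np.eye(n)
        right = product(Ws[:i]) if i > 0 else np.eye(n)
        grads.append(left.T @ G @ right.T)   # chain rule through the product
    for Wi, g in zip(Ws, grads):
        Wi -= lr * g
    if step % 4000 == 0:
        s = np.linalg.svd(product(Ws), compute_uv=False)
        eff_rank = int((s > 1e-2 * s[0]).sum())
        print(f"step {step:5d}  loss {0.5 * np.square(G).sum():9.4f}  "
              f"effective rank {eff_rank}")
```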

Dimensionality reduction (DR) techniques inherently distort the original structure of input high-dimensional data, producing imperfect low-dimensional embeddings. Diverse distortion measures have thus been proposed to evaluate the reliability of DR embeddings. However, implementing and executing distortion measures in practice has so far been time-consuming and tedious. To address this issue, we present ZADU, a Python library that provides distortion measures. ZADU is not only easy to install and execute but also enables comprehensive evaluation of DR embeddings through three key features. First, the library covers a wide range of distortion measures. Second, it automatically optimizes the execution of distortion measures, substantially reducing the running time required to execute multiple measures. Last, the library reports how individual points contribute to the overall distortion, facilitating a detailed analysis of DR embeddings. By simulating a real-world scenario of optimizing DR embeddings, we verify that our optimization scheme substantially reduces the time required to execute distortion measures. Finally, as an application of ZADU, we present another library called ZADUVis that allows users to easily create distortion visualizations that depict the extent to which each region of an embedding suffers from distortions.
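
As a flavor of what such distortion measures evaluate (without relying on ZADU's own API, for which its documentation should be consulted), the sketch below computes one representative measure, trustworthiness, for a PCA embedding using scikit-learn. The dataset and embedding method are arbitrary choices for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import trustworthiness

# Trustworthiness checks whether neighbors in the low-dimensional
# embedding were also neighbors in the original high-dimensional space;
# it is one of the distortion measures a library like ZADU bundles.
X = load_digits().data
Z = PCA(n_components=2).fit_transform(X)

score = trustworthiness(X, Z, n_neighbors=10)
print(f"trustworthiness of the PCA embedding: {score:.3f}")  # 1.0 = undistorted
```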

Machine learning (ML) components are being added to more and more critical and impactful software systems, but turning prototyped ML models into real-world production systems remains challenging, introducing additional complexity and interdisciplinary collaboration difficulties. This makes it hard to apply traditional software lifecycle models such as the waterfall, spiral, or agile models when building ML-enabled systems. By interviewing practitioners from multiple companies, we investigated the application of systems engineering processes to ML-enabled systems. We developed a set of propositions and proposed the V4ML process model for building products with ML components. We found that the V4ML process model requires more effort on documentation, system decomposition, and V&V, but that it addresses the interdisciplinary collaboration challenges and the additional complexity introduced by ML components.

Neural Machine Translation (NMT) is widely applied in software engineering tasks. The effectiveness of NMT for code retrieval relies on its ability to learn a mapping from the sequence of tokens in the source language to the sequence of tokens in the target language. While NMT performs well in pseudocode-to-code translation, it can struggle to translate from natural language queries to source code in newly curated, real-world code documentation and implementation datasets. In this work, we analyze the performance of NMT on natural language-to-code translation in the newly curated CAT benchmark, which includes optimized versions of three Java datasets (TLCodeSum, CodeSearchNet, and Funcom) and a Python dataset (PCSD). Our evaluation shows that NMT has low accuracy on this task, as measured by the CrystalBLEU and Meteor metrics. To ease the burden on NMT of learning complex representations of source code, we propose ASTTrans Representation, a tailored representation of an Abstract Syntax Tree (AST) that uses only a subset of its non-terminal nodes. We show that classical NMT learns ASTTrans Representation significantly better than raw code tokens, with up to a 36% improvement in Meteor score. Moreover, we leverage ASTTrans Representation to augment state-of-the-art code search pipelines based on GraphCodeBERT and UniXcoder. Our NMT models trained on ASTTrans Representation boost the Mean Reciprocal Rank of these state-of-the-art code search approaches by up to 3.08% and improve the results of 23.08% of queries on the CAT benchmark.
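
To illustrate the general idea of linearizing a subset of non-terminal AST nodes (not the paper's exact construction of ASTTrans Representation), here is a Python sketch using the standard `ast` module; the `KEEP` set is an arbitrary assumption.

```python
import ast

def walk_preorder(node):
    """Yield AST nodes in pre-order (parent before children)."""
    yield node
    for child in ast.iter_child_nodes(node):
        yield from walk_preorder(child)

def nonterminal_sequence(source: str, keep: set) -> list:
    """Linearize the AST, keeping only a chosen subset of node types."""
    tree = ast.parse(source)
    return [type(n).__name__ for n in walk_preorder(tree)
            if type(n).__name__ in keep]

code = """
def mean(xs):
    total = 0
    for x in xs:
        total += x
    return total / len(xs)
"""

KEEP = {"FunctionDef", "For", "Assign", "AugAssign", "Return", "BinOp", "Call"}
print(nonterminal_sequence(code, KEEP))
```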

Algorithms to solve fault-tolerant consensus in asynchronous systems often rely on primitives such as crusader agreement, adopt-commit, and graded broadcast, which provide weaker agreement properties than consensus. Although these primitives have a similar flavor, they have been defined and implemented separately in ad hoc ways. We propose a new problem called connected consensus that has as special cases crusader agreement, adopt-commit, and graded broadcast, and generalizes them to handle multi-valued inputs. The generalization is accomplished by relating the problem to approximate agreement on graphs. We present three algorithms for multi-valued connected consensus in asynchronous message-passing systems, one tolerating crash failures and two tolerating malicious (unauthenticated Byzantine) failures. We extend the definition of binding, a desirable property recently identified as supporting binary consensus algorithms that are correct against adaptive adversaries, to the multi-valued input case and show that all our algorithms satisfy the property. Our crash-resilient algorithm has failure resilience and time complexity that we show are optimal. When restricted to the case of binary inputs, the algorithm has improved time complexity over prior algorithms. Our two algorithms for malicious failures trade off failure resilience and time complexity. The first algorithm has time complexity that we prove is optimal but worse failure resilience, while the second has failure resilience that we prove is optimal but worse time complexity. When restricted to the case of binary inputs, the time complexity (as well as the resilience) of the second algorithm matches that of prior algorithms.
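
For intuition about one of these primitives, the sketch below shows only the local decision rule of adopt-commit in Python. The inter-process guarantee (if any process commits v, every other process at least adopts v) relies on quorum intersection in a message-gathering phase, which is elided here, so this is a simplified illustration rather than a full protocol.

```python
# Local decision rule of adopt-commit, simulated on a list of proposals
# a single process has gathered; quorums and failure handling are elided.
from dataclasses import dataclass

@dataclass
class Outcome:
    decision: str   # "commit" or "adopt"
    value: object

def adopt_commit(proposals_seen: list) -> Outcome:
    """If every gathered proposal agrees on v, commit v; otherwise
    adopt a deterministic representative and retry in a later round."""
    distinct = set(proposals_seen)
    if len(distinct) == 1:
        return Outcome("commit", distinct.pop())
    return Outcome("adopt", min(distinct))

print(adopt_commit([1, 1, 1]))   # Outcome(decision='commit', value=1)
print(adopt_commit([1, 2, 1]))   # Outcome(decision='adopt', value=1)
```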

Deep learning (DL)-based RF fingerprinting (RFFP) technology has emerged as a powerful physical-layer security mechanism, enabling device identification and authentication based on unique device-specific signatures that can be extracted from received RF signals. However, DL-based RFFP methods face major challenges in adapting to domain changes and variability (e.g., day/time, location, channel). This work proposes a novel IQ data representation and feature design, termed the Double-Sided Envelope Power Spectrum (EPS), which is shown to significantly mitigate these domain adaptation problems. By accurately capturing device hardware impairments while suppressing irrelevant domain information, EPS offers improved feature selection for DL models in RFFP. Experimental evaluations demonstrate its effectiveness, achieving over 99% testing accuracy in same-day/channel/location evaluations and 93% accuracy in cross-day evaluations, outperforming the traditional IQ representation. Additionally, EPS excels in cross-location evaluations, achieving 95% accuracy. The proposed representation significantly enhances the robustness and generalizability of DL-based RFFP methods, presenting a transformative solution to IQ data-based device fingerprinting.
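
The sketch below shows one plausible reading of a double-sided envelope power spectrum (the paper's exact definition may differ): take the instantaneous envelope of the IQ samples, remove its DC component, and keep the full, fftshifted power spectrum. The synthetic burst and its amplitude ripple are assumptions standing in for hardware-induced envelope distortion.

```python
import numpy as np

def envelope_power_spectrum(iq: np.ndarray, nfft: int = 1024) -> np.ndarray:
    """One interpretation of a double-sided envelope power spectrum:
    the power spectrum of the instantaneous envelope, kept double-sided
    (negative and positive frequencies) via fftshift."""
    envelope = np.abs(iq)                    # instantaneous amplitude
    envelope = envelope - envelope.mean()    # drop the DC component
    spectrum = np.fft.fftshift(np.fft.fft(envelope, n=nfft))
    return (np.abs(spectrum) ** 2) / nfft    # power, double-sided

# Synthetic IQ burst: a tone with a mild amplitude ripple plus noise.
rng = np.random.default_rng(0)
n = 4096
t = np.arange(n)
iq = (1 + 0.05 * np.sin(2 * np.pi * 0.01 * t)) * np.exp(2j * np.pi * 0.1 * t)
iq += 0.01 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

eps = envelope_power_spectrum(iq)
print(eps.shape, eps.argmax())  # feature vector that would feed the DL model
```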

As an effective strategy, data augmentation (DA) alleviates data scarcity in scenarios where deep learning techniques may otherwise fail. It has been widely applied in computer vision, was later introduced to natural language processing (NLP), and achieves improvements in many tasks. One of the main focuses of DA methods is to improve the diversity of training data, thereby helping the model generalize better to unseen test data. In this survey, we frame DA methods into three categories based on the diversity of the augmented data: paraphrasing, noising, and sampling. Our paper analyzes DA methods in detail according to these categories. Furthermore, we introduce their applications in NLP tasks as well as the remaining challenges.
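
As a minimal illustration of the noising category (the operators here are generic ones, not methods proposed by this survey), the following Python sketch applies random token deletion and random swap to a sentence.

```python
import random

def random_deletion(tokens: list, p: float = 0.1) -> list:
    """Noising-style augmentation: drop each token with probability p."""
    kept = [t for t in tokens if random.random() > p]
    return kept if kept else [random.choice(tokens)]  # never return empty

def random_swap(tokens: list, n_swaps: int = 1) -> list:
    """Noising-style augmentation: swap n random pairs of tokens."""
    out = tokens[:]
    for _ in range(n_swaps):
        i, j = random.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

random.seed(0)
sentence = "data augmentation improves generalization on small datasets".split()
print(random_deletion(sentence))
print(random_swap(sentence))
```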

User engagement is a critical metric for evaluating the quality of open-domain dialogue systems. Prior work has focused on conversation-level engagement, using heuristically constructed features such as the number of turns and the total duration of the conversation. In this paper, we investigate the possibility and efficacy of estimating utterance-level engagement and define a novel metric, {\em predictive engagement}, for the automatic evaluation of open-domain dialogue systems. Our experiments demonstrate that (1) human annotators have high agreement when assessing utterance-level engagement scores, and (2) conversation-level engagement scores can be predicted from properly aggregated utterance-level engagement scores. Furthermore, we show that utterance-level engagement scores can be learned from data and can improve automatic evaluation metrics for open-domain dialogue systems, as shown by their correlation with human judgments. This suggests that predictive engagement can be used as real-time feedback for training better dialogue models.
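
A minimal sketch of the aggregation idea, on fully synthetic (hypothetical) data: average the utterance-level scores of each conversation and correlate the result with conversation-level ratings. The data-generation process and the mean aggregator are assumptions for illustration, not the paper's method.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical data: per-utterance engagement scores for 50 conversations,
# plus noisy conversation-level ratings standing in for human judgments.
rng = np.random.default_rng(0)
conversations = [rng.uniform(0, 1, size=rng.integers(4, 10)) for _ in range(50)]
human_ratings = np.array([u.mean() + rng.normal(0, 0.05) for u in conversations])

# One simple aggregation: the mean of the utterance-level scores serves
# as the conversation-level engagement estimate.
predicted = np.array([u.mean() for u in conversations])

r, p = pearsonr(predicted, human_ratings)
print(f"Pearson r = {r:.3f} (p = {p:.1e})")
```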

Small data challenges have emerged in many learning problems, since the success of deep neural networks often relies on the availability of huge amounts of labeled data that are expensive to collect. To address this, many efforts have been made to train complex models with small data in unsupervised and semi-supervised fashions. In this paper, we review recent progress on these two major categories of methods. A wide spectrum of small data models is organized into a big picture, where we show how they interplay with each other to motivate the exploration of new ideas. We review the criteria for learning transformation-equivariant, disentangled, self-supervised, and semi-supervised representations, which underpin the foundations of recent developments. Many instantiations of unsupervised and semi-supervised generative models have been developed on the basis of these criteria, greatly expanding the territory of existing autoencoders, generative adversarial nets (GANs), and other deep networks by exploring the distribution of unlabeled data for more powerful representations. While we focus on unsupervised and semi-supervised methods, we also provide a broader review of other emerging topics, from unsupervised and semi-supervised domain adaptation to the fundamental roles of transformation equivariance and invariance in training a wide spectrum of deep networks. It is impossible for us to write an exhaustive encyclopedia covering all related works; instead, we aim to explore the main ideas, principles, and methods in this area to reveal where we are heading on the journey towards addressing the small data challenges in this big data era.
