
Language-based ecosystems (LBEs), i.e., software ecosystems built around a single programming language, are very common. Examples include the npm ecosystem for JavaScript and PyPI for Python. These environments encourage code reuse between packages and incorporate utilities, namely package managers, for automatically resolving dependencies. However, the same aspects that make these systems popular, the ease of publishing code and of importing external code, also create novel security issues, which have so far seen little study. We present a systematic study of security issues that plague LBEs. These issues are inherent to the way these ecosystems work and cannot be resolved by fixing software vulnerabilities in either the packages or the utilities, e.g., the package-manager tools, that build these ecosystems. We systematically characterize recent security attacks along various dimensions, including attack strategies, vectors, and goals. Our characterization and in-depth analysis of the npm and PyPI ecosystems, the two largest LBEs, covering nearly one million packages, indicate that these ecosystems provide an opportune environment for attackers to mount stealthy attacks. Overall, we argue that (i) fully automated detection of malicious packages is likely infeasible; however, (ii) tools and metrics that help developers assess the risk of including external dependencies would go a long way toward preventing attacks.
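
To make the second point concrete, here is a minimal sketch of the kind of dependency-risk metric the abstract argues for. It queries PyPI's public JSON metadata endpoint; the specific signals and their interpretation are illustrative assumptions, not the paper's method.

```python
# Illustrative sketch (not the paper's tool): derive coarse risk signals for a
# PyPI package from its public JSON metadata. Signal choices are hypothetical.
import json
import urllib.request

def risk_signals(package: str) -> dict:
    """Fetch PyPI metadata and derive a few coarse risk signals."""
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url) as resp:
        meta = json.load(resp)
    info = meta.get("info", {})
    return {
        "release_count": len(meta.get("releases", {})),  # very young packages are riskier
        "has_homepage": bool(info.get("home_page")),
        "dependency_count": len(info.get("requires_dist") or []),  # transitive exposure
    }

if __name__ == "__main__":
    print(risk_signals("requests"))
```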

Related content

This new edition of the TOOLS conference series revives a tradition of 50 conferences held from 1989 to 2012. TOOLS was originally devoted to the "Technology of Object-Oriented Languages and Systems" and later grew to cover all innovative aspects of software technology. Many of today's most important software concepts were first introduced there. TOOLS 50+1, held in 2019 near Kazan, Russia, continues the series in the same spirit of innovation, with enthusiasm for all things software-related, a combination of scientific rigor and industry applicability, and an openness to all trends and communities in the field.
February 3, 2022

Well-trained machine-learning models, which leverage large amounts of open-source software data, have become an attractive approach to automating many software engineering tasks. Several SE tasks have been addressed with this approach, with performance gradually improving over the past several years thanks to better models and training methods. More, and more diverse, clean, labeled data is better for training; but constructing good-quality datasets is time-consuming and challenging. Ways of augmenting the volume and diversity of clean, labeled data therefore have wide applicability. For some languages (e.g., Ruby) labeled data is less abundant; for others (e.g., JavaScript) the available data may be more focused on a few application domains, and thus less diverse. As a way around such data bottlenecks, we present evidence suggesting that human-written code in different languages (which performs the same function) is rather similar, and in particular preserves identifier naming patterns; we further present evidence suggesting that identifiers are a very important element of training data for software engineering tasks. We leverage this rather fortuitous phenomenon to show that available multilingual training data (across different languages) can be used to amplify performance. We study this for three different tasks: code summarization, code retrieval, and function naming. We note that this data-augmenting approach is broadly compatible with different tasks, languages, and machine-learning models.
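
The data-side idea can be sketched in a few lines: pool labeled examples from several languages so a low-resource language trains alongside high-resource ones. The file paths, JSONL layout, and field names below are assumptions for illustration, not the paper's datasets.

```python
# Hypothetical sketch of multilingual data pooling for, e.g., function naming:
# merge (code, name) pairs across languages into one shuffled training set.
import json
import random

def load_pairs(path: str, language: str) -> list:
    """Load (function body, function name) records for one language."""
    with open(path) as f:
        return [{"lang": language, **json.loads(line)} for line in f]

# Monolingual baseline: Ruby only (assumed low-resource).
ruby_only = load_pairs("ruby.jsonl", "ruby")

# Multilingual augmentation: shared identifier-naming patterns across
# languages are what carries the extra training signal.
multilingual = (ruby_only
                + load_pairs("javascript.jsonl", "javascript")
                + load_pairs("python.jsonl", "python"))
random.shuffle(multilingual)
```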

Clustering points in a vector space or nodes in a graph is a ubiquitous primitive in statistical data analysis, and it is commonly used for exploratory data analysis. In practice, it is often of interest to "refine" or "improve" a given cluster that has been obtained by some other method. In this survey, we focus on principled algorithms for this cluster improvement problem. Many such cluster improvement algorithms are flow-based methods, by which we mean that operationally they require the solution of a sequence of maximum-flow problems on a (typically implicitly) modified data graph. These cluster improvement algorithms are powerful, both in theory and in practice, but they have not been widely adopted for problems such as community detection, local graph clustering, semi-supervised learning, etc. Possible reasons for this are: the steep learning curve for these algorithms; the lack of efficient and easy-to-use software; and the lack of detailed numerical experiments on real-world data that demonstrate their usefulness. Our objective here is to address these issues. To do so, we guide the reader through the whole process of understanding how to implement and apply these powerful algorithms. We present a unifying fractional programming optimization framework that permits us to distill, in a simple way, the crucial components of all these algorithms. It also makes apparent similarities and differences between related methods. Viewing these cluster improvement algorithms via a fractional programming framework suggests directions for future algorithm development. Finally, we develop efficient implementations of these algorithms in our LocalGraphClustering Python package, and we perform extensive numerical experiments to demonstrate the performance of these methods on social networks and image-based data graphs.
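
The operational core, one s-t min-cut on an augmented graph, fits in a short sketch. The construction below is deliberately simplified and loosely in the spirit of algorithms like MQI or FlowImprove; the exact capacity constructions differ per algorithm (see the survey), and the alpha/beta weights here are illustrative assumptions. The source side of the min cut, minus the artificial source, is the refined cluster.

```python
# Simplified flow-based refinement step on an augmented graph (networkx).
# Capacities are NOT the exact MQI/FlowImprove constructions; they only
# illustrate the source/sink augmentation pattern those algorithms share.
import networkx as nx

def improve_cluster(G: nx.Graph, seed: set, alpha: float = 1.0, beta: float = 0.5) -> set:
    """Refine `seed` by solving one s-t min-cut on an augmented graph."""
    H = nx.DiGraph()
    s, t = "_source", "_sink"
    for u, v in G.edges():
        H.add_edge(u, v, capacity=1.0)   # internal edges, unit capacity both ways
        H.add_edge(v, u, capacity=1.0)
    for u in G.nodes():
        if u in seed:
            H.add_edge(s, u, capacity=alpha * G.degree(u))  # cost to drop a seed node
        else:
            H.add_edge(u, t, capacity=beta * G.degree(u))   # cost to absorb an outsider
    _, (source_side, _) = nx.minimum_cut(H, s, t)
    return source_side - {s}
```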

Ocean science is a discipline that employs ocean models as an essential research asset. Such scientific modeling provides mathematical abstractions of real-world systems, e.g., the oceans. These abstractions are then coded as implementations; the resulting software systems are called models of the real-world system. To advance the state of the art in engineering such ocean models, we aim to better understand how ocean models are developed and maintained in ocean science. In this paper, we present the results of semi-structured interviews and a Thematic Analysis (TA) of the interview results to analyze the domain of ocean modeling. In doing so, we identified developer requirements, impediments to model development and evolution, and related themes. This analysis can help clarify where methods from software engineering should be introduced and which challenges need to be addressed. We suggest that other researchers extend and repeat our TA with model developers and research software engineers working in related domains to further advance our knowledge and skills in scientific modeling.

Most online communications rely on DNS to map domain names to their hosting IP address(es). Previous work has shown that DNS-based network interference is widespread due to the unencrypted and unauthenticated nature of the original DNS protocol. In addition to DNS, accessed domain names can also be monitored by on-path observers during the TLS handshake when the SNI extension is used. These lingering issues with exposed plaintext domain names have led to the development of a new generation of protocols that keep accessed domain names hidden. DNS-over-TLS (DoT) and DNS-over-HTTPS (DoH) hide the domain names of DNS queries, while Encrypted Server Name Indication (ESNI) encrypts the domain name in the SNI extension. We present DNEye, a measurement system built on top of a network of distributed vantage points, which we used to study the accessibility of DoT/DoH and ESNI, and to investigate whether these protocols are tampered with by network providers (e.g., for censorship). Moreover, we evaluate the efficacy of these protocols in circumventing network interference when accessing content blocked by traditional DNS manipulation. We find evidence of blocking efforts against domain name encryption technologies in several countries, including China, Russia, and Saudi Arabia. At the same time, we discover that domain name encryption can help unblock more than 55% and 95% of censored domains in China and in other countries where DNS-based filtering is heavily employed, respectively.
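
To illustrate the mechanism DNEye measures, here is a minimal DoH lookup using Cloudflare's public JSON endpoint: an on-path observer sees only a TLS connection to the resolver, not the queried name. This is a generic example of the protocol, not DNEye's own vantage-point machinery.

```python
# Minimal DNS-over-HTTPS lookup via Cloudflare's public JSON API.
import json
import urllib.request

def doh_resolve(domain: str, rrtype: str = "A") -> list:
    """Resolve `domain` over HTTPS; the query travels inside TLS."""
    url = f"https://cloudflare-dns.com/dns-query?name={domain}&type={rrtype}"
    req = urllib.request.Request(url, headers={"accept": "application/dns-json"})
    with urllib.request.urlopen(req) as resp:
        answer = json.load(resp)
    return [a["data"] for a in answer.get("Answer", [])]

print(doh_resolve("example.com"))
```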

Command, Control, Communication, and Intelligence (C3I) systems are increasingly used in critical civil and military domains for achieving information superiority, operational efficacy, and greater situational awareness. While traditional systems also face widespread cyber-attacks, the sensitive nature of C3I tactical operations makes their cybersecurity a particularly critical concern. For instance, tampering with or intercepting confidential information on military battlefields not only damages C3I operations, but also causes irreversible consequences such as loss of human lives and mission failures. Therefore, C3I systems have become a focal point for cyber adversaries. Moreover, technological advancements and the modernization of C3I systems have significantly increased the potential risk of cyber-attacks on them. Consequently, cyber adversaries leverage highly sophisticated attack vectors to exploit security vulnerabilities in C3I systems. Despite the burgeoning significance of cybersecurity for C3I systems, the existing literature lacks a comprehensive review to systematize the body of knowledge on C3I systems' security. Therefore, in this paper, we have gathered, analyzed, and synthesized the state of the art on the cybersecurity of C3I systems. In particular, this paper identifies security vulnerabilities, attack vectors, and countermeasures/defenses for C3I systems. Furthermore, our survey has enabled us to: (i) propose a taxonomy for security vulnerabilities, attack vectors, and countermeasures; (ii) interrelate attack vectors with security vulnerabilities and countermeasures; and (iii) propose future research directions for advancing the state of the art on the cybersecurity of C3I systems.

Companies are misled into thinking they solve their security issues by using a DevSecOps system. This paper aims to answer the question: could a DevOps pipeline be misused to transform a securely developed application into an insecure one? To answer the question, we designed a typical DevOps pipeline utilizing Kubernetes (K8s) as a case-study environment and analyzed the applicable threats. Then, we developed four attack scenarios against the case-study environment: maliciously abusing the user's privilege of deploying containers within the K8s cluster, abusing the Jenkins instance to modify files during the continuous integration/continuous delivery (CI/CD) build phase, modifying the K8s DNS layer to expose an internal IP to external traffic, and elevating privileges from an account with create, read, update, and delete (CRUD) privileges to root privileges. The attacks answer the research question positively: companies should design and use a secure DevOps pipeline and not expect that using a DevSecOps environment alone is sufficient to deliver secure software.
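
A small defensive check related to the fourth scenario: audit which CRUD verbs a CI service account actually holds before an attacker does. The sketch shells out to the standard `kubectl auth can-i` command; the `system:serviceaccount:ci:jenkins` account name is a hypothetical example, not from the paper.

```python
# Audit a (hypothetical) CI service account's CRUD verbs on pods using the
# standard `kubectl auth can-i` command, which prints "yes" or "no".
import subprocess

VERBS = ["create", "get", "update", "delete"]

def can(verb: str, resource: str, as_user: str) -> bool:
    result = subprocess.run(
        ["kubectl", "auth", "can-i", verb, resource, "--as", as_user],
        capture_output=True, text=True,
    )
    return result.stdout.strip() == "yes"

for verb in VERBS:
    allowed = can(verb, "pods", "system:serviceaccount:ci:jenkins")
    print(f"{verb:>6} pods: {'ALLOWED' if allowed else 'denied'}")
```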

The concept of traditional farming is changing rapidly with the introduction of smart technologies such as the Internet of Things (IoT). Under the umbrella of smart agriculture, precision agriculture is gaining popularity because it enables Decision Support System (DSS)-based farm management, using widespread IoT sensors and wireless connectivity for automated detection and optimization of resources. The success of such a system directly affects crop productivity, and its failure would have severe consequences. As with many other cyber-physical systems, one of the growing challenges in avoiding system adversity is ensuring the system's security, privacy, and trust. But what are the vulnerabilities, threats, and security issues we should consider while deploying precision agriculture? This paper conducts a holistic, component-level threat modeling of precision agriculture's standard infrastructure using the popular threat-modeling framework STRIDE to identify common security issues. Our modeling identifies fifty-eight potential security threats to consider. We present them systematically and advise general mitigations to support cyber security in precision agriculture.
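
Component-level STRIDE enumeration is mechanical enough to sketch: pair each infrastructure component with each STRIDE category and triage the pairs. The component names and the resulting matrix below are illustrative assumptions, not the paper's fifty-eight threats.

```python
# Hypothetical component-level STRIDE enumeration for a precision-agriculture
# stack; components are illustrative, not the paper's inventory.
STRIDE = ["Spoofing", "Tampering", "Repudiation",
          "Information disclosure", "Denial of service",
          "Elevation of privilege"]

COMPONENTS = ["soil-moisture sensor", "wireless gateway",
              "DSS backend", "farmer dashboard"]

def threat_matrix(components, categories):
    """Pair every component with every STRIDE category for triage."""
    return [(c, t) for c in components for t in categories]

for component, threat in threat_matrix(COMPONENTS, STRIDE):
    print(f"[{threat}] against {component}")
```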

Transformer-based pretrained language models (T-PTLMs) have achieved great success in almost every NLP task. The evolution of these models started with GPT and BERT. These models are built on top of transformers, self-supervised learning, and transfer learning. T-PTLMs learn universal language representations from large volumes of text data using self-supervised learning and transfer this knowledge to downstream tasks. These models provide good background knowledge for downstream tasks, which avoids training downstream models from scratch. In this comprehensive survey paper, we first give a brief overview of self-supervised learning. Next, we explain various core concepts such as pretraining, pretraining methods, pretraining tasks, embeddings, and downstream adaptation methods. We then present a new taxonomy of T-PTLMs and give a brief overview of various benchmarks, both intrinsic and extrinsic. We also summarize various useful libraries for working with T-PTLMs. Finally, we highlight some future research directions that will further improve these models. We strongly believe that this comprehensive survey will serve as a good reference to learn the core concepts and stay up to date with recent developments in T-PTLMs.
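
The pretrain-then-adapt workflow the survey covers looks like this in practice with the Hugging Face `transformers` library: load a pretrained body and attach a fresh task head, so only the downstream fine-tuning remains. This is a generic usage example, not code from the survey.

```python
# Minimal pretrain-then-adapt sketch with Hugging Face transformers:
# pretrained BERT body + a new 2-class classification head.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # head is new; body keeps pretrained weights
)

inputs = tokenizer("Pretrained representations transfer well.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # torch.Size([1, 2]); fine-tune from here
```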

In this study, we investigate the limits of the current state-of-the-art AI system for detecting buffer overflows and compare it with current static analysis tools. To do so, we developed a code generator, s-bAbI, capable of producing an arbitrarily large number of code samples of controlled complexity. We found that the static analysis engines we examined have good precision but poor recall on this dataset, except for a sound static analyzer that has good precision and recall. We found that the state-of-the-art AI system, a memory network modeled after Choi et al. [1], can achieve similar performance to the static analysis engines, but requires an exhaustive amount of training data in order to do so. Our work points towards future approaches that may solve these problems; namely, using representations of code that can capture appropriate scope information, and using deep learning methods that are able to perform arithmetic operations.
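
To convey the generator idea, here is a toy sketch in the spirit of s-bAbI (this is not its actual code): emit C snippets that write at a random index into a fixed-size buffer, labeling each sample by construction.

```python
# Toy labeled-sample generator for buffer-overflow detection: the ground-truth
# label is known by construction, as in synthetic benchmarks like s-bAbI.
import random

TEMPLATE = """int main() {{
    char buf[{size}];
    buf[{index}] = 'x';
    return 0;
}}"""

def make_sample():
    size = random.randint(4, 64)
    index = random.randint(0, 2 * size)
    label = "unsafe" if index >= size else "safe"  # out-of-bounds write?
    return TEMPLATE.format(size=size, index=index), label

code, label = make_sample()
print(label, code, sep="\n")
```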

Steve Jobs, one of the greatest visionaries of our time, was quoted in 1996 as saying "a lot of times, people do not know what they want until you show it to them" [38], indicating that he advocated developing products based on human intuition rather than research. With the advancement of mobile devices, social networks, and the Internet of Things, enormous amounts of complex data, both structured and unstructured, are being captured in the hope of allowing organizations to make better business decisions, as data is now vital to an organization's success. These enormous amounts of data are referred to as Big Data, which enables a competitive advantage over rivals when processed and analyzed appropriately. However, Big Data analytics raises several concerns, including data-lifecycle management, privacy and security, and data representation. This paper reviews the fundamental concepts of Big Data and the data-storage domain, examines the MapReduce programming paradigm used to process these large datasets, and focuses on two case studies showing the effectiveness of Big Data analytics, presenting how it could be of even greater benefit in the future if handled appropriately.
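
The MapReduce paradigm the paper reviews reduces to three phases, which the classic word-count example captures: `mapper` emits (key, value) pairs, a shuffle groups them by key, and `reducer` aggregates each group. The single-process sketch below only illustrates the programming model; real frameworks such as Hadoop distribute these phases across a cluster.

```python
# Word count in the MapReduce style: map, shuffle (group by key), reduce.
from collections import defaultdict

def mapper(document: str):
    for word in document.split():
        yield word.lower(), 1          # emit (word, 1) per occurrence

def reducer(word: str, counts: list):
    return word, sum(counts)           # aggregate per key

documents = ["Big Data enables insight", "big data needs big storage"]

# Shuffle phase: group intermediate pairs by key.
groups = defaultdict(list)
for doc in documents:
    for word, count in mapper(doc):
        groups[word].append(count)

print(dict(reducer(w, c) for w, c in groups.items()))
```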
