
Data management, which encompasses activities and strategies related to the storage, organization, and description of data and other research materials, helps ensure the usability of datasets -- both for the original research team and for others. When contextualized as part of a research workflow, data management practices can provide an avenue for promoting other practices, including those related to reproducibility and those that fall under the umbrella of open science. Not all research data needs to be shared, but all should be well managed to establish a record of the research process.


Object-oriented programming (OOP) is one of the most popular paradigms used for building software systems. However, despite its industrial and academic popularity, OOP still lacks a formal apparatus similar to the lambda calculus on which functional programming is based. There have been a number of attempts to formalize OOP, but none of them managed to cover all the features available in modern OO programming languages, such as C++ or Java. We have made yet another attempt and created phi-calculus. We also created EOLANG (also called EO), an experimental programming language based on phi-calculus.
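As a rough illustration of the kind of object model such a calculus describes, the hypothetical Python sketch below represents an object as a mapping from attribute names to values, with attribute lookup delegated to a decoratee. This is only an illustration of decoration-based objects in general; it is not phi-calculus notation or EO syntax.

```python
# Hypothetical sketch: objects as attribute mappings with decoration.
# This is NOT phi-calculus or EO syntax, only a loose illustration of
# the kind of class-free object model such a calculus formalizes.

class Obj:
    """An object is a set of named attributes; it may also name a
    decoratee to which unknown attribute lookups are delegated."""

    def __init__(self, decoratee=None, **attrs):
        self.attrs = attrs
        self.decoratee = decoratee

    def take(self, name):
        # Attribute access: look locally first, then in the decoratee.
        if name in self.attrs:
            return self.attrs[name]
        if self.decoratee is not None:
            return self.decoratee.take(name)
        raise AttributeError(name)


# A "book" object and a second object that decorates it,
# overriding only the price attribute.
book = Obj(title="Object Thinking", price=30)
discounted = Obj(decoratee=book, price=25)

print(discounted.take("title"))  # delegated to the decoratee -> "Object Thinking"
print(discounted.take("price"))  # overridden locally -> 25
```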

Replication analysis is widely used in many fields of study. Once a study is published, many other researchers conduct the same or very similar analyses to confirm the reliability of the published research. However, what if the data are confidential? In particular, if the data sets used for the studies are confidential, we cannot release the results of replication analyses to any entity without permission to access the data sets; otherwise, serious privacy leakage may result, especially when the published study and the replication studies use similar or common data sets. For example, examining the influence of the treatment on outliers can cause serious leakage of information about those outliers. In this paper, we build two frameworks for replication analysis using a differentially private Bayesian approach. We formalize our questions of interest and illustrate the properties of our methods through a combination of theoretical analysis and simulation to show the feasibility of our approach. We also provide guidance on the choice of parameters and the interpretation of the results.
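The paper's frameworks are Bayesian and more involved, but the basic mechanics of releasing a replicated statistic under differential privacy can be sketched with the standard Laplace mechanism. The data, clipping bounds, sensitivity, and epsilon below are hypothetical choices made purely for this example.

```python
# Minimal sketch of releasing a replicated summary statistic under
# epsilon-differential privacy via the standard Laplace mechanism.
# This is a generic illustration, not the paper's Bayesian frameworks.
import numpy as np

rng = np.random.default_rng(0)

def laplace_release(value, sensitivity, epsilon):
    """Add Laplace(sensitivity / epsilon) noise to a statistic."""
    scale = sensitivity / epsilon
    return value + rng.laplace(loc=0.0, scale=scale)

# Hypothetical replication: recompute a treatment-effect estimate on
# confidential data, then privatize it before releasing it.
outcomes_treated = rng.normal(1.2, 1.0, size=500)
outcomes_control = rng.normal(1.0, 1.0, size=500)
effect = outcomes_treated.mean() - outcomes_control.mean()

# Sensitivity of the mean difference if each outcome is clipped to [0, 5]
# (an assumption made purely for this example).
sensitivity = 5.0 / 500 + 5.0 / 500
private_effect = laplace_release(effect, sensitivity, epsilon=1.0)
print(f"raw effect {effect:.3f}, private release {private_effect:.3f}")
```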

Ensuring the quality of automated driving systems is a major challenge facing the automotive industry. In this context, quality denotes the degree to which an object meets expectations and requirements. In particular, automated vehicles at SAE Levels 4 and 5 will be expected to operate safely in various contexts and complex situations without misbehavior. Thus, a systematic approach is needed to demonstrate their safe operation. One way to address this challenge is simulation-based testing, since purely physical testing is not feasible. During simulation-based testing, the data used to evaluate the actual quality of an automated driving system are generated by a simulation. However, to rely on these simulation data, the overall simulation, including its simulation models, must itself provide a certain quality level. This quality level depends on the intended purpose for which the generated simulation data are to be used. Therefore, three categories of quality can be considered: the quality of the automated driving system, and simulation quality, which consists of simulation model quality and scenario quality. Hence, quality must be determined and evaluated at various steps in developing and testing automated driving systems, the overall simulation, and the simulation models used within it. In this paper, we propose a taxonomy that supports a better understanding of the concept of quality in the development and testing process, providing a clear separation of concerns and insight into where further testing is needed -- both for automated driving systems and for the simulation, including its simulation models and the scenarios used for testing.
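To make the separation between the three quality categories concrete, the following hypothetical Python sketch captures them as a small data structure. The field names, threshold, and rule are illustrative placeholders, not the paper's taxonomy terms.

```python
# Hypothetical sketch of the three quality categories described above.
# Field names and the threshold rule are illustrative, not taken from the paper.
from dataclasses import dataclass

@dataclass
class SimulationQuality:
    simulation_model_quality: float  # e.g., validity of sensor/vehicle models
    scenario_quality: float          # e.g., relevance and coverage of test scenarios

@dataclass
class QualityAssessment:
    ads_quality: float               # quality of the automated driving system itself
    simulation_quality: SimulationQuality

    def trustworthy_evidence(self, threshold: float = 0.9) -> bool:
        """Simulation results only count as evidence for ADS quality if the
        simulation itself (models and scenarios) meets the required level."""
        sq = self.simulation_quality
        return min(sq.simulation_model_quality, sq.scenario_quality) >= threshold

assessment = QualityAssessment(
    ads_quality=0.95,
    simulation_quality=SimulationQuality(0.92, 0.88),
)
print(assessment.trustworthy_evidence())  # False: scenario quality below threshold
```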

Virtual Research Environments (VREs) provide user-centric support across the lifecycle of research activities, e.g., discovering and accessing research assets, or composing and executing application workflows. A typical VRE is often implemented as an integrated environment, which includes a catalog of research assets, a workflow management system, a data management framework, and tools for enabling collaboration among users. Notebook environments, such as Jupyter, allow researchers to rapidly prototype scientific code and share their experiments as online accessible notebooks. Jupyter supports several languages popular among data scientists, such as Python, R, and Julia. However, such notebook environments do not have seamless support for running heavy computations on remote infrastructure or for finding and accessing software code inside notebooks. This paper investigates the gap between a notebook environment and a VRE and proposes an embedded VRE solution for the Jupyter environment called Notebook-as-a-VRE (NaaVRE). The NaaVRE solution provides functional components via a component marketplace and allows users to create a customized VRE on top of the Jupyter environment. From the VRE, a user can search for research assets (data, software, and algorithms), compose workflows, manage the lifecycle of an experiment, and share results with other users in the community. We demonstrate how such a solution can enhance a legacy workflow that uses Light Detection and Ranging (LiDAR) data from country-wide airborne laser scanning surveys to derive geospatial data products of ecosystem structure at high resolution over broad spatial extents. This enables users to scale out the processing of multi-terabyte LiDAR point clouds for ecological applications to more data sources in a distributed cloud environment.
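The general pattern such an embedded VRE enables can be sketched as follows: a function prototyped in a notebook is described as a reusable component and handed to a remote workflow backend instead of running locally. The classes, function names, and backend below are hypothetical and do not reflect the actual NaaVRE API.

```python
# Hypothetical sketch (not the actual NaaVRE API) of turning notebook code
# into workflow components that are submitted to a remote backend.
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Component:
    name: str
    func: Callable
    inputs: List[str]
    outputs: List[str]

@dataclass
class Workflow:
    steps: List[Component] = field(default_factory=list)

    def add(self, component: Component) -> "Workflow":
        self.steps.append(component)
        return self

    def submit(self, backend: str) -> None:
        # In a real VRE this would containerize each step and submit it to a
        # remote workflow engine; here we only print the execution plan.
        print(f"Submitting {len(self.steps)} step(s) to {backend}:")
        for step in self.steps:
            print(f"  {step.name}: {step.inputs} -> {step.outputs}")

def normalize_points(tile: str) -> str:
    """Placeholder for a LiDAR preprocessing step prototyped in a notebook."""
    return f"{tile}.normalized"

wf = Workflow().add(
    Component("normalize", normalize_points, inputs=["tile.laz"], outputs=["tile.normalized"])
)
wf.submit(backend="remote-cloud-cluster")
```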

Spectre vulnerabilities violate our fundamental assumptions about architectural abstractions, allowing attackers to steal sensitive data despite previously state-of-the-art countermeasures. To defend against Spectre, developers of verification tools and compiler-based mitigations are forced to reason about microarchitectural details such as speculative execution. To aid developers in defending against these attacks in a principled way, the research community has sought formal foundations for speculative execution upon which to rebuild provable security guarantees. This paper systematizes the community's current knowledge about software verification and mitigation for Spectre. We study state-of-the-art software defenses, both with and without associated formal models, and use a cohesive framework to compare the security properties each defense provides. We explore a wide variety of tradeoffs in the expressiveness of formal frameworks, the complexity of defense tools, and the resulting security guarantees. As a result of our analysis, we suggest practical choices for developers of analysis and mitigation tools, and we identify several open problems in this area to guide future work on grounded software defenses.

Computer architecture and systems have long been optimized to enable the efficient execution of machine learning (ML) algorithms and models. Now it is time to reconsider the relationship between ML and systems and let ML transform the way computer architecture and systems are designed. This has a twofold meaning: improving designers' productivity and completing the virtuous cycle. In this paper, we present a comprehensive review of work that applies ML to system design, which can be grouped into two major categories: ML-based modelling, which involves predicting performance metrics or other criteria of interest, and ML-based design methodology, which directly leverages ML as the design tool. For ML-based modelling, we discuss existing studies according to their target level of the system, ranging from the circuit level to the architecture/system level. For ML-based design methodology, we follow a bottom-up path to review current work, covering (micro-)architecture design (memory, branch prediction, NoC), coordination between architecture/system and workload (resource allocation and management, data center management, and security), compilers, and design automation. We further provide a future vision of opportunities and potential directions, and envision that applying ML to computer architecture and systems will thrive in the community.
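As a minimal sketch of what "ML-based modelling" means in this context, the example below fits a regressor that predicts a performance metric of a design point from its configuration parameters. The features, the synthetic target, and the model choice are hypothetical stand-ins; real studies learn from measured or simulated data.

```python
# Minimal sketch of ML-based modelling: predict a performance metric of a
# design point from its configuration. Features and data are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Hypothetical design-space features: cache size (KB), issue width, core count.
X = rng.uniform([64, 1, 1], [4096, 8, 64], size=(500, 3))
# Synthetic "IPC-like" target with noise, standing in for simulator output.
y = 0.3 * np.log2(X[:, 0]) + 0.2 * X[:, 1] + 0.05 * X[:, 2] + rng.normal(0, 0.1, 500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_train, y_train)
print(f"R^2 on held-out design points: {model.score(X_test, y_test):.3f}")
```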

In many applications, such as recommender systems, online advertising, and product search, click-through rate (CTR) prediction is a critical task, because its accuracy has a direct impact on both platform revenue and user experience. In recent years, with the prevalence of deep learning, CTR prediction has been widely studied in both academia and industry, resulting in an abundance of deep CTR models. Unfortunately, there is still a lack of standardized benchmarks and uniform evaluation protocols for CTR prediction, which leads to non-reproducible and even inconsistent experimental results among these studies. In this paper, we present an open benchmark (namely FuxiCTR) for reproducible research and provide a rigorous comparison of different models for CTR prediction. Specifically, we ran over 4,600 experiments, totaling more than 12,000 GPU hours, in a uniform framework to re-evaluate 24 existing models on two widely used datasets, Criteo and Avazu. Surprisingly, our experiments show that the differences between many models are smaller than expected and sometimes even inconsistent with what is reported in the literature. We believe that our benchmark will not only allow researchers to conveniently gauge the effectiveness of new models, but also promote good practices for fair comparison with the state of the art. We will release all the code and benchmark settings.
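For readers unfamiliar with deep CTR models, the sketch below shows the common skeleton they share: embeddings over categorical feature fields followed by an MLP, trained with binary cross-entropy on click labels. It uses synthetic data and is a generic illustration, not a re-implementation of any of the 24 benchmarked models or of the FuxiCTR framework itself.

```python
# Minimal sketch of a deep CTR model: field embeddings + MLP, logloss training.
import torch
import torch.nn as nn

NUM_FIELDS, VOCAB_SIZE, EMB_DIM = 5, 1000, 16

class SimpleDeepCTR(nn.Module):
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(VOCAB_SIZE, EMB_DIM)
        self.mlp = nn.Sequential(
            nn.Linear(NUM_FIELDS * EMB_DIM, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x):                # x: (batch, NUM_FIELDS) of feature ids
        emb = self.embedding(x).flatten(1)
        return self.mlp(emb).squeeze(1)  # click logits

model = SimpleDeepCTR()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

# Synthetic batch of hashed categorical features and click labels.
x = torch.randint(0, VOCAB_SIZE, (256, NUM_FIELDS))
y = torch.randint(0, 2, (256,)).float()

for step in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    print(f"step {step}: logloss {loss.item():.4f}")
```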

Over the past few years, we have seen fundamental breakthroughs in core problems in machine learning, largely driven by advances in deep neural networks. At the same time, the amount of data collected in a wide array of scientific domains is dramatically increasing in both size and complexity. Taken together, this suggests many exciting opportunities for deep learning applications in scientific settings. But a significant challenge is simply knowing where to start. The sheer breadth and diversity of deep learning techniques make it difficult to determine which scientific problems might be most amenable to these methods, or which specific combination of methods might offer the most promising first approach. In this survey, we focus on addressing this central issue, providing an overview of many widely used deep learning models, spanning visual, sequential, and graph-structured data, the associated tasks, and different training methods, along with techniques for using deep learning with less data and for better interpreting these complex models -- two central considerations for many scientific use cases. We also include overviews of the full design process, implementation tips, and links to a wealth of tutorials, research summaries, and open-sourced deep learning pipelines and pretrained models developed by the community. We hope that this survey will help accelerate the use of deep learning across different scientific domains.
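One common "less data" technique alluded to above is transfer learning: fine-tuning only the final layer of a pretrained network on a small domain-specific dataset. The sketch below illustrates this in a hedged way; the class count and the random tensors standing in for a small labelled batch are hypothetical placeholders.

```python
# Minimal sketch of transfer learning: freeze a pretrained backbone and
# train only a new classification head on a small target dataset.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 5  # e.g., a small set of scientific image categories

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False          # freeze the pretrained backbone
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # new trainable head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Stand-in for a small labelled batch from the target scientific domain.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, NUM_CLASSES, (8,))

optimizer.zero_grad()
loss = loss_fn(model(images), labels)
loss.backward()
optimizer.step()
print(f"fine-tuning loss: {loss.item():.4f}")
```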

Since deep neural networks were developed, they have made huge contributions to everyday life. In almost every aspect of daily life, machine learning can provide more rational advice than humans are capable of. However, despite this achievement, the design and training of neural networks remain challenging and unpredictable procedures. To lower the technical barrier for non-expert users, automated hyper-parameter optimization (HPO) has become a popular topic in both academia and industry. This paper provides a review of the most essential topics in HPO. The first section introduces the key hyper-parameters related to model training and structure and discusses their importance and methods for defining their value ranges. The paper then focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy, especially for deep learning networks. The study next reviews major services and toolkits for HPO, comparing their support for state-of-the-art search algorithms, their compatibility with major deep learning frameworks, and their extensibility with new modules designed by users. The paper concludes with the problems that arise when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation with limited computational resources.
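To make the HPO workflow concrete, the sketch below defines a search space and runs an optimization study with Optuna, used here only as one example of such a toolkit (no claim is made that it is among those surveyed). The objective function is a synthetic stand-in; a real objective would train a model with the suggested hyper-parameters and return its validation loss.

```python
# Minimal sketch of automated hyper-parameter optimization with Optuna.
import optuna

def objective(trial):
    # Define the search space for three typical hyper-parameters.
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    n_layers = trial.suggest_int("n_layers", 1, 4)
    dropout = trial.suggest_float("dropout", 0.0, 0.5)
    # Synthetic stand-in for a validation loss; a real objective would
    # train and evaluate a model with these hyper-parameters.
    return (lr - 1e-3) ** 2 + 0.01 * n_layers + 0.1 * dropout

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params, study.best_value)
```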

Time Series Classification (TSC) is an important and challenging problem in data mining. With the increasing availability of time series data, hundreds of TSC algorithms have been proposed. Among these methods, only a few have considered Deep Neural Networks (DNNs) for this task. This is surprising, as deep learning has seen very successful applications in recent years. DNNs have indeed revolutionized the field of computer vision, especially with the advent of novel, deeper architectures such as residual and convolutional neural networks. Apart from images, sequential data such as text and audio can also be processed with DNNs to reach state-of-the-art performance for document classification and speech recognition. In this article, we study the current state-of-the-art performance of deep learning algorithms for TSC by presenting an empirical study of the most recent DNN architectures for TSC. We give an overview of the most successful deep learning applications in various time series domains under a unified taxonomy of DNNs for TSC. We also provide an open-source deep learning framework to the TSC community, in which we implemented each of the compared approaches and evaluated them on a univariate TSC benchmark (the UCR/UEA archive) and 12 multivariate time series datasets. By training 8,730 deep learning models on 97 time series datasets, we present the most exhaustive study of DNNs for TSC to date.
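For readers new to applying DNNs to time series, the sketch below shows a small 1D convolutional classifier for univariate series, in the spirit of the convolutional architectures discussed above. It uses synthetic data and is an illustration only, not one of the benchmarked architectures from the study.

```python
# Minimal sketch of a 1D convolutional classifier for univariate time series.
import torch
import torch.nn as nn

SERIES_LENGTH, NUM_CLASSES = 128, 3

model = nn.Sequential(
    nn.Conv1d(1, 64, kernel_size=8, padding="same"), nn.BatchNorm1d(64), nn.ReLU(),
    nn.Conv1d(64, 64, kernel_size=5, padding="same"), nn.BatchNorm1d(64), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),             # global average pooling over time
    nn.Flatten(),
    nn.Linear(64, NUM_CLASSES),
)

# Synthetic batch: 16 univariate series of length SERIES_LENGTH.
x = torch.randn(16, 1, SERIES_LENGTH)
y = torch.randint(0, NUM_CLASSES, (16,))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()
optimizer.step()
print(f"training loss on the synthetic batch: {loss.item():.4f}")
```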
