唯美清纯另类亚洲一区二区_日韩少妇人妻VS一区二区三区_国产日韩欧美一区二区三_久热精品视频在线观看2一_精品一区二区三区在视频_99RE热免费精品视频观看_中文乱码字幕一区二区三区

Current developments in the statistics community suggest that modern statistics education should be structured holistically, that is, by allowing students to work with real data and to answer concrete statistical questions, but also by educating them about alternative frameworks, such as Bayesian inference. In this article, we describe how we incorporated such a holistic structure in a Bayesian research project on ordered binomial probabilities. The project was conducted with a group of three undergraduate psychology students who had basic knowledge of Bayesian statistics and programming, but lacked formal mathematical training. The research project aimed to (1) convey the basic mathematical concepts of Bayesian inference; (2) have students experience the entire empirical cycle including collection, analysis, and interpretation of data and (3) teach students open science practices.

相關內容

統計量

關注 3

學習器 · 泛函 · 學成 · Performer · 估計/估計量 ·

2022 年 4 月 19 日

Practical considerations for specifying a super learner

Rachael V. Phillips,Mark J. van der Laan,Hana Lee,Susan Gruber

from arxiv, This article has been submitted for publication as an Education Corner article in the International Journal of Epidemiology published by Oxford University Press

Common tasks encountered in epidemiology, including disease incidence estimation and causal inference, rely on predictive modeling. Constructing a predictive model can be thought of as learning a prediction function, i.e., a function that takes as input covariate data and outputs a predicted value. Many strategies for learning these functions from data are available, from parametric regressions to machine learning algorithms. It can be challenging to choose an approach, as it is impossible to know in advance which one is the most suitable for a particular dataset and prediction task at hand. The super learner (SL) is an algorithm that alleviates concerns over selecting the one "right" strategy while providing the freedom to consider many of them, such as those recommended by collaborators, used in related research, or specified by subject-matter experts. It is an entirely pre-specified and data-adaptive strategy for predictive modeling. To ensure the SL is well-specified for learning the prediction function, the analyst does need to make a few important choices. In this Education Corner article, we provide step-by-step guidelines for making these choices, walking the reader through each of them and providing intuition along the way. In doing so, we aim to empower the analyst to tailor the SL specification to their prediction task, thereby ensuring their SL performs as well as possible. A flowchart provides a concise, easy-to-follow summary of key suggestions and heuristics, based on our accumulated experience, and guided by theory.

INFORMS · COVID-19 · 可辨認的 · Processing（編程語言） · 設計 ·

2022 年 4 月 19 日

Where Was COVID-19 First Discovered? Designing a Question-Answering System for Pandemic Situations

Johannes Graf,Gino Lancho,Patrick Zschech,Kai Heinrich

from arxiv, Preprint accepted for archival and presentation at the 30th European Conference on Information Systems (ECIS 2022)

The COVID-19 pandemic is accompanied by a massive "infodemic" that makes it hard to identify concise and credible information for COVID-19-related questions, like incubation time, infection rates, or the effectiveness of vaccines. As a novel solution, our paper is concerned with designing a question-answering system based on modern technologies from natural language processing to overcome information overload and misinformation in pandemic situations. To carry out our research, we followed a design science research approach and applied Ingwersen's cognitive model of information retrieval interaction to inform our design process from a socio-technical lens. On this basis, we derived prescriptive design knowledge in terms of design requirements and design principles, which we translated into the construction of a prototypical instantiation. Our implementation is based on the comprehensive CORD-19 dataset, and we demonstrate our artifact's usefulness by evaluating its answer quality based on a sample of COVID-19 questions labeled by biomedical experts.

TinyML · Vision · Engineering · ML · 可辨認的 ·

2022 年 4 月 19 日

Software Engineering Approaches for TinyML based IoT Embedded Vision: A Systematic Literature Review

Shashank Bangalore Lakshman,Nasir U. Eisty

from arxiv, 8 pages, 3 figures

Internet of Things (IoT) has catapulted human ability to control our environments through ubiquitous sensing, communication, computation, and actuation. Over the past few years, IoT has joined forces with Machine Learning (ML) to embed deep intelligence at the far edge. TinyML (Tiny Machine Learning) has enabled the deployment of ML models for embedded vision on extremely lean edge hardware, bringing the power of IoT and ML together. However, TinyML powered embedded vision applications are still in a nascent stage, and they are just starting to scale to widespread real-world IoT deployment. To harness the true potential of IoT and ML, it is necessary to provide product developers with robust, easy-to-use software engineering (SE) frameworks and best practices that are customized for the unique challenges faced in TinyML engineering. Through this systematic literature review, we aggregated the key challenges reported by TinyML developers and identified state-of-art SE approaches in large-scale Computer Vision, Machine Learning, and Embedded Systems that can help address key challenges in TinyML based IoT embedded vision. In summary, our study draws synergies between SE expertise that embedded systems developers and ML developers have independently developed to help address the unique challenges in the engineering of TinyML based IoT embedded vision.

TOOLS · Extensibility · INFORMS · 會議 · AIM ·

2022 年 4 月 17 日

How are Software Repositories Mined? A Systematic Literature Review of Workflows, Methodologies, Reproducibility, and Tools

Adam Tutko,Austin Z. Henley,Audris Mockus

from arxiv, 11 Pages

With the advent of open source software, a veritable treasure trove of previously proprietary software development data was made available. This opened the field of empirical software engineering research to anyone in academia. Data that is mined from software projects, however, requires extensive processing and needs to be handled with utmost care to ensure valid conclusions. Since the software development practices and tools have changed over two decades, we aim to understand the state-of-the-art research workflows and to highlight potential challenges. We employ a systematic literature review by sampling over one thousand papers from leading conferences and by analyzing the 286 most relevant papers from the perspective of data workflows, methodologies, reproducibility, and tools. We found that an important part of the research workflow involving dataset selection was particularly problematic, which raises questions about the generality of the results in existing literature. Furthermore, we found a considerable number of papers provide little or no reproducibility instructions -- a substantial deficiency for a data-intensive field. In fact, 33% of papers provide no information on how their data was retrieved. Based on these findings, we propose ways to address these shortcomings via existing tools and also provide recommendations to improve research workflows and the reproducibility of research.

統計量 · 規范化的 · Networks · 標注 · MoDELS ·

2022 年 4 月 17 日

Free gs-monoidal categories and free Markov categories

Tobias Fritz,Wendong Liang

from arxiv, 35 pages. v2: references updated

Categorical probability has recently seen significant advances through the formalism of Markov categories, within which several classical theorems have been proven in entirely abstract categorical terms. Closely related to Markov categories are gs-monoidal categories, also known as CD categories. These omit a condition that implements the normalization of probability. Extending work of Corradini and Gadducci, we construct free gs-monoidal and free Markov categories generated by a collection of morphisms of arbitrary arity and coarity. For free gs-monoidal categories, this comes in the form of an explicit combinatorial description of their morphisms as structured cospans of labeled hypergraphs. These can be thought of as a formalization of gs-monoidal string diagrams ($=$term graphs) as a combinatorial data structure. We formulate the appropriate $2$-categorical universal property based on ideas of Walters and prove that our categories satisfy it. We expect our free categories to be relevant for computer implementations and we also argue that they can be used as statistical causal models generalizing Bayesian networks.

可辨認的 · Processing（編程語言） · NLP · 語言處理 · 自然語言處理 ·

2022 年 4 月 17 日

Natural Language Processing in-and-for Design Research

L Siddharth,Lucienne T. M. Blessing,Jianxi Luo

We review the scholarly contributions that utilise Natural Language Processing (NLP) techniques to support the design process. Using a heuristic approach, we gathered 223 articles that are published in 32 journals within the period 1991-present. We present state-of-the-art NLP in-and-for design research by reviewing these articles according to the type of natural language text sources: internal reports, design concepts, discourse transcripts, technical publications, consumer opinions, and others. Upon summarizing and identifying the gaps in these contributions, we utilise an existing design innovation framework to identify the applications that are currently being supported by NLP. We then propose a few methodological and theoretical directions for future NLP in-and-for design research.

數據集 · 學成 · 約束 · 強化學習 · Machine Learning ·

2022 年 4 月 15 日

Resource-Constrained Neural Architecture Search on Tabular Datasets

Chengrun Yang,Gabriel Bender,Hanxiao Liu,Pieter-Jan Kindermans,Madeleine Udell,Yifeng Lu,Quoc Le,Da Huang

from arxiv, 26 pages, 15 figures, 4 tables

The best neural architecture for a given machine learning problem depends on many factors: not only the complexity and structure of the dataset, but also on resource constraints including latency, compute, energy consumption, etc. Neural architecture search (NAS) for tabular datasets is an important but under-explored problem. Previous NAS algorithms designed for image search spaces incorporate resource constraints directly into the reinforcement learning rewards. In this paper, we argue that search spaces for tabular NAS pose considerable challenges for these existing reward-shaping methods, and propose a new reinforcement learning (RL) controller to address these challenges. Motivated by rejection sampling, when we sample candidate architectures during a search, we immediately discard any architecture that violates our resource constraints. We use a Monte-Carlo-based correction to our RL policy gradient update to account for this extra filtering step. Results on several tabular datasets show TabNAS, the proposed approach, efficiently finds high-quality models that satisfy the given resource constraints.

Machine Learning · 學成 · Conformer · Performer · CASES ·

2022 年 4 月 15 日

Characterizing metastable states with the help of machine learning

Pietro Novelli,Luigi Bonati,Massimiliano Pontil,Michele Parrinello

from arxiv, Main text: 10 pages, 4 figures. Supplementary Info: 4 pages, 5, figures

Present-day atomistic simulations generate long trajectories of ever more complex systems. Analyzing these data, discovering metastable states, and uncovering their nature is becoming increasingly challenging. In this paper, we first use the variational approach to conformation dynamics to discover the slowest dynamical modes of the simulations. This allows the different metastable states of the system to be located and organized hierarchically. The physical descriptors that characterize metastable states are discovered by means of a machine learning method. We show in the cases of two proteins, Chignolin and Bovine Pancreatic Trypsin Inhibitor, how such analysis can be effortlessly performed in a matter of seconds. Another strength of our approach is that it can be applied to the analysis of both unbiased and biased simulations.

可辨認的 · Principle · AI · Continuity · 泛化理論 ·

2021 年 10 月 4 日

Trustworthy AI: From Principles to Practices

Bo Li,Peng Qi,Bo Liu,Shuai Di,Jingen Liu,Jiquan Pei,Jinfeng Yi,Bowen Zhou

Fast developing artificial intelligence (AI) technology has enabled various applied systems deployed in the real world, impacting people's everyday lives. However, many current AI systems were found vulnerable to imperceptible attacks, biased against underrepresented groups, lacking in user privacy protection, etc., which not only degrades user experience but erodes the society's trust in all AI systems. In this review, we strive to provide AI practitioners a comprehensive guide towards building trustworthy AI systems. We first introduce the theoretical framework of important aspects of AI trustworthiness, including robustness, generalization, explainability, transparency, reproducibility, fairness, privacy preservation, alignment with human values, and accountability. We then survey leading approaches in these aspects in the industry. To unify the current fragmented approaches towards trustworthy AI, we propose a systematic approach that considers the entire lifecycle of AI systems, ranging from data acquisition to model development, to development and deployment, finally to continuous monitoring and governance. In this framework, we offer concrete action items to practitioners and societal stakeholders (e.g., researchers and regulators) to improve AI trustworthiness. Finally, we identify key opportunities and challenges in the future development of trustworthy AI systems, where we identify the need for paradigm shift towards comprehensive trustworthy AI systems.

優化器 · Extensibility · 最優化 · Automator · Neural Networks ·

2020 年 3 月 12 日

Hyper-Parameter Optimization: A Review of Algorithms and Applications

Tong Yu,Hong Zhu

Since deep neural networks were developed, they have made huge contributions to everyday lives. Machine learning provides more rational advice than humans are capable of in almost every aspect of daily life. However, despite this achievement, the design and training of neural networks are still challenging and unpredictable procedures. To lower the technical thresholds for common users, automated hyper-parameter optimization (HPO) has become a popular topic in both academic and industrial areas. This paper provides a review of the most essential topics on HPO. The first section introduces the key hyper-parameters related to model training and structure, and discusses their importance and methods to define the value range. Then, the research focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy especially for deep learning networks. This study next reviews major services and toolkits for HPO, comparing their support for state-of-the-art searching algorithms, feasibility with major deep learning frameworks, and extensibility for new modules designed by users. The paper concludes with problems that exist when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation with limited computational resources.