爱琴海论坛视频播放三免费,国产日黄色大片一区二区,波野结女一区二区三区高清AV,自拍色综合图第一页区

Multiple Choice examinations are a ubiquitous form of assessment that is used to measure the ability of candidates across various domains and tasks. Maintaining the quality of proposed questions is of great importance to test designers, and therefore newly proposed questions go through several pre-test evaluation stages before they can be deployed into real-world exams. This process is currently quite manual, which can lead to time lags in the question development cycle. Automating this process would lead to a large improvement in efficiency, however, current datasets do not contain sufficient pre-test analysis information. In this paper, we introduce CamChoice; a multiple-choice comprehension dataset with questions at different target levels, where questions have the true candidate selected options distributions. We introduce the task of candidate distribution matching, propose several evaluation metrics for the task, and demonstrate that automatic systems trained on RACE++ can be leveraged as baselines for our task. We further demonstrate that these automatic systems can be used for practical pre-test evaluation tasks such as detecting underperforming distractors, where our detection systems can automatically identify poor distractors that few candidates select. We release the data publicly for future research.

相關內容

Processing（編程語言）

關注 121

Processing 是一門開源編程語言和與之配套的集成開發環境（IDE）的名稱。Processing 在電子藝術和視覺設計社區被用來教授編程基礎，并運用于大量的新媒體和互動藝術作品中。

SAT · 可理解性 · Automator · 可約的 · Extensibility ·

2023 年 8 月 15 日

EduSAT: A Pedagogical Tool for Theory and Applications of Boolean Satisfiability

Yiqi Zhao,Ziyan An,Meiyi Ma,Taylor Johnson

Boolean Satisfiability (SAT) and Satisfiability Modulo Theories (SMT) are widely used in automated verification, but there is a lack of interactive tools designed for educational purposes in this field. To address this gap, we present EduSAT, a pedagogical tool specifically developed to support learning and understanding of SAT and SMT solving. EduSAT offers implementations of key algorithms such as the Davis-Putnam-Logemann-Loveland (DPLL) algorithm and the Reduced Order Binary Decision Diagram (ROBDD) for SAT solving. Additionally, EduSAT provides solver abstractions for five NP-complete problems beyond SAT and SMT. Users can benefit from EduSAT by experimenting, analyzing, and validating their understanding of SAT and SMT solving techniques. Our tool is accompanied by comprehensive documentation and tutorials, extensive testing, and practical features such as a natural language interface and SAT and SMT formula generators, which also serve as a valuable opportunity for learners to deepen their understanding. Our evaluation of EduSAT demonstrates its high accuracy, achieving 100% correctness across all the implemented SAT and SMT solvers. We release EduSAT as a python package in .whl file, and the source can be identified at //github.com/zhaoy37/SAT_Solver.

回合 · 可辨認的 · AIM · 可理解性 · TOOLS ·

2023 年 8 月 15 日

SoK: A Systematic Review of TEE Usage for Developing Trusted Applications

Arttu Paju,Muhammad Owais Javed,Juha Nurmi,Juha Savim?ki,Brian McGillion,Billy Bob Brumley

from arxiv, In The 18th International Conference on Availability, Reliability and Security (ARES 2023), August 29 -- September 01, 2023, Benevento, Italy. 15 pages

Trusted Execution Environments (TEEs) are a feature of modern central processing units (CPUs) that aim to provide a high assurance, isolated environment in which to run workloads that demand both confidentiality and integrity. Hardware and software components in the CPU isolate workloads, commonly referred to as Trusted Applications (TAs), from the main operating system (OS). This article aims to analyse the TEE ecosystem, determine its usability, and suggest improvements where necessary to make adoption easier. To better understand TEE usage, we gathered academic and practical examples from a total of 223 references. We summarise the literature and provide a publication timeline, along with insights into the evolution of TEE research and deployment. We categorise TAs into major groups and analyse the tools available to developers. Lastly, we evaluate trusted container projects, test performance, and identify the requirements for migrating applications inside them.

Networking · 樣例 · ML · Learning · Automator ·

2023 年 8 月 13 日

SoK: Realistic Adversarial Attacks and Defenses for Intelligent Network Intrusion Detection

Jo?o Vitorino,Isabel Pra?a,Eva Maia

from arxiv, 31 pages, 3 tables, 6 figures, Computers and Security journal

Machine Learning (ML) can be incredibly valuable to automate anomaly detection and cyber-attack classification, improving the way that Network Intrusion Detection (NID) is performed. However, despite the benefits of ML models, they are highly susceptible to adversarial cyber-attack examples specifically crafted to exploit them. A wide range of adversarial attacks have been created and researchers have worked on various defense strategies to safeguard ML models, but most were not intended for the specific constraints of a communication network and its communication protocols, so they may lead to unrealistic examples in the NID domain. This Systematization of Knowledge (SoK) consolidates and summarizes the state-of-the-art adversarial learning approaches that can generate realistic examples and could be used in real ML development and deployment scenarios with real network traffic flows. This SoK also describes the open challenges regarding the use of adversarial ML in the NID domain, defines the fundamental properties that are required for an adversarial example to be realistic, and provides guidelines for researchers to ensure that their future experiments are adequate for a real communication network.

Copulas · 近似 · 易處理的 · 精確推斷 · 方差減小 ·

2023 年 8 月 12 日

Flexible Variational Bayes based on a Copula of a Mixture

David Gunawan,Robert Kohn,David Nott

Variational Bayes methods approximate the posterior density by a family of tractable distributions whose parameters are estimated by optimisation. Variational approximation is useful when exact inference is intractable or very costly. Our article develops a flexible variational approximation based on a copula of a mixture, which is implemented by combining boosting, natural gradient, and a variance reduction method. The efficacy of the approach is illustrated by using simulated and real datasets to approximate multimodal, skewed and heavy-tailed posterior distributions, including an application to Bayesian deep feedforward neural network regression models.

Analysis · WEB · Performer · Google · Extensibility ·

2023 年 8 月 11 日

Through the Lens of Google CrUX: Dissecting Web Browsing Experience Across Devices and Countries

Jayasree Sengupta,Tanya Shreedhar,Dinh Nguyen,Robert Kramer,Vaibhav Bajpai

from arxiv, 7 pages, 4 figures and 1 table

User quality of experience in the context of Web browsing is being researched widely, with plenty of developments occurring alongside technological advances, not seldom driven by big industry players. With the huge reach and infrastructure of Google, the Chrome User Experience Report (CrUX) provides quantitative real-life measurement data of a vast magnitude. Analysis of this steadily expanding dataset aggregating different user experience metrics, yields tangible insights into actual trends and developments. Hence, this paper is the first to study the CrUX dataset from the viewpoint of relevant metrics by quantitative evaluation of users Web browsing experience across three device types and nine European countries. Analysis of data segmented by connection type in the device dimension shows desktops outperforming other device types for all metrics. Similar analysis in the country dimension, shows North European countries (Sweden, Finland) having maximum 4G connections (85.99%, 81.41% respectively) and steadily performing 25%-36% better at the 75th percentile across all metrics compared to the worst performing country. Such a high-level longitudinal analysis of real-life Web browsing experience provides an extensive base for future research.

學習器 · MoDELS · 搜索引擎營銷 · Machine Learning · 路徑 ·

2023 年 8 月 11 日

SLEM: Machine Learning for Path Modeling and Causal Inference with Super Learner Equation Modeling

Matthew J. Vowels

Causal inference is a crucial goal of science, enabling researchers to arrive at meaningful conclusions regarding the predictions of hypothetical interventions using observational data. Path models, Structural Equation Models (SEMs), and, more generally, Directed Acyclic Graphs (DAGs), provide a means to unambiguously specify assumptions regarding the causal structure underlying a phenomenon. Unlike DAGs, which make very few assumptions about the functional and parametric form, SEM assumes linearity. This can result in functional misspecification which prevents researchers from undertaking reliable effect size estimation. In contrast, we propose Super Learner Equation Modeling, a path modeling technique integrating machine learning Super Learner ensembles. We empirically demonstrate its ability to provide consistent and unbiased estimates of causal effects, its competitive performance for linear models when compared with SEM, and highlight its superiority over SEM when dealing with non-linear relationships. We provide open-source code, and a tutorial notebook with example usage, accentuating the easy-to-use nature of the method.

分解的 · 可辨認的 · TAP · Things · 控制器 ·

2023 年 8 月 11 日

Tapping into Privacy: A Study of User Preferences and Concerns on Trigger-Action Platforms

Piero Romare,Victor Morel,Farzaneh Karegar,Simone Fischer-Hübner

from arxiv, Accepted for publication at PST 2023

The Internet of Things (IoT) devices are rapidly increasing in popularity, with more individuals using Internet-connected devices that continuously monitor their activities. This work explores privacy concerns and expectations of end-users related to Trigger-Action platforms (TAPs) in the context of the Internet of Things (IoT). TAPs allow users to customize their smart environments by creating rules that trigger actions based on specific events or conditions. As personal data flows between different entities, there is a potential for privacy concerns. In this study, we aimed to identify the privacy factors that impact users' concerns and preferences for using IoT TAPs. To address this research objective, we conducted three focus groups with 15 participants and we extracted nine themes related to privacy factors using thematic analysis. Our participants particularly prefer to have control and transparency over the automation and are concerned about unexpected data inferences, risks and unforeseen consequences for themselves and for bystanders that are caused by the automation. The identified privacy factors can help researchers derive predefined and selectable profiles of privacy permission settings for IoT TAPs that represent the privacy preferences of different types of users as a basis for designing usable privacy controls for IoT TAPs.

數據集 · Automator · 標注 · Taxonomy · PubMed ·

2023 年 8 月 11 日

Ontology Enrichment from Texts: A Biomedical Dataset for Concept Discovery and Placement

Hang Dong,Jiaoyan Chen,Yuan He,Ian Horrocks

from arxiv, 5 pages, 1 figure, accepted for CIKM 2023. The dataset, data construction scripts, and baseline implementation are available at //zenodo.org/record/8228005 (Zenodo) and //github.com/KRR-Oxford/OET (GitHub)

Mentions of new concepts appear regularly in texts and require automated approaches to harvest and place them into Knowledge Bases (KB), e.g., ontologies and taxonomies. Existing datasets suffer from three issues, (i) mostly assuming that a new concept is pre-discovered and cannot support out-of-KB mention discovery; (ii) only using the concept label as the input along with the KB and thus lacking the contexts of a concept label; and (iii) mostly focusing on concept placement w.r.t a taxonomy of atomic concepts, instead of complex concepts, i.e., with logical operators. To address these issues, we propose a new benchmark, adapting MedMentions dataset (PubMed abstracts) with SNOMED CT versions in 2014 and 2017 under the Diseases sub-category and the broader categories of Clinical finding, Procedure, and Pharmaceutical / biologic product. We provide usage on the evaluation with the dataset for out-of-KB mention discovery and concept placement, adapting recent Large Language Model based methods.

Learning · Pattern Recognition · 可理解性 · 深度學習 · 模型構建 ·

2022 年 9 月 14 日

A Review and Roadmap of Deep Learning Causal Discovery in Different Variable Paradigms

Hang Chen,Keqing Du,Xinyu Yang,Chenguang Li

from arxiv, 26 pages,10 figures. arXiv admin note: text overlap with arXiv:2012.07138, arXiv:1605.08179, arXiv:2203.14237 by other authors

Understanding causality helps to structure interventions to achieve specific goals and enables predictions under interventions. With the growing importance of learning causal relationships, causal discovery tasks have transitioned from using traditional methods to infer potential causal structures from observational data to the field of pattern recognition involved in deep learning. The rapid accumulation of massive data promotes the emergence of causal search methods with brilliant scalability. Existing summaries of causal discovery methods mainly focus on traditional methods based on constraints, scores and FCMs, there is a lack of perfect sorting and elaboration for deep learning-based methods, also lacking some considers and exploration of causal discovery methods from the perspective of variable paradigms. Therefore, we divide the possible causal discovery tasks into three types according to the variable paradigm and give the definitions of the three tasks respectively, define and instantiate the relevant datasets for each task and the final causal model constructed at the same time, then reviews the main existing causal discovery methods for different tasks. Finally, we propose some roadmaps from different perspectives for the current research gaps in the field of causal discovery and point out future research directions.

SOFT · 硬性注意力 · 注意力機制 · Performer · MoDELS ·

2018 年 1 月 31 日

Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling

Tao Shen,Tianyi Zhou,Guodong Long,Jing Jiang,Sen Wang,Chengqi Zhang

from arxiv, 12 pages, 3 figures

Many natural language processing tasks solely rely on sparse dependencies between a few tokens in a sentence. Soft attention mechanisms show promising performance in modeling local/global dependencies by soft probabilities between every two tokens, but they are not effective and efficient when applied to long sentences. By contrast, hard attention mechanisms directly select a subset of tokens but are difficult and inefficient to train due to their combinatorial nature. In this paper, we integrate both soft and hard attention into one context fusion model, "reinforced self-attention (ReSA)", for the mutual benefit of each other. In ReSA, a hard attention trims a sequence for a soft self-attention to process, while the soft attention feeds reward signals back to facilitate the training of the hard one. For this purpose, we develop a novel hard attention called "reinforced sequence sampling (RSS)", selecting tokens in parallel and trained via policy gradient. Using two RSS modules, ReSA efficiently extracts the sparse dependencies between each pair of selected tokens. We finally propose an RNN/CNN-free sentence-encoding model, "reinforced self-attention network (ReSAN)", solely based on ReSA. It achieves state-of-the-art performance on both Stanford Natural Language Inference (SNLI) and Sentences Involving Compositional Knowledge (SICK) datasets.