
Interrupt-driven programs are widely deployed in safety-critical embedded systems to perform hardware- and resource-dependent data operation tasks. The frequent use of interrupts in these systems can cause race conditions through interactions between application tasks and interrupt handlers (or between two interrupt handlers). Numerous program analysis and testing techniques have been proposed to detect races in multithreaded programs. Little work, however, has addressed race conditions related to hardware interrupts. In this paper, we present SDRacer, an automated framework that can detect, validate, and repair race conditions in interrupt-driven embedded software. It uses a combination of static analysis and symbolic execution to generate input data for exercising the potential races. It then employs virtual platforms to dynamically validate these races by forcing the interrupts to occur at the potential racing points. Finally, it provides repair candidates to eliminate the detected races. We evaluate SDRacer on nine real-world embedded programs written in C. The results show that SDRacer can precisely detect and successfully fix race conditions.
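To make the idea of forcing an interrupt at a potential racing point concrete, here is a toy simulation. It is not SDRacer's implementation: the two-step task, the `run_task` and `find_racing_points` helpers, and the shared counter are all hypothetical and chosen only for illustration. The task performs a non-atomic read-modify-write on a shared counter, and the handler updates the same counter.

```python
# Toy model (an assumption for illustration, not SDRacer itself): a task does a
# non-atomic read-modify-write on a shared counter; an interrupt handler
# increments the same counter once.

def run_task(fire_interrupt_at=None):
    """Run the task once, forcing the interrupt before step `fire_interrupt_at`.

    Step 0 is the task's read, step 1 is its write, and 2 means "after the task".
    """
    shared = {"counter": 0}

    def isr():
        # Interrupt handler: updates the shared counter.
        shared["counter"] += 1

    local = None
    for i, step in enumerate(["read", "write"]):
        if fire_interrupt_at == i:
            isr()
        if step == "read":
            local = shared["counter"]
        else:
            shared["counter"] = local + 1
    if fire_interrupt_at == 2:
        isr()
    return shared["counter"]


def find_racing_points():
    """Return the interrupt positions at which one of the two updates is lost."""
    expected = 2  # one increment from the task plus one from the handler
    return [i for i in range(3) if run_task(i) != expected]


print(run_task(1))           # 1: the handler's update is overwritten
print(find_racing_points())  # [1]
```

Only the position between the read and the write loses an update, which is why a dynamic validator has to control exactly where the interrupt fires rather than rely on chance timing.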

Related content


ChatGPT · Code · Performer · MoDELS · CASES

July 16, 2023

Unmasking the giant: A comprehensive evaluation of ChatGPT's proficiency in coding algorithms and data structures

Sayed Erfan Arefin,Tasnia Ashrafi Heya,Hasan Al-Qudah,Ynes Ineza,Abdul Serwadda

The transformative influence of Large Language Models (LLMs) is profoundly reshaping the Artificial Intelligence (AI) technology domain. Notably, ChatGPT distinguishes itself within these models, demonstrating remarkable performance in multi-turn conversations and exhibiting code proficiency across an array of languages. In this paper, we carry out a comprehensive evaluation of ChatGPT's coding capabilities based on what is to date the largest catalog of coding challenges. Our focus is on the Python programming language and problems centered on data structures and algorithms, two topics at the very foundations of Computer Science. We evaluate ChatGPT for its ability to generate correct solutions to the problems fed to it, its code quality, and the nature of run-time errors thrown by its code. Where ChatGPT code successfully executes but fails to solve the problem at hand, we look into patterns in the test cases passed in order to gain some insight into how wrong ChatGPT code is in these kinds of situations. To infer whether ChatGPT might have directly memorized some of the data that was used to train it, we methodically design an experiment to investigate this phenomenon. Making comparisons with human performance whenever feasible, we investigate all the above questions in the context of both its underlying learning models (GPT-3.5 and GPT-4), on a vast array of sub-topics within the main topics, and on problems having varying degrees of difficulty.

Scenario · Language modeling · MoDELS · Prompt · Performer

July 15, 2023

Leveraging Large Language Models to Generate Answer Set Programs

Adam Ishay,Zhun Yang,Joohyung Lee

from arxiv, 17 pages, KR 2023

Large language models (LLMs), such as GPT-3 and GPT-4, have demonstrated exceptional performance in various natural language processing tasks and have shown the ability to solve certain reasoning problems. However, their reasoning capabilities are limited and relatively shallow, despite the application of various prompting techniques. In contrast, formal logic is adept at handling complex reasoning, but translating natural language descriptions into formal logic is a challenging task that non-experts struggle with. This paper proposes a neuro-symbolic method that combines the strengths of large language models and answer set programming. Specifically, we employ an LLM to transform natural language descriptions of logic puzzles into answer set programs, using carefully designed prompts that carry out the conversion in a step-by-step manner. Surprisingly, with just a few in-context learning examples, LLMs can generate reasonably complex answer set programs. The majority of errors made are relatively simple and can be easily corrected by humans, thus enabling LLMs to effectively assist in the creation of answer set programs.

Automator · MoDELS · Trial · Unstructured · motivation

July 14, 2023

Jointly Extracting Interventions, Outcomes, and Findings from RCT Reports with LLMs

Somin Wadhwa,Jay DeYoung,Benjamin Nye,Silvio Amir,Byron C. Wallace

from arxiv, Accepted to appear at Machine Learning for Healthcare (MLHC), 2023

Results from Randomized Controlled Trials (RCTs) establish the comparative effectiveness of interventions, and are in turn critical inputs for evidence-based care. However, results from RCTs are presented in (often unstructured) natural language articles describing the design, execution, and outcomes of trials; clinicians must manually extract findings pertaining to interventions and outcomes of interest from such articles. This onerous manual process has motivated work on (semi-)automating extraction of structured evidence from trial reports. In this work we propose and evaluate a text-to-text model built on instruction-tuned Large Language Models (LLMs) to jointly extract Interventions, Outcomes, and Comparators (ICO elements) from clinical abstracts, and infer the associated results reported. Manual (expert) and automated evaluations indicate that framing evidence extraction as a conditional generation task and fine-tuning LLMs for this purpose realizes considerable ($\sim$20 point absolute F1 score) gains over the previous SOTA. We perform ablations and error analyses to assess aspects that contribute to model performance, and to highlight potential directions for further improvements. We apply our model to a collection of published RCTs through mid-2022, and release a searchable database of structured findings: bit.ly/joint-relations-extraction-mlhc

Cluster · Categorical data · Statistic · Cluster analysis · Scenario

July 14, 2023

A testing-based approach to assess the clusterability of categorical data

Lianyu Hu,Junjie Dong,Mudi Jiang,Yan Liu,Zengyou He

from arxiv, 19 pages, 13 figures

The objective of clusterability evaluation is to check whether a clustering structure exists within the data set. As a crucial yet often-overlooked issue in cluster analysis, it is essential to conduct such a test before applying any clustering algorithm. If a data set is unclusterable, any subsequent clustering analysis would not yield valid results. Despite its importance, the majority of existing studies focus on numerical data, leaving the clusterability evaluation issue for categorical data as an open problem. Here we present TestCat, a testing-based approach to assess the clusterability of categorical data in terms of an analytical $p$-value. The key idea underlying TestCat is that clusterable categorical data possess many strongly correlated attribute pairs and hence the sum of chi-squared statistics of all attribute pairs is employed as the test statistic for $p$-value calculation. We apply our method to a set of benchmark categorical data sets, showing that TestCat outperforms those solutions based on existing clusterability evaluation methods for numeric data. To the best of our knowledge, our work provides the first way to effectively recognize the clusterability of categorical data in a statistically sound manner.
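The test statistic described in the abstract, the sum of chi-squared statistics over all attribute pairs, can be sketched in a few lines. This is a minimal sketch, not the TestCat implementation: the function names are hypothetical, and the analytical $p$-value step is not reproduced because the abstract does not specify the null distribution.

```python
from collections import Counter
from itertools import combinations


def chi_squared(xs, ys):
    """Pearson chi-squared statistic of the contingency table of two categorical columns."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    stat = 0.0
    for a in row_totals:
        for b in col_totals:
            expected = row_totals[a] * col_totals[b] / n
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def pairwise_chi_squared_sum(rows):
    """Sum the chi-squared statistics over every pair of attributes (columns)."""
    columns = list(zip(*rows))
    return sum(
        chi_squared(columns[i], columns[j])
        for i, j in combinations(range(len(columns)), 2)
    )


# Two tight groups: every attribute pair is perfectly correlated.
clustered = [("a", "x", "p")] * 4 + [("b", "y", "q")] * 4
# Every combination of values appears exactly once: attributes are independent.
uniform = [(a, x, p) for a in "ab" for x in "xy" for p in "pq"]

print(pairwise_chi_squared_sum(clustered))  # 24.0
print(pairwise_chi_squared_sum(uniform))    # 0.0
```

The clustered toy data yields a large statistic and the fully crossed data yields zero, matching the intuition that clusterable categorical data contain many strongly correlated attribute pairs.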

Graph · Extensibility · Spanning subspace · contrastive · Class

July 14, 2023

Graph Search Trees and Their Leaves

Robert Scheffler

from arxiv, full version of an extended abstract to be published in the Proceedings of the 49th International Workshop on Graph-Theoretic Concepts in Computer Science (WG 2023) in Fribourg

Graph searches and their respective search trees are widely used in algorithmic graph theory. The problem of whether a given spanning tree can be a graph search tree has been considered for different searches, graph classes and search tree paradigms. Similarly, the question of whether a particular vertex can be visited last by some search has been studied extensively in recent years. We combine these two problems by considering the question of whether a vertex can be a leaf of a graph search tree. We show that for particular search trees, including DFS trees, this problem is easy if we allow the leaf to be the first vertex of the search ordering. We contrast this result by showing that the problem becomes hard for many searches, including DFS and BFS, if we forbid the leaf to be the first vertex. Additionally, we present several structural and algorithmic results for search tree leaves of chordal graphs.
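A small example shows why the leaf question depends on the search ordering. The sketch below is illustrative only: the graph, the `dfs_tree` and `leaves` helpers, and the convention that a leaf is a vertex without children in the rooted tree (so the start vertex is not reported here) are assumptions, not the paper's algorithm.

```python
def dfs_tree(graph, start):
    """Return the parent map of the DFS tree obtained by visiting neighbors in list order."""
    parent = {start: None}

    def visit(u):
        for v in graph[u]:
            if v not in parent:
                parent[v] = u
                visit(v)

    visit(start)
    return parent


def leaves(parent):
    """Vertices with no children in the rooted search tree, in sorted order."""
    internal = {p for p in parent.values() if p is not None}
    return sorted(v for v in parent if v not in internal)


# The same undirected graph with two different neighbor orderings at vertex 1.
order_a = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
order_b = {1: [3, 2], 2: [1, 3], 3: [1, 2, 4], 4: [3]}

print(leaves(dfs_tree(order_a, 1)))  # [4]
print(leaves(dfs_tree(order_b, 1)))  # [2, 4]
```

Vertex 2 is a leaf under one ordering and an internal vertex under the other, so deciding whether a vertex can be a leaf means reasoning over all valid search orderings rather than running a single search.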

Total return · android · Identifiable · Continuity · MoDELS

July 14, 2023

EavesDroid: Eavesdropping User Behaviors via OS Side-Channels on Smartphones

Quancheng Wang,Ming Tang,Jianming Fu

from arxiv, 15 pages, 25 figures

As the Internet of Things (IoT) continues to evolve, smartphones have become essential components of IoT systems. However, with the increasing amount of personal information stored on smartphones, user privacy is at risk of being compromised by malicious attackers. Although malware detection engines are commonly installed on smartphones against these attacks, attacks that can evade these defenses may still emerge. In this paper, we analyze the return values of system calls on Android smartphones and find two never-disclosed vulnerable return values that can leak fine-grained user behaviors. Based on this observation, we present EavesDroid, an application-embedded side-channel attack on Android smartphones that allows unprivileged attackers to accurately identify fine-grained user behaviors (e.g., viewing messages and playing videos) via on-screen operations. Our attack relies on the correlation between user behaviors and the return values associated with hardware and system resources. While this attack is challenging since these return values are susceptible to fluctuation and misalignment caused by many factors, we show that attackers can eavesdrop on fine-grained user behaviors using a CNN-GRU classification model that adopts min-max normalization and multiple return value fusion. Our experiments on different models and versions of Android smartphones demonstrate that EavesDroid can achieve 98% and 86% inference accuracy for 17 classes of user behaviors in the test set and real-world settings, respectively, highlighting the risk of our attack on user privacy. Finally, we recommend effective malware detection, carefully designed obfuscation methods, or restrictions on reading vulnerable return values to mitigate this attack.
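The preprocessing named in the abstract, min-max normalization followed by fusing several return-value traces, might look roughly like the sketch below. The abstract does not say how fusion is done, so stacking the normalized traces as channels per time step is an assumption, as are the function names and the sample numbers. The CNN-GRU classifier itself is not shown.

```python
def min_max_normalize(series):
    """Rescale a trace to the range [0, 1]; a constant trace maps to all zeros."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(x - lo) / (hi - lo) for x in series]


def fuse(traces):
    """Normalize each trace independently, then stack them as channels per time step.

    Stacking is an assumed fusion scheme chosen for illustration.
    """
    normalized = [min_max_normalize(t) for t in traces]
    return [list(step) for step in zip(*normalized)]


# Two hypothetical return-value traces sampled at the same three instants.
trace_a = [100, 150, 200]
trace_b = [4, 4, 8]

print(fuse([trace_a, trace_b]))  # [[0.0, 0.0], [0.5, 0.0], [1.0, 1.0]]
```

Normalizing each trace separately keeps a channel with large absolute values from dominating the others, which matters when the raw values fluctuate across devices and sessions.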

July 13, 2023

Towards Causal Analysis of Empirical Software Engineering Data: The Impact of Programming Languages on Coding Competitions

Carlo A. Furia,Richard Torkar,Robert Feldt

from arxiv, Added a missing arrow nickname -> size in Figure 5(a)

There is abundant observational data in the software engineering domain, whereas running large-scale controlled experiments is often practically impossible. Thus, most empirical studies can only report statistical correlations -- instead of potentially more insightful and robust causal relations. To support analyzing purely observational data for causal relations, and to assess any differences between purely predictive and causal models of the same data, this paper discusses some novel techniques based on structural causal models (such as directed acyclic graphs of causal Bayesian networks). Using these techniques, one can rigorously express, and partially validate, causal hypotheses; and then use the causal information to guide the construction of a statistical model that captures genuine causal relations -- such that correlation does imply causation. We apply these ideas to analyzing public data about programmer performance in Code Jam, a large world-wide coding contest organized by Google every year. Specifically, we look at the impact of different programming languages on a participant's performance in the contest. While the overall effect associated with programming languages is weak compared to other variables -- regardless of whether we consider correlational or causal links -- we found considerable differences between a purely associational and a causal analysis of the very same data. The takeaway message is that even an imperfect causal analysis of observational data can help answer the salient research questions more precisely and more robustly than with just purely predictive techniques -- where genuine causal effects may be confounded.

Multimodal · Taxonomy · MoDELS · Comprehensibility · Directed

February 9, 2023

A Comprehensive Survey on Multimodal Recommender Systems: Taxonomy, Evaluation, and Future Directions

Hongyu Zhou,Xin Zhou,Zhiwei Zeng,Lingzi Zhang,Zhiqi Shen

from arxiv, 33 pages, 4 figures

Recommendation systems have become popular and effective tools that help users discover items of interest by modeling user preferences and item properties based on implicit interactions (e.g., purchasing and clicking). Humans perceive the world by processing modality signals (e.g., audio, text and image), which has inspired researchers to build recommender systems that can understand and interpret data from different modalities. Such models can capture the hidden relations between different modalities and possibly recover complementary information that cannot be captured by a uni-modal approach or by implicit interactions. The goal of this survey is to provide a comprehensive review of recent research efforts on multimodal recommendation. Specifically, it presents a clear pipeline with the commonly used techniques in each step and classifies the models by the methods used. Additionally, a code framework has been designed that helps researchers new to this area understand the principles and techniques, and easily run the SOTA models. Our framework is located at: //github.com/enoche/MMRec

Machine Translation · Estimation/Estimator · MoDELS · Statistic

February 22, 2022

An Overview on Machine Translation Evaluation

Lifeng Han

from arxiv, 35 pages, in Chinese

Since the 1950s, machine translation (MT) has become one of the important tasks in AI research and development, and has passed through several periods and stages, including rule-based methods, statistical methods, and recently proposed neural network-based learning methods. Accompanying these staged leaps is the research and development of MT evaluation, especially the important role of evaluation methods in statistical and neural translation research. The task of MT evaluation is not only to assess the quality of machine translation, but also to give timely feedback to machine translation researchers on the problems in machine translation itself and on how to improve and optimise it. In some practical application fields, such as when reference translations are absent, quality estimation of machine translation plays an important role as an indicator of the credibility of automatically translated target-language text. This report mainly covers the following: a brief history of machine translation evaluation (MTE), the classification of research methods on MTE, and the cutting-edge progress, including human evaluation, automatic evaluation, and evaluation of evaluation methods (meta-evaluation). Both human and automatic evaluation include reference-based and reference-independent methods; automatic evaluation methods include traditional n-gram string matching, models applying syntax and semantics, and deep learning models; evaluation of evaluation methods includes estimating the credibility of human evaluations, the reliability of automatic evaluation, the reliability of the test set, etc. Advances in cutting-edge evaluation methods include task-based evaluation, the use of pre-trained language models based on big data, and lightweight optimisation models using distillation techniques.
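As an illustration of the "traditional n-gram string matching" family mentioned above, the following sketch computes a clipped n-gram precision between a candidate translation and a single reference. It is a simplified building block in the spirit of such metrics, not any specific metric from the report; whitespace tokenization, the function names, and the example sentences are assumptions.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n):
    """Fraction of candidate n-grams found in the reference.

    Each n-gram is credited at most as many times as it occurs in the reference.
    """
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


candidate = "the cat sat on the mat"
reference = "the cat is on the mat"

print(clipped_precision(candidate, reference, 1))  # 5 of 6 unigrams match
print(clipped_precision(candidate, reference, 2))  # 3 of 5 bigrams match
```

Pure string matching gives no credit for synonyms or paraphrases, which is one reason the report also covers syntax- and semantics-aware models and deep learning metrics.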

state-of-the-art · Knowledge base · Learned · Basis · Collaborative filtering

March 22, 2018

Learning over Knowledge-Base Embeddings for Recommendation

Yongfeng Zhang,Qingyao Ai,Xu Chen,Pengfei Wang

State-of-the-art recommendation algorithms -- especially the collaborative filtering (CF) based approaches with shallow or deep models -- usually work with various unstructured information sources for recommendation, such as textual reviews, visual images, and various implicit or explicit feedback. Though structured knowledge bases were considered in content-based approaches, they have been largely neglected recently due to the availability of vast amounts of data and the learning power of many complex models. However, structured knowledge bases exhibit unique advantages in personalized recommendation systems. When the explicit knowledge about users and items is considered for recommendation, the system could provide highly customized recommendations based on users' historical behaviors. A great challenge in using knowledge bases for recommendation is how to integrate large-scale structured and unstructured data while taking advantage of collaborative filtering for highly accurate performance. Recent achievements in knowledge base embedding shed light on this problem, making it possible to learn user and item representations while preserving the structure of their relationship with external knowledge. In this work, we propose to reason over knowledge base embeddings for personalized recommendation. Specifically, we propose a knowledge base representation learning approach to embed heterogeneous entities for recommendation. Experimental results on real-world data verified the superior performance of our approach compared with state-of-the-art baselines.