We propose a tool, called FuzzingDriver, to generate dictionary tokens for coverage-based greybox fuzzers (CGF) from the codebase of any target program. FuzzingDriver does not add any overhead to the fuzzing job as it is run beforehand. We compared FuzzingDriver to Google dictionaries by fuzzing six open-source targets, and we found that FuzzingDriver consistently achieves higher code coverage in all tests. We also executed eight benchmarks on FuzzBench to demonstrate how utilizing FuzzingDriver's dictionaries can outperform six widely-used CGF fuzzers. In future work, investigating the impact of FuzzingDriver's dictionaries on improving bug coverage might prove important. Video demonstration: //www.youtube.com/watch?v=Y8j_KvfRrI8
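To make the idea of mining dictionary tokens from a codebase concrete, here is a minimal sketch. It is not FuzzingDriver's actual algorithm, which the abstract does not describe; it only assumes that string literals in the target's source are useful tokens and that the fuzzer accepts AFL-style `name="value"` dictionary lines. The helper names `extract_tokens` and `to_dictionary` are hypothetical.

```python
import re

# Matches a double-quoted literal, allowing backslash escapes inside it.
STRING_LITERAL = re.compile(r'"((?:[^"\\\n]|\\.)*)"')


def extract_tokens(source, min_len=2, max_len=32):
    """Return unique string literals from source text, in first-seen order.

    Assumption: literals containing escape sequences are skipped so that
    every token can be written to the dictionary without re-escaping.
    """
    seen = {}
    for match in STRING_LITERAL.finditer(source):
        token = match.group(1)
        if "\\" in token:
            continue
        if not (token.isascii() and token.isprintable()):
            continue
        if min_len <= len(token) <= max_len:
            seen.setdefault(token, None)
    return list(seen)


def to_dictionary(tokens):
    """Format tokens as name="value" lines (AFL-style dictionary layout)."""
    return "\n".join(f'token_{i}="{tok}"' for i, tok in enumerate(tokens))


if __name__ == "__main__":
    sample = 'if (!strcmp(tag, "IHDR")) puts("bad\\n"); const char *k = "PLTE";'
    print(to_dictionary(extract_tokens(sample)))
```

Because the extraction runs once before fuzzing starts, a step like this adds nothing to the cost of the fuzzing job itself, which matches the claim in the abstract.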

Related content

CGF

The international journal Computer Graphics Forum (CGF) is published jointly by the Eurographics Association and Wiley (formerly Blackwell). CGF is a leading journal for in-depth technical articles on computer graphics. Official website:

Question answering · Knowledge · INFORMS · Dataset · Benchmark

April 19, 2022

Expert Finding in Legal Community Question Answering

Arian Askari,Suzan Verberne,Gabriella Pasi

from arxiv, Accepted at Proceedings of the 44th European Conference on Information Retrieval, ECIR 2022. Please cite the published version

Expert finding has been well studied in community question answering (QA) systems in various domains. However, none of these studies addresses expert finding in the legal domain, where the goal is for citizens to find lawyers based on their expertise. In the legal domain, there is a large knowledge gap between the experts and the searchers, and the content on legal QA websites consists of a combination of formal and informal communication. In this paper, we propose methods for generating query-dependent textual profiles for lawyers covering several aspects including sentiment, comments, and recency. We combine query-dependent profiles with existing expert finding methods. Our experiments are conducted on a novel dataset gathered from an online legal QA service. We found that taking into account different lawyer profile aspects improves on the best baseline model. We make our dataset publicly available for future work.

Supervision · Annotation · Generalization theory · Interpretability · Category

April 19, 2022

Revisiting Vicinal Risk Minimization for Partially Supervised Multi-Label Classification Under Data Scarcity

Nanqing Dong,Jiayi Wang,Irina Voiculescu

from arxiv, Accepted by CVPR 2022 Workshop on Learning with Limited Labelled Data for Image and Video Understanding

Due to the high human cost of annotation, it is non-trivial to curate a large-scale medical dataset that is fully labeled for all classes of interest. Instead, it would be convenient to collect multiple small partially labeled datasets from different matching sources, where the medical images may have only been annotated for a subset of classes of interest. This paper offers an empirical understanding of an under-explored problem, namely partially supervised multi-label classification (PSMLC), where a multi-label classifier is trained with only partially labeled medical images. In contrast to the fully supervised counterpart, the partial supervision caused by medical data scarcity has non-trivial negative impacts on the model performance. A potential remedy could be augmenting the partial labels. Though vicinal risk minimization (VRM) has been a promising solution to improve the generalization ability of the model, its application to PSMLC remains an open question. To bridge the methodological gap, we provide the first VRM-based solution to PSMLC. The empirical results also provide insights into future research directions on partially supervised learning under data scarcity.
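The abstract does not say which vicinal distribution the authors use. As an illustration only, the sketch below shows mixup, one well-known instance of vicinal risk minimization, in which a training example is replaced by a convex combination of two examples and of their label vectors. The function names and the `alpha` default are assumptions, and the handling of missing (unannotated) labels that PSMLC requires is deliberately left out.

```python
import random


def mixup(x_a, y_a, x_b, y_b, lam):
    """Return the convex combination of two examples and their label vectors.

    lam is the mixing weight in [0, 1]; lam = 1 returns the first example.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def sample_lambda(alpha=1.0, rng=random):
    """Draw the mixing weight from Beta(alpha, alpha), as mixup commonly does."""
    return rng.betavariate(alpha, alpha)


if __name__ == "__main__":
    # Two toy feature vectors with multi-hot label vectors.
    x, y = mixup([1.0, 0.0], [1, 0], [0.0, 1.0], [0, 1], lam=0.25)
    print(x, y)
```

Training on such interpolated pairs in place of the raw examples is what turns empirical risk minimization into a vicinal variant.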

INFORMS · COVID-19 · Identifiable · Processing (programming language) · Design

April 19, 2022

Where Was COVID-19 First Discovered? Designing a Question-Answering System for Pandemic Situations

Johannes Graf,Gino Lancho,Patrick Zschech,Kai Heinrich

from arxiv, Preprint accepted for archival and presentation at the 30th European Conference on Information Systems (ECIS 2022)

The COVID-19 pandemic is accompanied by a massive "infodemic" that makes it hard to identify concise and credible information for COVID-19-related questions, like incubation time, infection rates, or the effectiveness of vaccines. As a novel solution, our paper is concerned with designing a question-answering system based on modern technologies from natural language processing to overcome information overload and misinformation in pandemic situations. To carry out our research, we followed a design science research approach and applied Ingwersen's cognitive model of information retrieval interaction to inform our design process from a socio-technical lens. On this basis, we derived prescriptive design knowledge in terms of design requirements and design principles, which we translated into the construction of a prototypical instantiation. Our implementation is based on the comprehensive CORD-19 dataset, and we demonstrate our artifact's usefulness by evaluating its answer quality based on a sample of COVID-19 questions labeled by biomedical experts.

COVID-19 · Micro-F1 · Benchmark · PubMed · Annotation

April 19, 2022

LitMC-BERT: transformer-based multi-label classification of biomedical literature with an application on COVID-19 literature curation

Qingyu Chen,Jingcheng Du,Alexis Allot,Zhiyong Lu

The rapid growth of biomedical literature poses a significant challenge for curation and interpretation. This has become more evident during the COVID-19 pandemic. LitCovid, a literature database of COVID-19 related papers in PubMed, has accumulated over 180,000 articles with millions of accesses. Approximately 10,000 new articles are added to LitCovid every month. A main curation task in LitCovid is topic annotation, where an article is assigned up to eight topics, e.g., Treatment and Diagnosis. The annotated topics have been widely used both in LitCovid (e.g., accounting for ~18% of total uses) and in downstream studies such as network generation. However, topic annotation has been a primary curation bottleneck due to the nature of the task and the rapid literature growth. This study proposes LitMC-BERT, a transformer-based multi-label classification method for biomedical literature. It uses a shared transformer backbone for all the labels while also capturing label-specific features and the correlations between label pairs. We compare LitMC-BERT with three baseline models on two datasets. Its micro-F1 and instance-based F1 are 5% and 4% higher than the current best results, respectively, and it requires only ~18% of the inference time of the Binary BERT baseline. The related datasets and models are available via //github.com/ncbi/ml-transformer.
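The two reported metrics aggregate differently, which the following sketch illustrates under common definitions: micro-F1 pools true positives, false positives, and false negatives over all article-label pairs, while instance-based F1 computes F1 per article and averages. The function names, the topic labels in the example, and the choice to score an article with no true and no predicted labels as 0.0 are assumptions; this is not the paper's evaluation code.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over parallel lists of label sets."""
    tp = fp = fn = 0
    for true, pred in zip(y_true, y_pred):
        tp += len(true & pred)   # labels correctly predicted
        fp += len(pred - true)   # labels predicted but not annotated
        fn += len(true - pred)   # labels annotated but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def instance_f1(y_true, y_pred):
    """Mean of per-instance F1 scores over parallel lists of label sets."""
    if not y_true:
        raise ValueError("need at least one instance")
    scores = []
    for true, pred in zip(y_true, y_pred):
        denom = len(true) + len(pred)
        scores.append(2 * len(true & pred) / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    gold = [{"Treatment", "Diagnosis", "Prevention"}, {"Treatment"}]
    pred = [{"Treatment", "Diagnosis", "Prevention"}, {"Diagnosis"}]
    print(micro_f1(gold, pred), instance_f1(gold, pred))
```

In this toy example the first article dominates the pooled counts, so micro-F1 is higher than instance-based F1, which weights both articles equally.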

INFORMS · Performer · Google Home · Code · Siri

April 18, 2022

Ingredient Extraction from Text in the Recipe Domain

Arkin Dharawat,Chris Doan

from arxiv, 8 pages, 2 figures

In recent years, there has been an increase in the number of devices with virtual assistants (e.g., Siri, Google Home, Alexa) in our living rooms and kitchens. As a result, these devices receive several queries about recipes. All these queries contain terms relating to a "recipe domain", i.e., they contain dish names, ingredients, cooking times, dietary preferences, etc. Extracting these recipe-relevant aspects from the query thus becomes important when it comes to addressing the user's information need. Our project focuses on extracting ingredients from such plain-text user utterances. Our best performing model was a fine-tuned BERT, which achieved an F1-score of 95.01. We have released all our code in a GitHub repository.

Lecture notes · Sparse connectivity · Learning · Correlation coefficient · Binary

April 16, 2022

ZeroIn: Characterizing the Data Distributions of Commits in Software Repositories

Kalyan Perumalla,Aradhana Soni,Rupam Dey,Steven Rich

from arxiv, 42 pages, 68 figures, 7 tables

Modern software development is based on a series of rapid incremental changes collaboratively made to large source code repositories by developers with varying experience and expertise levels. The ZeroIn project is aimed at analyzing the metadata of these dynamic phenomena, including the data on repositories, commits, and developers, to rapidly and accurately mark the quality of commits as they arrive at the repositories. In this context, the present article presents a characterization of the software development metadata in terms of the data distributions that best capture the trends in the datasets. Multiple datasets are analyzed for this purpose, including Stack Overflow on developers' features and GitHub data on over 452 million repositories with 16 million commits. This characterization is intended to make it possible to generate multiple synthetic datasets that can be used in training and testing novel machine learning-based solutions to improve the reliability of software even as it evolves. It is also aimed at helping the development process exploit the latent correlations among many key feature vectors across the aggregate space of repositories and developers. The data characterization of this article is designed to feed into the machine learning components of ZeroIn, including the application of binary classifiers for early flagging of buggy software commits and the development of graph-based learning methods to exploit sparse connectivity among the sets of repositories, commits, and developers.

Performer · GROUP · MoDELS · SimPLe · Knowledge

April 15, 2022

Decoupling Zero-Shot Semantic Segmentation

Jian Ding,Nan Xue,Gui-Song Xia,Dengxin Dai

from arxiv, Accepted by CVPR 2022

Zero-shot semantic segmentation (ZS3) aims to segment novel categories that have not been seen in training. Existing works formulate ZS3 as a pixel-level zero-shot classification problem, and transfer semantic knowledge from seen classes to unseen ones with the help of language models pre-trained only with texts. While simple, the pixel-level ZS3 formulation shows limited capability to integrate vision-language models that are often pre-trained with image-text pairs and currently demonstrate great potential for vision tasks. Inspired by the observation that humans often perform segment-level semantic labeling, we propose to decouple ZS3 into two sub-tasks: 1) a class-agnostic grouping task to group the pixels into segments, and 2) a zero-shot classification task on segments. The former task does not involve category information and can be directly transferred to group pixels for unseen classes. The latter task operates at the segment level and provides a natural way to leverage large-scale vision-language models pre-trained with image-text pairs (e.g., CLIP) for ZS3. Based on the decoupling formulation, we propose a simple and effective zero-shot semantic segmentation model, called ZegFormer, which outperforms the previous methods on ZS3 standard benchmarks by large margins, e.g., 22 points on PASCAL VOC and 3 points on COCO-Stuff in terms of mIoU for unseen classes. Code will be released at //github.com/dingjiansw101/ZegFormer.

Task-oriented dialogue system · INTERACT · Learning · Topic · Scenario

April 7, 2022

Interacting with Non-Cooperative User: A New Paradigm for Proactive Dialogue Policy

Wenqiang Lei,Yao Zhang,Feifan Song,Hongru Liang,Jiaxin Mao,Jiancheng Lv,Zhenglu Yang,Tat-Seng Chua

from arxiv, Accepted to SIGIR 2022

A proactive dialogue system can lead the conversation to a goal topic and has strong potential in bargaining, persuasion, and negotiation. The current corpus-based learning approach limits its practical application in real-world scenarios. To this end, we advance the study of proactive dialogue policy to a more natural and challenging setting, i.e., interacting dynamically with users. Further, we call attention to non-cooperative user behavior, in which the user talks about off-path topics when he/she is not satisfied with the previous topics introduced by the agent. We argue that the targets of reaching the goal topic quickly and maintaining high user satisfaction do not always converge, because the topics close to the goal and the topics the user prefers may not be the same. To address this issue, we propose a new solution named I-Pro that can learn a Proactive policy in the Interactive setting. Specifically, we learn the trade-off via a learned goal weight, which consists of four factors (dialogue turn, goal completion difficulty, user satisfaction estimation, and cooperative degree). The experimental results demonstrate that I-Pro significantly outperforms baselines in terms of effectiveness and interpretability.

Estimation/Estimator · Identifiable · Reducible · Mean absolute error · Recommender system

July 31, 2019

MeLU: Meta-Learned User Preference Estimator for Cold-Start Recommendation

Hoyeop Lee,Jinbae Im,Seongwon Jang,Hyunsouk Cho,Sehee Chung

from arxiv, Accepted as a full paper at KDD 2019

This paper proposes a recommender system that alleviates the cold-start problem by estimating user preferences from only a small number of items. To identify a user's preference in the cold state, existing recommender systems, such as Netflix, initially provide items to a user; we call those items evidence candidates. Recommendations are then made based on the items selected by the user. Previous recommendation studies have two limitations: (1) users who have consumed only a few items receive poor recommendations, and (2) inadequate evidence candidates are used to identify user preferences. We propose a meta-learning-based recommender system called MeLU to overcome these two limitations. Because meta-learning can rapidly adapt to a new task with a few examples, MeLU can estimate a new user's preferences from a few consumed items. In addition, we provide an evidence candidate selection strategy that determines distinguishing items for customized preference estimation. We validate MeLU with two benchmark datasets, and the proposed model reduces mean absolute error by at least 5.92% compared with two comparative models on the datasets. We also conduct a user study experiment to verify the evidence selection strategy.
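For readers unfamiliar with the reported metric, the sketch below computes mean absolute error and the relative reduction between two models. It assumes the quoted percentage is a relative reduction with respect to a baseline's MAE, which the abstract does not state explicitly; the function names and the toy ratings are hypothetical and unrelated to the paper's datasets.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between true and predicted ratings."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline_mae, model_mae):
    """Fraction by which model_mae improves on baseline_mae."""
    if baseline_mae <= 0:
        raise ValueError("baseline MAE must be positive")
    return (baseline_mae - model_mae) / baseline_mae


if __name__ == "__main__":
    ratings = [4, 3, 5, 2]                # toy ground-truth ratings
    baseline = [3, 3, 3, 3]               # e.g. always predicting the mean
    model = [4, 2, 4, 2]                  # a hypothetical better model
    b = mean_absolute_error(ratings, baseline)
    m = mean_absolute_error(ratings, model)
    print(b, m, relative_reduction(b, m))
```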

Entity · Performer · Graph · Knowledge graph · Question answering

January 16, 2018

EARL: Joint Entity and Relation Linking for Question Answering over Knowledge Graphs

Mohnish Dubey,Debayan Banerjee,Debanjan Chaudhuri,Jens Lehmann

To answer natural language questions over knowledge graphs, most processing pipelines involve entity and relation linking. Traditionally, entity linking and relation linking have been performed either as dependent sequential tasks or as independent parallel tasks. In this paper, we propose a framework called "EARL", which performs entity linking and relation linking as a single joint task. EARL uses a graph connection based solution to the problem. We model the linking task as an instance of the Generalised Travelling Salesman Problem (GTSP) and use GTSP approximate algorithm solutions. We later develop EARL, which uses a pair-wise graph-distance based solution to the problem. The system determines the best semantic connection between all keywords of the question by referring to a knowledge graph. This is achieved by exploiting the "connection density" between entity candidates and relation candidates. The "connection density" based solution performs on par with the approximate GTSP solution. We have empirically evaluated the framework on a dataset with 5000 questions. Our system surpasses state-of-the-art scores for the entity linking task, reporting an accuracy of 0.65 compared with 0.40 for the next best entity linker.