
Phishing attacks are an increasingly potent web-based threat, with nearly 1.5 million such websites created every month. In this work, we present the first study toward identifying such attacks through phishing reports shared by users on Twitter. We evaluated over 16.4k such reports posted by 701 Twitter accounts between June and August 2021, containing 11.1k unique URLs, and analyzed their effectiveness using various quantitative and qualitative measures. Our findings indicate that not only do these users share a high volume of legitimate phishing URLs, but their reports also contain more information about the phishing websites (which can expedite identifying and removing these threats) than two popular open-source phishing feeds, PhishTank and OpenPhish. We also notice that the reported websites had very little overlap with the URLs in those feeds and remained active for longer periods of time. Despite these attributes, however, these reports receive very little interaction from other Twitter users, especially from the domains and organizations targeted by the reported URLs. Moreover, nearly 31% of the URLs were still active even a week after being reported, and 27% of them were detected by very few anti-phishing tools, suggesting that a large majority of these reports go undiscovered even though most followers of these accounts are security-focused users. Thus, this work highlights the effectiveness of these reports and the benefits of using them as an open-source knowledge base for identifying new phishing websites.
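A minimal sketch of the kind of comparison described above: extracting URLs from report tweets and measuring their overlap with an open-source feed. The tweet texts, feed entries, and regex-based extraction are illustrative assumptions, not the paper's data or pipeline.

```python
import re
from urllib.parse import urlsplit

# Illustrative placeholder data, not the paper's dataset.
report_tweets = [
    "Phishing alert: hxxps://paypa1-login.example-bad.com/verify targets @PayPal #phishing",
    "New phish http://secure-appleid.example-bad.net/index.html impersonating Apple",
]
openphish_feed = {"http://secure-appleid.example-bad.net/index.html"}

URL_RE = re.compile(r"(?:hxxps?|https?)://\S+")

def extract_urls(text):
    """Pull URLs out of a report tweet, un-defanging the common hxxp scheme."""
    urls = []
    for raw in URL_RE.findall(text):
        urls.append(raw.replace("hxxp", "http", 1).rstrip(".,"))
    return urls

reported = {u for t in report_tweets for u in extract_urls(t)}
overlap = reported & openphish_feed

print(f"{len(reported)} unique reported URLs, {len(overlap)} also present in the feed")
print("hostnames reported:", sorted(urlsplit(u).hostname for u in reported))
```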

Related content

The subreddit r/The_Donald was repeatedly denounced as a toxic and misbehaving online community, and for this reason it faced a sequence of increasingly constraining moderation interventions by Reddit administrators. It was quarantined in June 2019, restricted in February 2020, and finally banned in June 2020, but despite prior work on the matter, the effects of this sequence of interventions are still unclear. In this work, we follow a multidimensional causal inference approach to study data containing more than 15M posts made over a time frame of two years, examining the effects of these interventions both within and outside the subreddit. We find that the interventions had strong positive effects toward reducing the activity of problematic users both inside and outside of r/The_Donald. However, the interventions also caused an increase in toxicity and led users to share more polarized and less factual news. We further find that the restriction had stronger effects than the quarantine and that core users of r/The_Donald were affected more strongly than other users. Overall, our results provide evidence that the interventions had mixed effects and paint a nuanced picture of the consequences of community-level moderation strategies. We conclude by reflecting on the challenges of policing online platforms and by discussing implications for the design and deployment of moderation interventions.
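The abstract does not spell out its causal estimator, so the following is only a generic interrupted-time-series illustration of how an intervention effect on a daily activity series might be estimated; the synthetic data and the single-intervention OLS model are assumptions for the example, not the paper's multidimensional design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily post counts for a user cohort: 180 days before and after an intervention.
days = np.arange(360)
intervention = (days >= 180).astype(float)
activity = 100 - 0.02 * days - 25 * intervention + rng.normal(0, 5, size=days.size)

# Simple interrupted-time-series OLS: intercept, linear trend, and a post-intervention level shift.
X = np.column_stack([np.ones_like(days, dtype=float), days.astype(float), intervention])
coef, *_ = np.linalg.lstsq(X, activity, rcond=None)

print(f"estimated pre-intervention trend per day: {coef[1]:.3f}")
print(f"estimated level shift at the intervention: {coef[2]:.2f} posts/day")
```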

This paper presents Twins, an automated unit-test generator of Byzantine attacks. Twins implements three types of Byzantine behavior: (i) leader equivocation, (ii) double voting, and (iii) losing internal state, such as forgetting 'locks' guarding voted values. To emulate interesting attacks by a Byzantine node, it instantiates twin copies of the node instead of one, giving both twins the same identities and network credentials. To the rest of the system, the twins appear indistinguishable from a single node behaving in a 'questionable' manner. Twins can systematically generate Byzantine attack scenarios at scale, execute them in a controlled manner, and examine their behavior. Twins scenarios iterate over protocol rounds and vary the communication patterns among nodes. Twins runs in a production setting within DiemBFT, where it can execute 44M Twins-generated scenarios daily. While the system at hand did not manifest errors, subtle safety bugs that were deliberately injected to validate the implementation of Twins itself were exposed within minutes. Twins can prevent developers from regressing correctness when updating the codebase, introducing new features, or performing routine maintenance tasks. Because Twins requires only a thin wrapper over DiemBFT, we envision other systems adopting it. Building on this idea, one new attack and several known attacks against other BFT protocols were materialized as Twins scenarios. In all cases, the target protocols break within fewer than a dozen protocol rounds, so it is realistic for the Twins approach to expose these problems.
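To make the scenario-enumeration idea concrete, here is a toy sketch that generates "twins" test scenarios: it picks which node is duplicated and, for each protocol round, chooses a network partition over the resulting replicas. The node names, round count, and two-group partition scheme are illustrative assumptions; the actual generator described in the paper operates over DiemBFT's rounds and message schedules.

```python
from itertools import combinations, product

NODES = ["n0", "n1", "n2", "n3"]
ROUNDS = 2  # number of protocol rounds varied per scenario

def partitions_for(replicas):
    """Enumerate simple two-group network partitions of the replica set."""
    for size in range(1, len(replicas) // 2 + 1):
        for group in combinations(replicas, size):
            yield (set(group), set(replicas) - set(group))

def twins_scenarios():
    """Yield (twinned node, per-round partitions) scenarios."""
    for faulty in NODES:
        # The faulty node runs as two replicas sharing its identity and credentials.
        replicas = [n for n in NODES if n != faulty] + [faulty + "_a", faulty + "_b"]
        per_round = list(partitions_for(replicas))
        for rounds in product(per_round, repeat=ROUNDS):
            yield faulty, list(rounds)

scenarios = list(twins_scenarios())
print(f"generated {len(scenarios)} scenarios")
faulty, rounds = scenarios[0]
print("twinned node:", faulty)
for i, (group_a, group_b) in enumerate(rounds):
    print(f"round {i}: {sorted(group_a)} | {sorted(group_b)}")
```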

Given aggregated mobile device data, the goal is to understand the impact of COVID-19 policy interventions on mobility. This problem is vital due to important societal use cases, such as safely reopening the economy. Challenges include understanding and interpreting the questions of interest to policymakers, cross-jurisdictional variability in the choice and timing of interventions, the large data volume, and unknown sampling bias. Related work has explored the COVID-19 impact on travel distance, time spent at home, and the number of visitors at different points of interest. However, many policymakers are interested in long-duration visits to high-risk business categories and in understanding the spatial selection bias needed to interpret summary reports. We provide an entity-relationship diagram, system architecture, and implementation that support queries on long-duration visits, in addition to fine-resolution device-count maps for understanding spatial bias. We collaborated closely with policymakers to derive the system requirements and to evaluate the system components, the summary reports, and the visualizations.
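As an illustration of the kind of query such a system supports, the sketch below counts long-duration visits per business category from aggregated visit records. The record layout, the 60-minute threshold, and the category names are assumptions made for the example, not the deployed schema.

```python
from collections import Counter

# Illustrative aggregated visit records: (POI category, county FIPS, dwell time in minutes).
visits = [
    ("restaurant", "27053", 95),
    ("restaurant", "27053", 20),
    ("gym",        "27123", 75),
    ("grocery",    "27053", 15),
    ("gym",        "27053", 130),
]

LONG_VISIT_MINUTES = 60  # assumed threshold for a "long-duration" visit

def long_visits_by_category(records, threshold=LONG_VISIT_MINUTES):
    """Count visits whose dwell time meets the threshold, grouped by POI category."""
    counts = Counter()
    for category, _county, dwell in records:
        if dwell >= threshold:
            counts[category] += 1
    return counts

for category, count in long_visits_by_category(visits).most_common():
    print(f"{category}: {count} long-duration visits")
```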

In warehouses, order picking is known to be the most labor-intensive and costly task, and the employees performing it account for a large part of warehouse performance. Hence, many approaches exist that optimize the order picking process based on diverse economic criteria. However, most of these approaches focus on a single economic objective at a time and disregard ergonomic criteria in their optimization. Further, the influence of the placement of the items to be picked is underestimated, and accordingly too little attention is paid to the interdependence of these two problems. In this work, we aim to optimize the storage assignment and the order picking problem within a mezzanine warehouse with regard to their reciprocal influence. We propose a customized version of the Non-dominated Sorting Genetic Algorithm II (NSGA-II) for optimizing the storage assignment problem, as well as an Ant Colony Optimization (ACO) algorithm for optimizing the order picking problem. Both algorithms incorporate multiple economic and ergonomic constraints simultaneously. Furthermore, the algorithms incorporate knowledge about the interdependence between both problems, aiming to improve overall warehouse performance. Our evaluation results show that the proposed algorithms return better storage assignments and order-picking routes than commonly used techniques, according to the following quality indicators for comparing Pareto fronts: Coverage, Generational Distance, Euclidean Distance, Pareto Front Size, and Inverted Generational Distance. Additionally, the evaluation of the interaction of both algorithms shows better performance when both proposed algorithms are combined.
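The sketch below shows the Pareto-dominance test at the heart of NSGA-II, applied to hypothetical storage assignments scored on one economic and one ergonomic objective (both minimized). The candidate solutions and objective values are made up for illustration and do not reflect the paper's customized operators; a full NSGA-II would additionally use crowding distance, selection, crossover, and mutation.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective and strictly better in one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def first_pareto_front(objectives):
    """Return indices of solutions not dominated by any other solution."""
    front = []
    for i, a in enumerate(objectives):
        if not any(dominates(b, a) for j, b in enumerate(objectives) if j != i):
            front.append(i)
    return front

# Hypothetical storage assignments scored as (picking cost, ergonomic strain), both minimized.
candidates = [(120.0, 0.8), (100.0, 1.1), (95.0, 1.4), (130.0, 0.7), (125.0, 0.9), (135.0, 1.5)]

print("Pareto-optimal assignments:", [candidates[i] for i in first_pareto_front(candidates)])
```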

Entity alignment seeks to find entities in different knowledge graphs (KGs) that refer to the same real-world object. Recent advances in KG embedding have spurred the emergence of embedding-based entity alignment, which encodes entities in a continuous embedding space and measures entity similarities based on the learned embeddings. In this paper, we conduct a comprehensive experimental study of this emerging field. We survey 23 recent embedding-based entity alignment approaches and categorize them based on their techniques and characteristics. We also propose a new KG sampling algorithm, with which we generate a set of dedicated benchmark datasets with varying heterogeneity and distributions for a realistic evaluation. We develop an open-source library including 12 representative embedding-based entity alignment approaches, and extensively evaluate these approaches to understand their strengths and limitations. Additionally, for several directions that have not been explored by current approaches, we perform exploratory experiments and report preliminary findings for future studies. The benchmark datasets, open-source library, and experimental results are all accessible online and will be duly maintained.
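A minimal sketch of the alignment step that embedding-based approaches share: given learned entity embeddings for two KGs, score cross-KG pairs by cosine similarity and take nearest neighbors as alignment candidates. The embeddings here are random stand-ins; real approaches learn them from KG structure and attributes, and the entity identifiers are only illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

# Random stand-in embeddings for entities of two KGs.
kg1 = {"dbpedia:Paris": rng.normal(size=8), "dbpedia:Berlin": rng.normal(size=8)}
kg2 = {"wikidata:Q90": rng.normal(size=8), "wikidata:Q64": rng.normal(size=8)}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def nearest_alignment(source, target):
    """For every source entity, return its most similar target entity and the score."""
    pairs = {}
    for s_name, s_vec in source.items():
        best_name, best_vec = max(target.items(), key=lambda item: cosine(s_vec, item[1]))
        pairs[s_name] = (best_name, cosine(s_vec, best_vec))
    return pairs

for entity, (match, score) in nearest_alignment(kg1, kg2).items():
    print(f"{entity} -> {match} (cosine {score:.3f})")
```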

Multi-source translation is an approach that exploits multiple inputs (e.g., in two different languages) to increase translation accuracy. In this paper, we examine approaches for multi-source neural machine translation (NMT) using an incomplete multilingual corpus in which some translations are missing. In practice, many multilingual corpora are incomplete because of the difficulty of providing translations in all of the relevant languages (for example, in TED Talks, most English talks have subtitles in only a small portion of the languages that TED supports). Existing studies on multi-source translation did not explicitly handle such situations. This study focuses on the use of incomplete multilingual corpora in multi-encoder NMT and in a mixture of NMT experts, and examines a very simple implementation in which missing source translations are replaced by a special symbol <NULL>. These methods allow us to use incomplete corpora at both training time and test time. In experiments with real incomplete multilingual corpora of TED Talks, the multi-source NMT with the <NULL> tokens achieved higher translation accuracy, measured by BLEU, than any one-to-one NMT system.
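The <NULL> substitution described above is straightforward to apply when assembling multi-source examples; the sketch below fills missing source sides with the special token so every example carries the same set of source languages. The corpus snippet and language set are illustrative, not the TED data used in the paper.

```python
NULL_TOKEN = "<NULL>"
SOURCE_LANGS = ["en", "fr", "de"]  # target side omitted for brevity

# Illustrative incomplete corpus: some talks lack subtitles in some source languages.
corpus = [
    {"en": "thank you very much", "fr": "merci beaucoup"},              # German side missing
    {"en": "good morning", "fr": "bonjour", "de": "guten Morgen"},
]

def complete_sources(example, langs=SOURCE_LANGS, null_token=NULL_TOKEN):
    """Return one source sentence per language, substituting <NULL> where the translation is missing."""
    return {lang: example.get(lang, null_token) for lang in langs}

for example in corpus:
    print(complete_sources(example))
```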

Background: Social media can provide the healthcare industry with valuable feedback from patients who reveal and express their medical decision-making process, as well as self-reported quality-of-life indicators both during and after treatment. In prior work (Crannell et al.), we studied an active cancer patient population on Twitter and compiled a set of tweets describing their experience with the disease. We refer to these online public testimonies as "Invisible Patient Reported Outcomes" (iPROs), because they carry relevant indicators yet are difficult to capture by conventional means of self-report. Methods: Our present study aims to identify tweets related to the patient experience as an additional informative tool for monitoring public health. Using Twitter's public streaming API, we compiled over 5.3 million "breast cancer" related tweets spanning September 2016 to mid-December 2017. We combined supervised machine learning methods with natural language processing to sift out tweets relevant to breast cancer patient experiences. We analyzed a sample of 845 breast cancer patient and survivor accounts, responsible for over 48,000 posts. We investigated tweet content with a hedonometric sentiment analysis to quantitatively extract emotionally charged topics. Results: We found that positive experiences were shared regarding patient treatment, raising support, and spreading awareness. Further discussions related to healthcare were prevalent and largely negative, focusing on fear of political legislation that could result in loss of coverage. Conclusions: Social media can provide a positive outlet for patients to discuss their needs and concerns regarding their healthcare coverage and treatment. Capturing iPROs from online communication can help inform healthcare professionals and lead to more connected and personalized treatment regimens.
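As a rough illustration of the hedonometric scoring mentioned above, the sketch below averages per-word happiness scores over a tweet. The tiny word-score table is a made-up stand-in for the labMT lexicon typically used in hedonometer-style analyses, and stop-word handling is omitted.

```python
import re

# Made-up stand-in for a happiness lexicon (real hedonometric work uses labMT scores, roughly 1-9).
HAPPINESS = {
    "love": 8.4, "support": 7.0, "hope": 7.0, "chemo": 3.0,
    "scared": 2.6, "insurance": 4.5, "grateful": 7.8,
}

def hedonometric_score(text, lexicon=HAPPINESS):
    """Average the happiness scores of the words covered by the lexicon; None if no word is covered."""
    words = re.findall(r"[a-z']+", text.lower())
    scored = [lexicon[w] for w in words if w in lexicon]
    return sum(scored) / len(scored) if scored else None

tweets = [
    "So grateful for all the love and support during chemo",
    "Scared about losing my insurance coverage next year",
]
for tweet in tweets:
    print(f"{hedonometric_score(tweet):.2f}  {tweet}")
```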

How would you search for a unique, fashionable shoe that a friend wore and you want to buy, but you didn't take a picture of? Existing approaches propose interactive image search as a promising avenue. However, they either entrust the user with taking the initiative to provide informative feedback, or give all control to the system, which determines informative questions to ask. Instead, we propose a mixed-initiative framework where both the user and the system can be active participants, depending on whose initiative will be more beneficial for obtaining high-quality search results. We develop a reinforcement learning approach that dynamically decides which of three interaction opportunities to give to the user: drawing a sketch, providing free-form attribute feedback, or answering attribute-based questions. By allowing these three options, our system optimizes both informativeness and exploration, enabling faster image retrieval. We outperform three baselines on three datasets and across extensive experimental settings.
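The abstract's policy is a learned reinforcement-learning model; as a much simpler stand-in, the sketch below uses an epsilon-greedy bandit to choose among the three interaction options based on simulated retrieval gains. The reward simulation and option values are assumptions purely for illustration, not the paper's method.

```python
import random

random.seed(0)

ACTIONS = ["sketch", "free_form_feedback", "attribute_question"]
# Hypothetical average retrieval gain of each interaction type in the simulation.
TRUE_GAIN = {"sketch": 0.30, "free_form_feedback": 0.55, "attribute_question": 0.45}

value = {a: 0.0 for a in ACTIONS}
count = {a: 0 for a in ACTIONS}
EPSILON = 0.1

for step in range(2000):
    if random.random() < EPSILON:
        action = random.choice(ACTIONS)            # explore
    else:
        action = max(ACTIONS, key=value.get)       # exploit current estimates
    reward = random.gauss(TRUE_GAIN[action], 0.1)  # simulated retrieval gain
    count[action] += 1
    value[action] += (reward - value[action]) / count[action]  # incremental mean

for action in ACTIONS:
    print(f"{action}: estimated gain {value[action]:.2f} after {count[action]} uses")
```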

Sentiment analysis (SA) is a major field of study in natural language processing, computational linguistics, and information retrieval. Interest in SA has been growing constantly in both academia and industry in recent years. Moreover, there is an increasing need for appropriate resources and datasets, particularly for low-resource languages such as Persian. These datasets play an important role in designing and developing opinion mining platforms using supervised, semi-supervised, or unsupervised methods. In this paper, we outline the entire process of developing a manually annotated sentiment corpus, SentiPers, which covers formal and informal written contemporary Persian. To the best of our knowledge, SentiPers is a unique sentiment corpus for Persian with such rich annotation at three different levels: document level, sentence level, and entity/aspect level. The corpus contains more than 26,000 sentences of user opinions from the digital products domain and benefits from special characteristics, such as quantifying the positivity or negativity of an opinion by assigning a number within a specific range to any given sentence. Furthermore, we present statistics on various components of our corpus and study the inter-annotator agreement among the annotators. Finally, we also discuss some of the challenges we faced during the annotation process.
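Since the abstract reports inter-annotator agreement, here is a small self-contained computation of Cohen's kappa for two annotators over sentence-level polarity labels; the label values are fabricated for the example and do not come from SentiPers, whose agreement measure may differ.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa between two annotators' label sequences."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[l] * freq_b[l] for l in set(labels_a) | set(labels_b)) / (n * n)
    return (observed - expected) / (1 - expected)

# Fabricated polarity labels (range -2 .. +2) for ten sentences from two annotators.
annotator_1 = [2, 1, 0, -1, -2, 1, 0, 0, 2, -1]
annotator_2 = [2, 1, 0, -1, -1, 1, 0, 1, 2, -1]

print(f"Cohen's kappa: {cohens_kappa(annotator_1, annotator_2):.3f}")
```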

This project addresses the problem of sentiment analysis on Twitter, that is, classifying tweets according to the sentiment expressed in them: positive, negative, or neutral. Twitter is an online micro-blogging and social-networking platform that allows users to write short status updates with a maximum length of 140 characters. It is a rapidly expanding service with over 200 million registered users, of whom 100 million are active and half log on to Twitter daily, generating nearly 250 million tweets per day. Due to this large volume of usage, we hope to obtain a reflection of public sentiment by analysing the sentiments expressed in the tweets. Analysing public sentiment is important for many applications, such as firms trying to find out the response to their products in the market, predicting political elections, and predicting socioeconomic phenomena like stock market movements. The aim of this project is to develop a functional classifier for accurate and automatic sentiment classification of an unknown tweet stream.
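As an example of the kind of baseline such a project typically starts from, the sketch below trains a bag-of-words logistic-regression model with scikit-learn on a handful of made-up tweets. The texts, labels, and choice of model are illustrative assumptions rather than the project's final classifier.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny made-up training set: tweet text and sentiment label.
tweets = [
    "I love this phone, the camera is amazing",
    "Worst customer service ever, totally disappointed",
    "Just landed in Paris, can't wait to explore",
    "My flight got cancelled again, so frustrating",
    "The new update is great, everything runs faster",
    "Battery dies in two hours, what a waste of money",
]
labels = ["positive", "negative", "positive", "negative", "positive", "negative"]

# TF-IDF features over word unigrams/bigrams feeding a logistic-regression classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000))
model.fit(tweets, labels)

for tweet in ["This app is fantastic", "I regret buying this laptop"]:
    print(tweet, "->", model.predict([tweet])[0])
```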
