97SE亚洲国产综合在线-亚洲日韩中文字幕一级乱码在线播放不卡

The recent advances in natural language processing have yielded many exciting developments in text analysis and language understanding models; however, these models can also be used to track people, bringing severe privacy concerns. In this work, we investigate what individuals can do to avoid being detected by those models while using social media platforms. We ground our investigation in two exposure-risky tasks, stance detection and geotagging. We explore a variety of simple techniques for modifying text, such as inserting typos in salient words, paraphrasing, and adding dummy social media posts. Our experiments show that the performance of BERT-based models fined tuned for stance detection decreases significantly due to typos, but it is not affected by paraphrasing. Moreover, we find that typos have minimal impact on state-of-the-art geotagging models due to their increased reliance on social networks; however, we show that users can deceive those models by interacting with different users, reducing their performance by almost 50%.

相關內容

MoDELS

關注 43

ACM/IEEE第23屆模型驅動工程語言和系統國際會議，是模型驅動軟件和系統工程的首要會議系列，由ACM-SIGSOFT和IEEE-TCSE支持組織。自1998年以來，模型涵蓋了建模的各個方面，從語言和方法到工具和應用程序。模特的參加者來自不同的背景，包括研究人員、學者、工程師和工業專業人士。MODELS 2019是一個論壇，參與者可以圍繞建模和模型驅動的軟件和系統交流前沿研究成果和創新實踐經驗。今年的版本將為建模社區提供進一步推進建模基礎的機會，并在網絡物理系統、嵌入式系統、社會技術系統、云計算、大數據、機器學習、安全、開源等新興領域提出建模的創新應用以及可持續性。官網鏈接： · Performer · 深度強化學習 · tuning · 代價 ·

2022 年 9 月 19 日

A Transferable and Automatic Tuning of Deep Reinforcement Learning for Cost Effective Phishing Detection

Orel Lavie,Asaf Shabtai,Gilad Katz

Many challenging real-world problems require the deployment of ensembles multiple complementary learning models to reach acceptable performance levels. While effective, applying the entire ensemble to every sample is costly and often unnecessary. Deep Reinforcement Learning (DRL) offers a cost-effective alternative, where detectors are dynamically chosen based on the output of their predecessors, with their usefulness weighted against their computational cost. Despite their potential, DRL-based solutions are not widely used in this capacity, partly due to the difficulties in configuring the reward function for each new task, the unpredictable reactions of the DRL agent to changes in the data, and the inability to use common performance metrics (e.g., TPR/FPR) to guide the algorithm's performance. In this study we propose methods for fine-tuning and calibrating DRL-based policies so that they can meet multiple performance goals. Moreover, we present a method for transferring effective security policies from one dataset to another. Finally, we demonstrate that our approach is highly robust against adversarial attacks.

可約的 · state-of-the-art · 統計量 · 值域 · 塊 ·

2022 年 9 月 19 日

Towards Standardized Mobility Reports with User-Level Privacy

Alexandra Kapp,Saskia Nu?ez von Voigt,Helena Mihaljevi?,Florian Tschorsch

The importance of human mobility analyses is growing in both research and practice, especially as applications for urban planning and mobility rely on them. Aggregate statistics and visualizations play an essential role as building blocks of data explorations and summary reports, the latter being increasingly released to third parties such as municipal administrations or in the context of citizen participation. However, such explorations already pose a threat to privacy as they reveal potentially sensitive location information, and thus should not be shared without further privacy measures. There is a substantial gap between state-of-the-art research on privacy methods and their utilization in practice. We thus conceptualize a standardized mobility report with differential privacy guarantees and implement it as open-source software to enable a privacy-preserving exploration of key aspects of mobility data in an easily accessible way. Moreover, we evaluate the benefits of limiting user contributions using three data sets relevant to research and practice. Our results show that even a strong limit on user contribution alters the original geospatial distribution only within a comparatively small range, while significantly reducing the error introduced by adding noise to achieve privacy guarantees.

潛在狄利克雷分配 · Analysis · Twitter · 話題 · 可辨認的 ·

2022 年 9 月 19 日

What are People Talking about in #BlackLivesMatter and #StopAsianHate? Exploring and Categorizing Twitter Topics Emerging in Online Social Movements through the Latent Dirichlet Allocation Model

Xin Tong,Yixuan Li,Jiayi Li,Rongqi Bei,Luyao Zhang

from arxiv, Accepted at AAAI and ACM Conference on AI, Ethics, and Society, August 1 to 3, 2022, Oxford, United Kingdom

Minority groups have been using social media to organize social movements that create profound social impacts. Black Lives Matter (BLM) and Stop Asian Hate (SAH) are two successful social movements that have spread on Twitter that promote protests and activities against racism and increase the public's awareness of other social challenges that minority groups face. However, previous studies have mostly conducted qualitative analyses of tweets or interviews with users, which may not comprehensively and validly represent all tweets. Very few studies have explored the Twitter topics within BLM and SAH dialogs in a rigorous, quantified and data-centered approach. Therefore, in this research, we adopted a mixed-methods approach to comprehensively analyze BLM and SAH Twitter topics. We implemented (1) the latent Dirichlet allocation model to understand the top high-level words and topics and (2) open-coding analysis to identify specific themes across the tweets. We collected more than one million tweets with the #blacklivesmatter and #stopasianhate hashtags and compared their topics. Our findings revealed that the tweets discussed a variety of influential topics in depth, and social justice, social movements, and emotional sentiments were common topics in both movements, though with unique subtopics for each movement. Our study contributes to the topic analysis of social movements on social media platforms in particular and the literature on the interplay of AI, ethics, and society in general.

可理解性 · Continuity · Obvious · 通用智能 · 講稿 ·

2022 年 9 月 18 日

Achilles Heels for AGI/ASI via Decision Theoretic Adversaries

Stephen Casper

from arxiv, Contact info for author at stephencasper.com

As progress in AI continues to advance, it is crucial to know how advanced systems will make choices and in what ways they may fail. Machines can already outsmart humans in some domains, and understanding how to safely build ones which may have capabilities at or above the human level is of particular concern. One might suspect that artificially generally intelligent (AGI) and artificially superintelligent (ASI) systems should be modeled as as something which humans, by definition, can't reliably outsmart. As a challenge to this assumption, this paper presents the Achilles Heel hypothesis which states that even a potentially superintelligent system may nonetheless have stable decision-theoretic delusions which cause them to make obviously irrational decisions in adversarial settings. In a survey of relevant dilemmas and paradoxes from the decision theory literature, a number of these potential Achilles Heels are discussed in context of this hypothesis. Several novel contributions are made toward understanding the ways in which these weaknesses might be implanted into a system.

Agent · MoDELS · 置信度 · INTERACT · Extensibility ·

2022 年 9 月 18 日

Probabilistic Population Protocol Models

Vladyslav Melnychuk

Population protocols are a relatively novel computational model in which very resource-limited anonymous agents interact in pairs with the goal of computing predicates. We consider the probabilistic version of this model, which naturally allows to consider the setup in which a small probability of an incorrect output is tolerated. The main focus of this thesis is the question of confident leader election, which is an extension of the regular leader election problem with an extra requirement for the eventual leader to detect its uniqueness. Having a confident leader allows the population protocols to determine the convergence of its computations. This behaviour of the model is highly beneficial and was shown to be feasible when the original model is extended in various ways. We show that it takes a linear in terms of the population size number of interactions for a probabilistic population protocol to have a non-zero fraction of agents in all reachable states, starting from a configuration with all agents in the same state. This leads us to a conclusion that confident leader election is out of reach even with the probabilistic version of the model.

Attention · Twitter · MoDELS · 可辨認的 · Networking ·

2022 年 9 月 15 日

Hierarchical Attention Network for Explainable Depression Detection on Twitter Aided by Metaphor Concept Mappings

Sooji Han,Rui Mao,Erik Cambria

Automatic depression detection on Twitter can help individuals privately and conveniently understand their mental health status in the early stages before seeing mental health professionals. Most existing black-box-like deep learning methods for depression detection largely focused on improving classification performance. However, explaining model decisions is imperative in health research because decision-making can often be high-stakes and life-and-death. Reliable automatic diagnosis of mental health problems including depression should be supported by credible explanations justifying models' predictions. In this work, we propose a novel explainable model for depression detection on Twitter. It comprises a novel encoder combining hierarchical attention mechanisms and feed-forward neural networks. To support psycholinguistic studies, our model leverages metaphorical concept mappings as input. Thus, it not only detects depressed individuals, but also identifies features of such users' tweets and associated metaphor concept mappings.

數據集 · 集成 · 硬投票 · 軟投票 · 可辨認的 ·

2022 年 9 月 15 日

BERT-based Ensemble Approaches for Hate Speech Detection

Khouloud Mnassri,Praboda Rajapaksha,Reza Farahbakhsh,Noel Crespi

With the freedom of communication provided in online social media, hate speech has increasingly generated. This leads to cyber conflicts affecting social life at the individual and national levels. As a result, hateful content classification is becoming increasingly demanded for filtering hate content before being sent to the social networks. This paper focuses on classifying hate speech in social media using multiple deep models that are implemented by integrating recent transformer-based language models such as BERT, and neural networks. To improve the classification performances, we evaluated with several ensemble techniques, including soft voting, maximum value, hard voting and stacking. We used three publicly available Twitter datasets (Davidson, HatEval2019, OLID) that are generated to identify offensive languages. We fused all these datasets to generate a single dataset (DHO dataset), which is more balanced across different labels, to perform multi-label classification. Our experiments have been held on Davidson dataset and the DHO corpora. The later gave the best overall results, especially F1 macro score, even it required more resources (time execution and memory). The experiments have shown good results especially the ensemble models, where stacking gave F1 score of 97% on Davidson dataset and aggregating ensembles 77% on the DHO dataset.

Extensibility · INTERACT · INFORMS · TEAM · 優化器 ·

2022 年 2 月 21 日

The Role of Heterogeneity in Autonomous Perimeter Defense Problems

Aviv Adler,Oscar Mickelin,Ragesh K. Ramachandran,Gaurav S. Sukhatme,Sertac Karaman

from arxiv, 27 pages, 9 figures

When is heterogeneity in the composition of an autonomous robotic team beneficial and when is it detrimental? We investigate and answer this question in the context of a minimally viable model that examines the role of heterogeneous speeds in perimeter defense problems, where defenders share a total allocated speed budget. We consider two distinct problem settings and develop strategies based on dynamic programming and on local interaction rules. We present a theoretical analysis of both approaches and our results are extensively validated using simulations. Interestingly, our results demonstrate that the viability of heterogeneous teams depends on the amount of information available to the defenders. Moreover, our results suggest a universality property: across a wide range of problem parameters the optimal ratio of the speeds of the defenders remains nearly constant.

穩健性 · MoDELS · Continuity · Taxonomy · 聯邦學習 ·

2020 年 12 月 7 日

Privacy and Robustness in Federated Learning: Attacks and Defenses

Lingjuan Lyu,Han Yu,Xingjun Ma,Lichao Sun,Jun Zhao,Qiang Yang,Philip S. Yu

from arxiv, arXiv admin note: text overlap with arXiv:2003.02133; text overlap with arXiv:1911.11815 by other authors

As data are increasingly being stored in different silos and societies becoming more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models is facing efficiency and privacy challenges. Recently, federated learning (FL) has emerged as an alternative solution and continue to thrive in this new reality. Existing FL protocol design has been shown to be vulnerable to adversaries within or outside of the system, compromising data privacy and system robustness. Besides training powerful global models, it is of paramount importance to design FL systems that have privacy guarantees and are resistant to different types of adversaries. In this paper, we conduct the first comprehensive survey on this topic. Through a concise introduction to the concept of FL, and a unique taxonomy covering: 1) threat models; 2) poisoning attacks and defenses against robustness; 3) inference attacks and defenses against privacy, we provide an accessible review of this important topic. We highlight the intuitions, key techniques as well as fundamental assumptions adopted by various attacks and defenses. Finally, we discuss promising future research directions towards robust and privacy-preserving federated learning.

Extensibility · 可辨認的 · CASES · state-of-the-art · MoDELS ·

2020 年 6 月 8 日

Text Detection and Recognition in the Wild: A Review

Zobeir Raisi,Mohamed A. Naiel,Paul Fieguth,Steven Wardell,John Zelek

Detection and recognition of text in natural images are two main problems in the field of computer vision that have a wide variety of applications in analysis of sports videos, autonomous driving, industrial automation, to name a few. They face common challenging problems that are factors in how text is represented and affected by several environmental conditions. The current state-of-the-art scene text detection and/or recognition methods have exploited the witnessed advancement in deep learning architectures and reported a superior accuracy on benchmark datasets when tackling multi-resolution and multi-oriented text. However, there are still several remaining challenges affecting text in the wild images that cause existing methods to underperform due to there models are not able to generalize to unseen data and the insufficient labeled data. Thus, unlike previous surveys in this field, the objectives of this survey are as follows: first, offering the reader not only a review on the recent advancement in scene text detection and recognition, but also presenting the results of conducting extensive experiments using a unified evaluation framework that assesses pre-trained models of the selected methods on challenging cases, and applies the same evaluation criteria on these techniques. Second, identifying several existing challenges for detecting or recognizing text in the wild images, namely, in-plane-rotation, multi-oriented and multi-resolution text, perspective distortion, illumination reflection, partial occlusion, complex fonts, and special characters. Finally, the paper also presents insight into the potential research directions in this field to address some of the mentioned challenges that are still encountering scene text detection and recognition techniques.