欧美丰满大乳屁股流白浆-国产小鲜肉顾泽宇GV

Given Myanmars historical and socio-political context, hate speech spread on social media has escalated into offline unrest and violence. This paper presents findings from our remote study on the automatic detection of hate speech online in Myanmar. We argue that effectively addressing this problem will require community-based approaches that combine the knowledge of context experts with machine learning tools that can analyze the vast amount of data produced. To this end, we develop a systematic process to facilitate this collaboration covering key aspects of data collection, annotation, and model validation strategies. We highlight challenges in this area stemming from small and imbalanced datasets, the need to balance non-glamorous data work and stakeholder priorities, and closed data-sharing practices. Stemming from these findings, we discuss avenues for further work in developing and deploying hate speech detection systems for low-resource languages.

相關內容

機器學習工具

關注 2

工欲善其事，必先利其器，想要學習機器學習，那么首先我們就由機器學習的必備工具說起。

Prompt · 任務對話系統 · Extensibility · Performer · AIM ·

2023 年 5 月 19 日

Chain-of-thought prompting for responding to in-depth dialogue questions with LLM

Hongru Wang,Rui Wang,Fei Mi,Zezhong Wang,Ruifeng Xu,Kam-Fai Wong

The way and content in which users ask questions can provide insight into their current status, including their personality, emotions, and psychology. Instead of directly prompting the large language models (LLMs), we explore how chain-of-thought prompting helps in this scenario to perform reasoning and planning according to user status, aiming to provide a more personalized and engaging experience for the user query. To this end, we first construct a benchmark of 6 dialogue or question-answering datasets in both English and Chinese, covering 3 different aspects of user status (\textit{including} \textit{personality}, \textit{emotion}, and \textit{psychology}). Then we prompt the LLMs to generate the response regarding the user status as intermediate reasoning processing. We propose a novel demonstration selection strategy using the semantic similarity of intermediate reasoning instead of test queries. To evaluate the effectiveness and robustness of our approach, we conduct extensive experiments with 7 LLMs under zero-shot and one-shot settings. The experimental results show that our approach consistently outperforms standard prompting in terms of both \textit{helpfulness} and \textit{acceptness} across all datasets, regardless of the LLMs used. The code and dataset can be found at \url{//github.com/ruleGreen/Dialogue\_CoT.git}.

推薦系統 · 可辨認的 · AIM · 可理解性 · Better ·

2023 年 5 月 19 日

Visualization for Recommendation Explainability: A Survey and New Perspectives

Mohamed Amine Chatti,Mouadh Guesmi,Arham Muslim

from arxiv, 36 pages

Providing system-generated explanations for recommendations represents an important step towards transparent and trustworthy recommender systems. Explainable recommender systems provide a human-understandable rationale for their outputs. Over the last two decades, explainable recommendation has attracted much attention in the recommender systems research community. This paper aims to provide a comprehensive review of research efforts on visual explanation in recommender systems. More concretely, we systematically review the literature on explanations in recommender systems based on four dimensions, namely explanation goal, explanation scope, explanation style, and explanation format. Recognizing the importance of visualization, we approach the recommender system literature from the angle of explanatory visualizations, that is using visualizations as a display style of explanation. As a result, we derive a set of guidelines that might be constructive for designing explanatory visualizations in recommender systems and identify perspectives for future work in this field. The aim of this review is to help recommendation researchers and practitioners better understand the potential of visually explainable recommendation research and to support them in the systematic design of visual explanations in current and future recommender systems.

Learning · 可辨認的 · Extensibility · Taxonomy · MoDELS ·

2023 年 5 月 19 日

Trustworthy Federated Learning: A Survey

Asadullah Tariq,Mohamed Adel Serhani,Farag Sallabi,Tariq Qayyum,Ezedin S. Barka,Khaled A. Shuaib

from arxiv, 45 Pages, 8 Figures, 9 Tables

Federated Learning (FL) has emerged as a significant advancement in the field of Artificial Intelligence (AI), enabling collaborative model training across distributed devices while maintaining data privacy. As the importance of FL increases, addressing trustworthiness issues in its various aspects becomes crucial. In this survey, we provide an extensive overview of the current state of Trustworthy FL, exploring existing solutions and well-defined pillars relevant to Trustworthy . Despite the growth in literature on trustworthy centralized Machine Learning (ML)/Deep Learning (DL), further efforts are necessary to identify trustworthiness pillars and evaluation metrics specific to FL models, as well as to develop solutions for computing trustworthiness levels. We propose a taxonomy that encompasses three main pillars: Interpretability, Fairness, and Security & Privacy. Each pillar represents a dimension of trust, further broken down into different notions. Our survey covers trustworthiness challenges at every level in FL settings. We present a comprehensive architecture of Trustworthy FL, addressing the fundamental principles underlying the concept, and offer an in-depth analysis of trust assessment mechanisms. In conclusion, we identify key research challenges related to every aspect of Trustworthy FL and suggest future research directions. This comprehensive survey serves as a valuable resource for researchers and practitioners working on the development and implementation of Trustworthy FL systems, contributing to a more secure and reliable AI landscape.

語音識別 · MoDELS · 端到端 · Performer · 前饋 ·

2023 年 5 月 18 日

FunASR: A Fundamental End-to-End Speech Recognition Toolkit

Zhifu Gao,Zerui Li,Jiaming Wang,Haoneng Luo,Xian Shi,Mengzhe Chen,Yabin Li,Lingyun Zuo,Zhihao Du,Zhangyu Xiao,Shiliang Zhang

from arxiv, 5 pages, 3 figures, accepted by INTERSPEECH 2023

This paper introduces FunASR, an open-source speech recognition toolkit designed to bridge the gap between academic research and industrial applications. FunASR offers models trained on large-scale industrial corpora and the ability to deploy them in applications. The toolkit's flagship model, Paraformer, is a non-autoregressive end-to-end speech recognition model that has been trained on a manually annotated Mandarin speech recognition dataset that contains 60,000 hours of speech. To improve the performance of Paraformer, we have added timestamp prediction and hotword customization capabilities to the standard Paraformer backbone. In addition, to facilitate model deployment, we have open-sourced a voice activity detection model based on the Feedforward Sequential Memory Network (FSMN-VAD) and a text post-processing punctuation model based on the controllable time-delay Transformer (CT-Transformer), both of which were trained on industrial corpora. These functional modules provide a solid foundation for building high-precision long audio speech recognition services. Compared to other models trained on open datasets, Paraformer demonstrates superior performance.

語義分析 · Integration · 可理解性 · 知識 (knowledge) · 可辨認的 ·

2023 年 5 月 18 日

The Role of Semantic Parsing in Understanding Procedural Text

Hossein Rajaby Faghihi,Parisa Kordjamshidi,Choh Man Teng,James Allen

from arxiv, 9 pages, Appected in EACL2023

In this paper, we investigate whether symbolic semantic representations, extracted from deep semantic parsers, can help reasoning over the states of involved entities in a procedural text. We consider a deep semantic parser~(TRIPS) and semantic role labeling as two sources of semantic parsing knowledge. First, we propose PROPOLIS, a symbolic parsing-based procedural reasoning framework. Second, we integrate semantic parsing information into state-of-the-art neural models to conduct procedural reasoning. Our experiments indicate that explicitly incorporating such semantic knowledge improves procedural understanding. This paper presents new metrics for evaluating procedural reasoning tasks that clarify the challenges and identify differences among neural, symbolic, and integrated models.

可理解性 · Branch · 估計/估計量 · TOOLS · 推斷 ·

2023 年 5 月 17 日

A Survey on Causal Discovery: Theory and Practice

Alessio Zanga,Fabio Stella

Understanding the laws that govern a phenomenon is the core of scientific progress. This is especially true when the goal is to model the interplay between different aspects in a causal fashion. Indeed, causal inference itself is specifically designed to quantify the underlying relationships that connect a cause to its effect. Causal discovery is a branch of the broader field of causality in which causal graphs is recovered from data (whenever possible), enabling the identification and estimation of causal effects. In this paper, we explore recent advancements in a unified manner, provide a consistent overview of existing algorithms developed under different settings, report useful tools and data, present real-world applications to understand why and how these methods can be fruitfully exploited.

知識 (knowledge) · 命名實體識別 · entity · Performer · MoDELS ·

2023 年 5 月 17 日

DAMO-NLP at SemEval-2023 Task 2: A Unified Retrieval-augmented System for Multilingual Named Entity Recognition

Zeqi Tan,Shen Huang,Zixia Jia,Jiong Cai,Yinghui Li,Weiming Lu,Yueting Zhuang,Kewei Tu,Pengjun Xie,Fei Huang,Yong Jiang

from arxiv, Accepted to SemEval 2023, winners for 9 out of 13 tracks, performance beyond ChatGPT

The MultiCoNER \RNum{2} shared task aims to tackle multilingual named entity recognition (NER) in fine-grained and noisy scenarios, and it inherits the semantic ambiguity and low-context setting of the MultiCoNER \RNum{1} task. To cope with these problems, the previous top systems in the MultiCoNER \RNum{1} either incorporate the knowledge bases or gazetteers. However, they still suffer from insufficient knowledge, limited context length, single retrieval strategy. In this paper, our team \textbf{DAMO-NLP} proposes a unified retrieval-augmented system (U-RaNER) for fine-grained multilingual NER. We perform error analysis on the previous top systems and reveal that their performance bottleneck lies in insufficient knowledge. Also, we discover that the limited context length causes the retrieval knowledge to be invisible to the model. To enhance the retrieval context, we incorporate the entity-centric Wikidata knowledge base, while utilizing the infusion approach to broaden the contextual scope of the model. Also, we explore various search strategies and refine the quality of retrieval knowledge. Our system\footnote{We will release the dataset, code, and scripts of our system at {\small \url{//github.com/modelscope/AdaSeq/tree/master/examples/U-RaNER}}.} wins 9 out of 13 tracks in the MultiCoNER \RNum{2} shared task. Additionally, we compared our system with ChatGPT, one of the large language models which have unlocked strong capabilities on many tasks. The results show that there is still much room for improvement for ChatGPT on the extraction task.

Extensibility · 噪聲 · Performer · state-of-the-art · 學成 ·

2021 年 6 月 30 日

Affective Image Content Analysis: Two Decades Review and New Perspectives

Sicheng Zhao,Xingxu Yao,Jufeng Yang,Guoli Jia,Guiguang Ding,Tat-Seng Chua,Bj?rn W. Schuller,Kurt Keutzer

from arxiv, Accepted by IEEE TPAMI

Images can convey rich semantics and induce various emotions in viewers. Recently, with the rapid advancement of emotional intelligence and the explosive growth of visual data, extensive research efforts have been dedicated to affective image content analysis (AICA). In this survey, we will comprehensively review the development of AICA in the recent two decades, especially focusing on the state-of-the-art methods with respect to three main challenges -- the affective gap, perception subjectivity, and label noise and absence. We begin with an introduction to the key emotion representation models that have been widely employed in AICA and description of available datasets for performing evaluation with quantitative comparison of label noise and dataset bias. We then summarize and compare the representative approaches on (1) emotion feature extraction, including both handcrafted and deep features, (2) learning methods on dominant emotion recognition, personalized emotion prediction, emotion distribution learning, and learning from noisy data or few labels, and (3) AICA based applications. Finally, we discuss some challenges and promising research directions in the future, such as image content and context understanding, group emotion clustering, and viewer-image interaction.

entity · 命名實體識別 · CASE · Extensibility · MoDELS ·

2019 年 11 月 14 日

Enhanced Meta-Learning for Cross-lingual Named Entity Recognition with Minimal Resources

Qianhui Wu,Zijia Lin,Guoxin Wang,Hui Chen,B?rje F. Karlsson,Biqing Huang,Chin-Yew Lin

from arxiv, This paper is accepted by AAAI2020

For languages with no annotated resources, transferring knowledge from rich-resource languages is an effective solution for named entity recognition (NER). While all existing methods directly transfer from source-learned model to a target language, in this paper, we propose to fine-tune the learned model with a few similar examples given a test case, which could benefit the prediction by leveraging the structural and semantic information conveyed in such similar examples. To this end, we present a meta-learning algorithm to find a good model parameter initialization that could fast adapt to the given test case and propose to construct multiple pseudo-NER tasks for meta-training by computing sentence similarities. To further improve the model's generalization ability across different languages, we introduce a masking scheme and augment the loss function with an additional maximum term during meta-training. We conduct extensive experiments on cross-lingual named entity recognition with minimal resources over five target languages. The results show that our approach significantly outperforms existing state-of-the-art methods across the board.

圖 · DNN · 樣例 · 學成 · Machine Learning ·

2019 年 10 月 9 日

Adversarial Attacks and Defenses in Images, Graphs and Text: A Review

Han Xu,Yao Ma,Haochen Liu,Debayan Deb,Hui Liu,Jiliang Tang,Anil K. Jain

from arxiv, survey, adversarial attacks, defenses

Deep neural networks (DNN) have achieved unprecedented success in numerous machine learning tasks in various domains. However, the existence of adversarial examples has raised concerns about applying deep learning to safety-critical applications. As a result, we have witnessed increasing interests in studying attack and defense mechanisms for DNN models on different data types, such as images, graphs and text. Thus, it is necessary to provide a systematic and comprehensive overview of the main threats of attacks and the success of corresponding countermeasures. In this survey, we review the state of the art algorithms for generating adversarial examples and the countermeasures against adversarial examples, for the three popular data types, i.e., images, graphs and text.