
Wikipedia is one of the most successful collaborative projects in history. It is the largest encyclopedia ever created, with millions of users worldwide relying on it as the first source of information as well as for fact-checking and in-depth research. As Wikipedia relies solely on the efforts of its volunteer-editors, its success might be particularly affected by toxic speech. In this paper, we analyze all 57 million comments made on user talk pages of 8.5 million editors across the six most active language editions of Wikipedia to study the potential impact of toxicity on editors' behaviour. We find that toxic comments consistently reduce the activity of editors, leading to an estimated loss of 0.5-2 active days per user in the short term. This amounts to multiple human-years of lost productivity when considering the number of active contributors to Wikipedia. The effects of toxic comments are even greater in the long term, as they significantly increase the risk of editors leaving the project altogether. Using an agent-based model, we demonstrate that toxicity attacks on Wikipedia have the potential to impede the progress of the entire project. Our results underscore the importance of mitigating toxic speech on collaborative platforms such as Wikipedia to ensure their continued success.
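The agent-based dynamics described above can be illustrated with a toy simulation. The sketch below is purely illustrative: the editor count, activity rates, penalty window, and departure probabilities are assumed parameters, not the paper's calibrated values.

```python
import random

def simulate(n_editors=1000, days=365, p_toxic=0.01,
             base_activity=0.3, activity_penalty=0.5,
             base_leave=0.0005, leave_multiplier=5.0, seed=0):
    """Toy agent-based model: toxic comments temporarily reduce an
    editor's activity and raise their probability of leaving."""
    rng = random.Random(seed)
    editors = [{"active": True, "penalty_days": 0} for _ in range(n_editors)]
    total_active_days = 0
    for _ in range(days):
        for e in editors:
            if not e["active"]:
                continue
            # A toxic comment arrives with small probability.
            if rng.random() < p_toxic:
                e["penalty_days"] = 14          # short-term activity drop
                if rng.random() < base_leave * leave_multiplier:
                    e["active"] = False         # long-term: editor quits
                    continue
            elif rng.random() < base_leave:
                e["active"] = False             # baseline attrition
                continue
            rate = base_activity * (activity_penalty if e["penalty_days"] else 1.0)
            e["penalty_days"] = max(0, e["penalty_days"] - 1)
            if rng.random() < rate:
                total_active_days += 1
    return total_active_days

print(simulate())               # baseline toxicity
print(simulate(p_toxic=0.05))   # higher toxicity -> fewer active days
```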

Related content

Wikipedia is a global, multilingual encyclopedia collaboration project based on wiki technology, as well as an online encyclopedia presented on the World Wide Web; its goal and mission is to provide a freely available encyclopedia to all of humanity. It currently ranks sixth in Alexa's global website rankings.

During the last decades, macroecology has identified broad-scale patterns in the abundance and diversity of microbial communities and put forward potential explanations for them. However, these advances are not paralleled by a full understanding of the underlying dynamical processes. In particular, abundance fluctuations across metagenomic samples are found to be correlated, but reproducing them with appropriate models remains an open task. The present paper tackles this problem and points to species interactions as a necessary mechanism to account for them. Specifically, we discuss several possibilities for including interactions in population models and identify Lotka-Volterra interaction constants as a successful ansatz. We design a Bayesian inference algorithm to obtain sets of interaction constants able to reproduce the experimental correlation distributions much better than state-of-the-art attempts. Importantly, the model still reproduces the single-species experimental macroecological patterns previously reported in the literature concerning abundance fluctuations across both species and communities. Endorsed by the agreement with the observed phenomenology, our analysis provides insights into the properties of microbial interactions and suggests their sparsity as a necessary feature for balancing the emergence of the different patterns.
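As an illustration of the kind of population model discussed here, the following sketch integrates generalized Lotka-Volterra dynamics with a sparse random interaction matrix and inspects the resulting pairwise abundance correlations. All parameters (species count, sparsity level, noise strength) are assumed for illustration, and the paper's Bayesian inference of the interaction constants is not reproduced here.

```python
import numpy as np

def glv_trajectory(r, A, x0, dt=0.01, steps=5000, noise=0.0, seed=0):
    """Crude Euler integration of generalized Lotka-Volterra dynamics
    dx_i/dt = x_i * (r_i + sum_j A_ij x_j), with optional noise."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    traj = np.empty((steps, x.size))
    for t in range(steps):
        growth = x * (r + A @ x)
        if noise:
            growth += noise * x * rng.standard_normal(x.size)
        x = np.clip(x + dt * growth, 0.0, None)  # abundances stay non-negative
        traj[t] = x
    return traj

# Sparse random interaction matrix: most species pairs do not interact.
rng = np.random.default_rng(1)
S = 30                                    # number of species
sparsity = 0.1                            # fraction of nonzero interactions
A = rng.normal(0, 0.05, (S, S)) * (rng.random((S, S)) < sparsity)
np.fill_diagonal(A, -1.0)                 # self-limitation
r = rng.uniform(0.5, 1.0, S)              # intrinsic growth rates
traj = glv_trajectory(r, A, x0=rng.uniform(0.1, 1.0, S), noise=0.1)
print(np.corrcoef(traj[-1000:].T).round(2))  # pairwise abundance correlations
```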

With the ever-increasing use of natural language processing (NLP), we have witnessed over the past few years a significant transformation in how we interact with legal texts. This technology has advanced the analysis and understanding of complex legal terminology and contexts. The development of recent large language models (LLMs), particularly ChatGPT, has also made a revolutionary contribution to the way legal texts can be processed and comprehended. In this paper, we present our work on an LLM-based question-answering chatbot for cooperative law: we developed a set of legal questions about Palestinian cooperatives and their regulations, and compared the chatbot's auto-generated answers to reference answers designed by a legal expert. To evaluate the proposed chatbot, we used 50 queries generated by the legal expert and compared the answers produced by the chatbot against the expert's relevance judgments. The findings demonstrate an overall accuracy of 82% in answering the queries, with an F1 score of 79%.
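A minimal sketch of the kind of answer-level scoring reported above, assuming binary correct/incorrect judgments per query (the paper's exact judging protocol may differ):

```python
def evaluate(y_true, y_pred):
    """Answer-level accuracy and F1 against expert judgments.
    y_true: expert's judgment per query (1 = a correct answer exists here).
    y_pred: whether the chatbot's answer was judged correct (1) or not (0)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1

# Hypothetical judgments for 10 of the 50 queries.
print(evaluate([1, 1, 1, 0, 1, 1, 0, 1, 1, 1],
               [1, 1, 0, 0, 1, 1, 1, 1, 1, 1]))
```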

Advances in voice technology and voice user interfaces (VUIs) -- such as Alexa, Siri, and Google Home -- have opened up the potential for many new types of interaction. However, despite the potential of these devices, reflected by the growing market and body of VUI research, there is a lingering sense that the technology is still underused. In this paper, we conducted a systematic literature review of 35 papers to identify and synthesize 127 VUI design guidelines into five themes. Additionally, we conducted semi-structured interviews with 15 smart speaker users to understand their use and non-use of the technology. From the interviews, we distill four design challenges that contribute most to non-use. Based on their (non-)use, we identify four opportunity spaces for designers to explore, such as focusing on information support while multitasking (cooking, driving, childcare, etc.), incorporating users' mental models of smart speakers, and integrating calm design principles.

Fixing software bugs and adding new features are two of the major maintenance tasks. Software bugs and features are reported as change requests. Developers consult these requests and often choose a few keywords from them as an ad hoc query. Then they execute the query with a search engine to find the exact locations within the software code that need to be changed. Unfortunately, even experienced developers often fail to choose appropriate queries, which leads to costly trial and error during code search. Over the years, many studies have attempted to reformulate developers' ad hoc queries to support them. In this systematic literature review, we carefully select 70 primary studies on query reformulation from 2,970 candidate studies, perform an in-depth qualitative analysis (e.g., Grounded Theory), and then answer seven research questions with major findings. First, to date, eight major methodologies (e.g., term weighting, term co-occurrence analysis, thesaurus lookup) have been adopted to reformulate queries. Second, the existing studies suffer from several major limitations (e.g., lack of generalizability, vocabulary mismatch problem, subjective bias) that might prevent their wide adoption. Finally, we discuss best practices and future opportunities to advance the state of research in search query reformulation.
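As a concrete illustration of one of the methodologies listed above, term weighting, the following sketch ranks change-request terms by TF-IDF against a small corpus of past reports and keeps the top-weighted terms as the reformulated query. The corpus, stopword list, and weighting variant are assumptions for illustration, not drawn from any specific primary study.

```python
import math
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "of", "to", "when", "with", "on", "in"}

def tokenize(text):
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def reformulate(change_request, corpus, k=5):
    """Term-weighting reformulation: rank change-request terms by TF-IDF
    against a corpus of past reports and keep the top k as the query."""
    docs = [tokenize(d) for d in corpus]
    df = Counter(term for d in docs for term in set(d))   # document frequency
    n = len(docs)
    tf = Counter(tokenize(change_request))                 # term frequency
    scores = {t: (1 + math.log(c)) * math.log((n + 1) / (1 + df[t]))
              for t, c in tf.items()}
    return [t for t, _ in sorted(scores.items(), key=lambda s: -s[1])[:k]]

corpus = ["null pointer exception when saving file",
          "crash on startup after update",
          "file save dialog freezes the editor"]
print(reformulate("the editor crashes with a null pointer when saving a file",
                  corpus))
```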

Large models have recently played a dominant role in natural language processing and multimodal vision-language learning, but their efficacy in text-related visual tasks remains less explored. We conducted a comprehensive study of existing publicly available multimodal models, evaluating their performance in text recognition (document text, artistic text, handwritten text, scene text), text-based visual question answering (document text, scene text, and bilingual text), key information extraction (receipts, documents, and nutrition facts), and handwritten mathematical expression recognition. Our findings reveal strengths and weaknesses in these models, which primarily rely on semantic understanding for word recognition and exhibit inferior perception of individual character shapes. They are also indifferent to text length and have limited capabilities for detecting fine-grained features in images. Consequently, these results demonstrate that even the currently most powerful large multimodal models cannot match domain-specific methods on traditional text tasks and face greater challenges on more complex tasks. Most importantly, the baseline results showcased in this study could provide a foundational framework for the conception and assessment of innovative strategies for enhancing zero-shot multimodal techniques. The evaluation pipeline is available at https://github.com/Yuliang-Liu/MultimodalOCR.

The increasing prevalence of image-altering filters on social media and video conferencing technologies has raised concerns about the ethical and psychological implications of using Artificial Intelligence (AI) to manipulate our perception of others. In this study, we specifically investigate the potential impact of blur filters, a type of appearance-altering technology, on individuals' behavior towards others. Our findings consistently demonstrate a significant increase in selfish behavior directed towards individuals whose appearance is blurred, suggesting that blur filters can facilitate moral disengagement through depersonalization. These results emphasize the need for broader ethical discussions surrounding AI technologies that modify our perception of others, including issues of transparency, consent, and the awareness of being subject to appearance manipulation by others. We also emphasize the importance of anticipatory experiments in informing the development of responsible guidelines and policies prior to the widespread adoption of such technologies.

Data is a precious resource in today's society, and it is generated at an unprecedented and constantly growing pace. The need to store, analyze, and make data promptly available to a multitude of users introduces formidable challenges in modern software platforms. These challenges have radically transformed all research fields that gravitate around data management and processing, with the introduction of distributed data-intensive systems that offer new programming models and implementation strategies to handle data characteristics such as its volume, the rate at which it is produced, its heterogeneity, and its distribution. Each data-intensive system makes its own specific choices in terms of data model, usage assumptions, synchronization, processing strategy, deployment, and guarantees in terms of consistency, fault tolerance, and ordering. Yet, the problems data-intensive systems face and the solutions they propose frequently overlap. This paper proposes a unifying model that dissects the core functionalities of data-intensive systems and precisely discusses alternative design and implementation strategies, pointing out their assumptions and implications. The model offers a common ground to understand and compare highly heterogeneous solutions, with the potential of fostering cross-fertilization across research communities and advancing the field. We apply our model by classifying dozens of systems, an exercise that leads to interesting observations on current trends in the domain of data-intensive systems and suggests open research directions.

Stickers with vivid and engaging expressions are becoming increasingly popular in online messaging apps, and some works are dedicated to automatically selecting a sticker response by matching the text labels of stickers with previous utterances. However, due to their large quantities, it is impractical to require text labels for all stickers. Hence, in this paper, we propose to recommend an appropriate sticker to the user based on the multi-turn dialog context, without any external labels. Two main challenges are confronted in this task. One is to learn the semantic meaning of stickers without corresponding text labels. The other is to jointly model the candidate sticker with the multi-turn dialog context. To tackle these challenges, we propose a sticker response selector (SRS) model. Specifically, SRS first employs a convolution-based sticker image encoder and a self-attention-based multi-turn dialog encoder to obtain representations of stickers and utterances. Next, a deep interaction network is proposed to conduct deep matching between the sticker and each utterance in the dialog history. SRS then learns the short-term and long-term dependencies among all interaction results with a fusion network to output the final matching score. To evaluate our proposed method, we collect a large-scale real-world dialog dataset with stickers from one of the most popular online chatting platforms. Extensive experiments conducted on this dataset show that our model achieves state-of-the-art performance on all commonly used metrics. Experiments also verify the effectiveness of each component of SRS. To facilitate further research in the sticker selection field, we release this dataset of 340K multi-turn dialog and sticker pairs.
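A compact PyTorch sketch of an SRS-style architecture is given below. It mirrors the described components (a convolutional sticker encoder, a self-attention utterance encoder, a per-turn interaction, and a fusion network producing the final score), but all layer sizes and the bilinear interaction are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class StickerResponseSelector(nn.Module):
    """Illustrative SRS-style matcher: CNN encodes the sticker, self-attention
    encodes each utterance, a bilinear layer scores the sticker against every
    turn, and a GRU fuses the per-turn scores into a final matching score."""
    def __init__(self, vocab_size, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.utt_att = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.img_enc = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, dim))
        self.interact = nn.Bilinear(dim, dim, 1)
        self.fuse = nn.GRU(1, 32, batch_first=True)
        self.out = nn.Linear(32, 1)

    def forward(self, dialog, sticker):
        # dialog: (batch, turns, tokens); sticker: (batch, 3, H, W)
        b, t, l = dialog.shape
        tok = self.embed(dialog.view(b * t, l))
        att, _ = self.utt_att(tok, tok, tok)          # self-attention per turn
        utt = att.mean(dim=1).view(b, t, -1)          # one vector per turn
        img = self.img_enc(sticker)                   # sticker representation
        img = img.unsqueeze(1).expand(-1, t, -1)
        scores = self.interact(utt.contiguous(), img.contiguous())  # (b, t, 1)
        fused, _ = self.fuse(scores)                  # fuse per-turn matches
        return self.out(fused[:, -1]).squeeze(-1)     # final matching score

model = StickerResponseSelector(vocab_size=5000)
logits = model(torch.randint(0, 5000, (2, 4, 12)), torch.randn(2, 3, 64, 64))
print(logits.shape)  # torch.Size([2])
```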

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
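The weighting itself is straightforward to compute from the formula above. A minimal sketch, following the common convention of normalizing the weights to sum to the number of classes (an assumed normalization; such details vary), which can then be passed to a weighted loss:

```python
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class weights inversely proportional to the effective number of
    samples E_n = (1 - beta^n) / (1 - beta), normalized so the weights sum
    to the number of classes."""
    n = np.asarray(samples_per_class, dtype=float)
    effective_num = (1.0 - np.power(beta, n)) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights * len(n) / weights.sum()

# A long-tailed toy distribution: head classes dominate the counts,
# so tail classes receive much larger weights.
print(class_balanced_weights([5000, 500, 50, 5]).round(3))
```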

To address the sparsity and cold start problem of collaborative filtering, researchers usually make use of side information, such as social networks or item attributes, to improve recommendation performance. This paper considers the knowledge graph as the source of side information. To address the limitations of existing embedding-based and path-based methods for knowledge-graph-aware recommendation, we propose Ripple Network, an end-to-end framework that naturally incorporates the knowledge graph into recommender systems. Similar to actual ripples propagating on the surface of water, Ripple Network stimulates the propagation of user preferences over the set of knowledge entities by automatically and iteratively extending a user's potential interests along links in the knowledge graph. The multiple "ripples" activated by a user's historically clicked items are thus superposed to form the preference distribution of the user with respect to a candidate item, which could be used for predicting the final clicking probability. Through extensive experiments on real-world datasets, we demonstrate that Ripple Network achieves substantial gains in a variety of scenarios, including movie, book and news recommendation, over several state-of-the-art baselines.
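A toy sketch of the ripple idea: interests expand hop by hop from the user's clicked items along knowledge-graph links, and each reached entity contributes an attenuated similarity to the candidate item. The graph, embeddings, hop decay, and sigmoid scoring below are illustrative assumptions and omit the paper's attention-based aggregation.

```python
import numpy as np

def ripple_score(kg, clicked_items, candidate_emb, emb, hops=2, decay=0.5):
    """Toy ripple propagation: starting from a user's clicked items, expand
    hop by hop along knowledge-graph links; each newly reached entity adds
    its similarity to the candidate item, attenuated per hop."""
    frontier, seen, score = set(clicked_items), set(clicked_items), 0.0
    for h in range(hops):
        nxt = {t for e in frontier for t in kg.get(e, [])} - seen
        for e in nxt:
            score += (decay ** h) * float(emb[e] @ candidate_emb)
        seen |= nxt
        frontier = nxt
    return 1.0 / (1.0 + np.exp(-score))   # clicking probability via sigmoid

# Hypothetical toy graph: entity id -> linked entity ids.
kg = {0: [2, 3], 1: [3, 4], 2: [5], 3: [5, 6], 4: [6]}
rng = np.random.default_rng(0)
emb = rng.normal(size=(7, 8))             # entity embeddings
print(ripple_score(kg, clicked_items=[0, 1], candidate_emb=emb[6], emb=emb))
```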
