from arxiv, 44th International Conference on Software Engineering (ICSE 2022) - Software Engineering in Society (SEIS) Track, May 2022, Pittsburgh, PA, United States

Gender imbalance is a well-known phenomenon observed throughout the sciences, and it is particularly severe in software development and Free/Open Source Software communities. Little is known yet about the geography of this phenomenon, in particular when considering large scales for both its time and space dimensions. We contribute to filling this gap with a longitudinal study of the population of contributors to publicly available software source code. We analyze the development history of 160 million software projects for a total of 2.2 billion commits contributed by 43 million distinct authors over a period of 50 years. We classify author names by gender using name frequencies and author geographical locations using heuristics based on email addresses and time zones. We study the evolution over time of contributions to public code by gender and by world region. For the world overall, we confirm previous findings about the low but steadily increasing ratio of contributions by female authors. When breaking down by world regions we find that the long-term growth of female participation is a worldwide phenomenon. We also observe a decrease in the ratio of female participation during the COVID-19 pandemic, suggesting that women's ability to contribute to public code has been more hindered than that of men.
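As a rough illustration of the kind of analysis the abstract describes, the sketch below guesses a world region from an email address and computes the yearly ratio of contributions by female authors. It is only a minimal sketch under stated assumptions: the `TLD_REGION` table, the function names, and the input format are hypothetical, and the study's actual heuristics (which also use time zones and name frequencies) are not reproduced here.

```python
from collections import defaultdict

# Hypothetical mapping from country-code TLD to world region (illustration only).
TLD_REGION = {"fr": "Europe", "de": "Europe", "jp": "Asia", "br": "South America"}


def region_from_email(email):
    """Guess a world region from the email's top-level domain, or return None."""
    domain = email.rsplit("@", 1)[-1].lower()
    tld = domain.rsplit(".", 1)[-1]
    return TLD_REGION.get(tld)  # generic TLDs such as "com" are left unclassified


def female_ratio_by_year(commits):
    """Compute the share of commits by female authors per year.

    `commits` is an iterable of (year, gender) pairs, where gender is
    "female", "male", or None when the author name could not be classified.
    """
    counts = defaultdict(lambda: [0, 0])  # year -> [female commits, classified commits]
    for year, gender in commits:
        if gender is None:
            continue  # unclassified authors are excluded from the ratio
        counts[year][1] += 1
        if gender == "female":
            counts[year][0] += 1
    return {year: f / total for year, (f, total) in sorted(counts.items())}
```

Leaving generic domains unclassified instead of guessing mirrors the general idea that such heuristics cover only part of the population; how the paper handles those cases is not stated in the abstract.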

Related content

COVID-19

Speech synthesis · Transcript · Topic · MoDELS · Robustness ·

April 20, 2022

KazakhTTS2: Extending the Open-Source Kazakh TTS Corpus With More Data, Speakers, and Topics

Saida Mussakhojayeva,Yerbolat Khassanov,Huseyin Atakan Varol

from arxiv, 8 pages, 2 figures, 5 tables, accepted to LREC 2022

We present an expanded version of our previously released Kazakh text-to-speech (KazakhTTS) synthesis corpus. In the new KazakhTTS2 corpus, the overall size has increased from 93 hours to 271 hours, the number of speakers has risen from two to five (three females and two males), and the topic coverage has been diversified with the help of new sources, including a book and Wikipedia articles. This corpus is necessary for building high-quality TTS systems for Kazakh, a Central Asian agglutinative language from the Turkic family, which presents several linguistic challenges. We describe the corpus construction process and provide the details of the training and evaluation procedures for the TTS system. Our experimental results indicate that the constructed corpus is sufficient to build robust TTS models for real-world applications, with a subjective mean opinion score ranging from 3.6 to 4.2 for all the five speakers. We believe that our corpus will facilitate speech and language research for Kazakh and other Turkic languages, which are widely considered to be low-resource due to the limited availability of free linguistic data. The constructed corpus, code, and pretrained models are publicly available in our GitHub repository.

INFORMS · COVID-19 · Identifiable · Processing (programming language) · Design ·

April 19, 2022

Where Was COVID-19 First Discovered? Designing a Question-Answering System for Pandemic Situations

Johannes Graf,Gino Lancho,Patrick Zschech,Kai Heinrich

from arxiv, Preprint accepted for archival and presentation at the 30th European Conference on Information Systems (ECIS 2022)

The COVID-19 pandemic is accompanied by a massive "infodemic" that makes it hard to identify concise and credible information for COVID-19-related questions, like incubation time, infection rates, or the effectiveness of vaccines. As a novel solution, our paper is concerned with designing a question-answering system based on modern technologies from natural language processing to overcome information overload and misinformation in pandemic situations. To carry out our research, we followed a design science research approach and applied Ingwersen's cognitive model of information retrieval interaction to inform our design process from a socio-technical lens. On this basis, we derived prescriptive design knowledge in terms of design requirements and design principles, which we translated into the construction of a prototypical instantiation. Our implementation is based on the comprehensive CORD-19 dataset, and we demonstrate our artifact's usefulness by evaluating its answer quality based on a sample of COVID-19 questions labeled by biomedical experts.

contrastive · Contrastive learning · Performer · Learned · Extensibility ·

April 19, 2022

Detect Rumors in Microblog Posts for Low-Resource Domains via Adversarial Contrastive Learning

Hongzhan Lin,Jing Ma,Liangliang Chen,Zhiwei Yang,Mingfei Cheng,Guang Chen

from arxiv, The first study for the low-resource rumor detection on social media in cross-domain and cross-lingual settings

Massive numbers of false rumors emerging along with breaking news or trending topics severely obscure the truth. Existing rumor detection approaches achieve promising performance on yesterday's news, since there is enough corpus collected from the same domain for model training. However, they are poor at detecting rumors about unforeseen events, especially those propagated in different languages, due to the lack of training data and prior knowledge (i.e., low-resource regimes). In this paper, we propose an adversarial contrastive learning framework to detect rumors by adapting the features learned from well-resourced rumor data to those of the low-resourced data. Our model explicitly overcomes the restriction of domain and/or language usage via language alignment and a novel supervised contrastive training paradigm. Moreover, we develop an adversarial augmentation mechanism to further enhance the robustness of low-resource rumor representation. Extensive experiments conducted on two low-resource datasets collected from real-world microblog platforms demonstrate that our framework achieves much better performance than state-of-the-art methods and exhibits a superior capacity for detecting rumors at early stages.

INFORMS · COVID-19 · Engineering · CASE · INTERACT ·

April 19, 2022

Software Engineers Response to Public Crisis: Lessons Learnt from Spontaneously Building an Informative COVID-19 Dashboard

Han Wang,Chao Wu,Chunyang Chen,Burak Turhan,Shiping Chen,Jon Whittle

The Coronavirus disease 2019 (COVID-19) outbreak quickly spread around the world, resulting in over 240 million infections and 4 million deaths by Oct 2021. While the virus is spreading from person to person silently, fear has also been spreading around the globe. The COVID-19 information from the Australian Government is convincing but not timely or detailed, and there is much information on social networks with both facts and rumors. As software engineers, we have spontaneously and rapidly constructed a COVID-19 information dashboard aggregating reliable, semi-automatically checked information from different sources to provide a one-stop information-sharing site about the latest status in Australia. Inspired by the Johns Hopkins University COVID-19 Map, our dashboard contains case statistics, case distribution, government policy, and latest news, with interactive visualization. In this paper, we present a participant's in-person observations in which the authors acted as founders of //covid-19-au.com/, serving more than 830K users with 14M page views since March 2020. According to our first-hand experience, we summarize 9 lessons for developers, researchers and instructors. These lessons may inspire development, research and teaching in software engineering for coping with similar public crises in the future.

INTERACT · Tutorial · Round · Integration · Learner ·

April 19, 2022

ITSS: Interactive Web-Based Authoring and Playback Integrated Environment for Programming Tutorials

Eng Lieh Ouh,Benjamin Kok Siew Gan,David Lo

Video-based programming tutorials are a popular form of tutorial used by authors to guide learners to code. Still, the interactivity of these videos is limited primarily to controlling video flow. Existing works with increased interactivity have been shown to improve the learning experience. However, these solutions require setting up a custom recording environment and are not well integrated with the playback environment. This paper describes our integrated ITSS environment and evaluates the ease of authoring and playback of our interactive programming tutorials. Our environment is designed to run within the browser sandbox and is less intrusive when recording interactivity actions. We develop a recording approach that tracks the author's interactivity actions (e.g., typing code, highlighting words, scrolling panels) in the browser and stores them in text and audio formats. We replay these actions using the recorded artefacts so that learners have a more interactive, integrated and realistic playback of the author's actions instead of watching video frames. Our design goals are 1) efficient recording and playback, 2) extensible interactivity features to help students learn better, and 3) a scalable web-based environment. In our first user study, 20 participants who carried out the authoring tasks agreed that it is efficient and easy to author interactive videos in our environment with no additional software needed. In our second user study, 84 students using the environment agreed that the increased interactivity can help them learn better than a video-based tutorial. Our performance test shows that the environment can scale to support up to 500 concurrent users. We hope our open-source environment enables more educators to create interactive programming tutorials.

Round · Knowledge · Scenario · Identifiable · Dataset ·

April 18, 2022

Spot the Difference: A Novel Task for Embodied Agents in Changing Environments

Federico Landi,Roberto Bigazzi,Marcella Cornia,Silvia Cascianelli,Lorenzo Baraldi,Rita Cucchiara

from arxiv, Accepted by 26th International Conference on Pattern Recognition (ICPR 2022)

Embodied AI is a recent research area that aims at creating intelligent agents that can move and operate inside an environment. Existing approaches in this field demand the agents to act in completely new and unexplored scenes. However, this setting is far from realistic use cases that instead require executing multiple tasks in the same environment. Even if the environment changes over time, the agent could still count on its global knowledge about the scene while trying to adapt its internal representation to the current state of the environment. To make a step towards this setting, we propose Spot the Difference: a novel task for Embodied AI where the agent has access to an outdated map of the environment and needs to recover the correct layout in a fixed time budget. To this end, we collect a new dataset of occupancy maps starting from existing datasets of 3D spaces and generating a number of possible layouts for a single environment. This dataset can be employed in the popular Habitat simulator and is fully compliant with existing methods that employ reconstructed occupancy maps during navigation. Furthermore, we propose an exploration policy that can take advantage of previous knowledge of the environment and identify changes in the scene faster and more effectively than existing agents. Experimental results show that the proposed architecture outperforms existing state-of-the-art models for exploration on this new setting.

Approximation · MoDELS · INFORMS · Scenario · Performer ·

April 17, 2022

Interdependent Public Projects

Avi Cohen,Michal Feldman,Divyarthi Mohan,Inbal Talgam-Cohen

In the interdependent values (IDV) model introduced by Milgrom and Weber [1982], agents have private signals that capture their information about different social alternatives, and the valuation of every agent is a function of all agent signals. While interdependence has been mainly studied for auctions, it is extremely relevant for a large variety of social choice settings, including the canonical setting of public projects. The IDV model is very challenging relative to standard independent private values, and welfare guarantees have been achieved through two alternative conditions known as single-crossing and submodularity over signals (SOS). In either case, the existing theory falls short of solving the public projects setting. Our contribution is twofold: (i) We give a workable characterization of truthfulness for IDV public projects for the largest class of valuations for which such a characterization exists, and term this class decomposable valuations; (ii) We provide possibility and impossibility results for welfare approximation in public projects with SOS valuations. Our main impossibility result is that, in contrast to auctions, no universally truthful mechanism performs better for public projects with SOS valuations than choosing a project at random. Our main positive result applies to excludable public projects with SOS, for which we establish a constant factor approximation similar to auctions. Our results suggest that exclusion may be a key tool for achieving welfare guarantees in the IDV model.

HCI · Biased · Model evaluation · Facebook AI Research · Attention mechanism ·

April 17, 2022

Using HCI to Tackle Race and Gender Bias in ADHD Diagnosis

Naba Rizvi,Khalil Mrini

from arxiv, 6 pages, CHI 2020 workshop submission

Attention Deficit Hyperactivity Disorder (ADHD) is a behavioral disorder that impacts an individual's education, relationships, career, and ability to acquire fair and just police interrogations. Yet, traditional methods used to diagnose ADHD in children and adults are known to have racial and gender bias. In recent years, diagnostic technology has been studied by both HCI and ML researchers. However, these studies fail to take into consideration racial and gender stereotypes that may impact the accuracy of their results. We highlight the importance of taking race and gender into consideration when creating diagnostic technology for ADHD and provide HCI researchers with suggestions for future studies.

Task-oriented dialogue system · Latent variable · Transcript · Regularization term · state-of-the-art ·

April 15, 2022

Towards Building a Personalized Dialogue Generator via Implicit User Persona Detection

Itsugun Cho,Dongyang Wang,Ryota Takahashi,Hiroaki Saito

from arxiv, 7 pages, 6 figures, conference, no conference submit

Current works on personalized dialogue generation primarily help the agent avoid contradicting its persona and make the response more informative. However, we found that the generated responses from these models are mostly self-centered with little care for the other party, since they ignore the user's persona. Moreover, we consider that high-quality transmission is essentially built on apprehending the persona of the other party. Motivated by this, we propose a novel personalized dialogue generator that detects the implicit user persona. Because it is difficult to collect a large number of personas for each user, we attempt to model the user's potential persona and its representation from the dialogue in the absence of any external information. A perception variable and a fader variable are conceived utilizing Conditional Variational Inference. The two latent variables simulate the process of people being aware of the other party's persona and producing the corresponding expression in conversation. Finally, Posterior-discriminated Regularization is presented to enhance the training procedure. Empirical studies demonstrate that, compared with state-of-the-art methods, ours is more concerned with the user's persona and outperforms them in evaluations.

Comprehensibility · Performer · Rank · Transformation · Better ·

April 5, 2022

How Different are Pre-trained Transformers for Text Ranking?

David Rau,Jaap Kamps

from arxiv, ECIR 2022

In recent years, large pre-trained transformers have led to substantial gains in performance over traditional retrieval models and feedback approaches. However, these results are primarily based on the MS Marco/TREC Deep Learning Track setup, with its very particular setup, and our understanding of why and how these models work better is fragmented at best. We analyze effective BERT-based cross-encoders versus traditional BM25 ranking for the passage retrieval task where the largest gains have been observed, and investigate two main questions. On the one hand, what is similar? To what extent does the neural ranker already encompass the capacity of traditional rankers? Is the gain in performance due to a better ranking of the same documents (prioritizing precision)? On the other hand, what is different? Can it effectively retrieve documents missed by traditional systems (prioritizing recall)? We discover substantial differences in the notion of relevance, identifying strengths and weaknesses of BERT that may inspire research for future improvement. Our results contribute to our understanding of (black-box) neural rankers relative to (well-understood) traditional rankers, and help understand the particular experimental setting of MS-Marco-based test collections.
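To make the comparison in the abstract more concrete, the sketch below scores tokenized documents with a common Okapi BM25 variant and measures how much two rankings agree in their top results. It is a minimal sketch, not the paper's experimental code: the function names, the default parameters `k1=1.2` and `b=0.75`, the smoothed IDF form, and the overlap measure are all assumptions chosen for illustration, and `docs` is assumed to be non-empty.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n  # average document length
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue  # terms absent from the document contribute nothing
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def top_k_overlap(ranking_a, ranking_b, k):
    """Fraction of the top-k items of ranking_a that also appear in the top-k of ranking_b."""
    return len(set(ranking_a[:k]) & set(ranking_b[:k])) / k
```

A high overlap between a BM25 ranking and a neural ranking would suggest the neural model mostly reorders the same documents, while a low overlap would suggest it surfaces documents that the traditional ranker misses.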