Over the past decade, a crisis of confidence in published scientific findings has catalyzed widespread response from the research community, particularly in the West. These responses have included policy discussions and changes to existing practice as well as computational infrastructure to support and evaluate research. Our work studies Indian researchers' awareness, perceptions, and challenges around research integrity. We explore opportunities for Artificial Intelligence (AI)-powered tools to evaluate reproducibility and replicability, centering cultural perspectives. We discuss requirements for such tools, including signals within papers and metadata to be included, and system hybridity (fully-AI vs. collaborative human-AI). We draw upon 19 semi-structured interviews and 72 follow-up surveys with researchers at universities throughout India. Our findings highlight the need for computational tools to contextualize confidence in published research. In particular, researchers prefer approaches that enable human-AI collaboration. Additionally, our findings emphasize the shortcomings of current incentive structures for publication, funding, and promotion.

Related content

TOOLS

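This new edition of the TOOLS conference series revives the tradition of the 50 conferences held from 1989 to 2012. TOOLS originally stood for "Technology of Object-Oriented Languages and Systems" and later grew to cover all innovative aspects of software technology. Many of today's most important software concepts were first introduced there. TOOLS 50+1, held in 2019 near Kazan, Russia, continued the series with the same spirit of innovation, enthusiasm for all things software, combination of scientific soundness and industrial applicability, and openness to all trends and communities in the field.

Comprehensibility · Linear · Mutual information · Separable ·

December 15, 2023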

Understanding Probe Behaviors through Variational Bounds of Mutual Information

Kwanghee Choi,Jee-weon Jung,Shinji Watanabe

from arxiv, Accepted to ICASSP 2024, implementation available at //github.com/juice500ml/information_probing

With the success of self-supervised representations, researchers seek a better understanding of the information encapsulated within a representation. Among various interpretability methods, we focus on classification-based linear probing. We aim to foster a solid understanding and provide guidelines for linear probing by constructing a novel mathematical framework leveraging information theory. First, we connect probing with the variational bounds of mutual information (MI) to relax the probe design, equating linear probing with fine-tuning. Then, we investigate empirical behaviors and practices of probing through our mathematical framework. We analyze the layer-wise performance curve being convex, which seemingly violates the data processing inequality. However, we show that the intermediate representations can have the biggest MI estimate because of the tradeoff between better separability and decreasing MI. We further suggest that the margin of linearly separable representations can be a criterion for measuring the "goodness of representation." We also compare accuracy with MI as the measuring criteria. Finally, we empirically validate our claims by observing the self-supervised speech models on retaining word and phoneme information.
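The link between a classification probe and a variational bound on MI can be illustrated with the standard Barber-Agakov lower bound, I(Z; Y) >= H(Y) - CE(q), where CE(q) is the probe's average cross-entropy on the true labels. The sketch below is a minimal illustration under that assumption; it is not taken from the authors' implementation, and the function name `mi_lower_bound` and its inputs are hypothetical.

```python
import numpy as np

def mi_lower_bound(probs, labels):
    """Estimate the bound I(Z; Y) >= H(Y) - CE(q), in nats.

    probs:  (n, k) array of probe-predicted class probabilities q(y | z).
    labels: (n,) array of integer class labels.
    Both names are illustrative assumptions, not part of any published API.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Empirical label entropy H(Y), skipping classes that never occur.
    counts = np.bincount(labels, minlength=probs.shape[1])
    p_y = counts[counts > 0] / labels.size
    h_y = -np.sum(p_y * np.log(p_y))
    # Average cross-entropy of the probe on the true labels.
    eps = 1e-12  # guards against log(0)
    ce = -np.mean(np.log(probs[np.arange(labels.size), labels] + eps))
    return h_y - ce
```

A lower probe cross-entropy tightens the bound, which is one way to read probe performance as an MI estimate; in practice the estimate would be computed on held-out data.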

Projection · Round · MoDELS · Inference · INFORMS ·

December 14, 2023

PROPRES: Investigating the Projectivity of Presupposition with Various Triggers and Environments

Daiki Asami,Saku Sugawara

from arxiv, Accepted by the 27th Conference on Computational Natural Language Learning (CoNLL2023)

What makes a presupposition of an utterance -- information taken for granted by its speaker -- different from other pragmatic inferences such as an entailment is projectivity (e.g., the negative sentence "the boy did not stop shedding tears" presupposes "the boy had shed tears before"). The projectivity may vary depending on the combination of presupposition triggers and environments. However, prior natural language understanding studies fail to take it into account, as they either use no human baseline or include only negation as an entailment-canceling environment to evaluate models' performance. The current study attempts to reconcile these issues. We introduce a new dataset, projectivity of presupposition (PROPRES), which includes 12k premise-hypothesis pairs crossing six triggers involving some lexical variety with five environments. Our human evaluation reveals that humans exhibit variable projectivity in some cases. However, the model evaluation shows that the best-performing model, DeBERTa, does not fully capture it. Our findings suggest that probing studies on pragmatic inferences should take extra care with human judgment variability and the combination of linguistic items.

Biased · Confidence · Learning · Facebook AI Research · Annotation ·

December 14, 2023

Mitigating Label Bias in Machine Learning: Fairness through Confident Learning

Yixuan Zhang,Boyu Li,Zenan Ling,Feng Zhou

Discrimination can occur when the underlying unbiased labels are overwritten by an agent with potential bias, resulting in biased datasets that unfairly harm specific groups and cause classifiers to inherit these biases. In this paper, we demonstrate that despite only having access to the biased labels, it is possible to eliminate bias by filtering the fairest instances within the framework of confident learning. In the context of confident learning, low self-confidence usually indicates potential label errors; however, this is not always the case. Instances, particularly those from underrepresented groups, might exhibit low confidence scores for reasons other than labeling errors. To address this limitation, our approach employs truncation of the confidence score and extends the confidence interval of the probabilistic threshold. Additionally, we incorporate the co-teaching paradigm to provide a more robust and reliable selection of fair instances and to effectively mitigate the adverse effects of biased labels. Through extensive experimentation and evaluation on various datasets, we demonstrate the efficacy of our approach in promoting fairness and reducing the impact of label bias in machine learning models.
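For readers unfamiliar with confident learning, the baseline idea that the abstract builds on can be sketched as follows: each class gets a threshold equal to the mean predicted probability of that class over the examples carrying that label, and an example is flagged when it is confidently assigned to a class other than its given label. This is a simplified sketch of the generic confident-learning rule only. It does not implement the paper's confidence truncation, extended threshold interval, or co-teaching, and the function name `confident_label_issues` is hypothetical.

```python
import numpy as np

def confident_label_issues(probs, labels):
    """Flag examples that look confidently mislabeled.

    probs:  (n, k) array of predicted class probabilities.
    labels: (n,) array of given (possibly noisy) integer labels.
    Returns a boolean mask of length n. Names are illustrative assumptions.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n, k = probs.shape
    # Threshold t_j: mean probability of class j among examples labeled j.
    # Classes with no examples get an unreachable threshold.
    thresholds = np.array([
        probs[labels == j, j].mean() if np.any(labels == j) else np.inf
        for j in range(k)
    ])
    above = probs >= thresholds                # (n, k) confident memberships
    masked = np.where(above, probs, -np.inf)   # ignore non-confident classes
    confident_class = masked.argmax(axis=1)
    has_confident = above.any(axis=1)
    # An issue: confidently in some class, and that class is not the label.
    return has_confident & (confident_class != labels)
```

Under this rule an example with low self-confidence is not flagged unless another class clears its own threshold, which is where group-dependent confidence levels can start to matter.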

HCI · Design · Scenario · ACM · AAAI ·

December 13, 2023

The State of Pilot Study Reporting in Crowdsourcing: A Reflection on Best Practices and Guidelines

Jonas Oppenlaender,Tahir Abbas,Ujwal Gadiraju

from arxiv, Accepted at CSCW '24. 45 pages, 17 figures, 1 table

Pilot studies are an essential cornerstone of the design of crowdsourcing campaigns, yet they are often only mentioned in passing in the scholarly literature. A lack of details surrounding pilot studies in crowdsourcing research hinders the replication of studies and the reproduction of findings, stalling potential scientific advances. We conducted a systematic literature review on the current state of pilot study reporting at the intersection of crowdsourcing and HCI research. Our review of ten years of literature included 171 articles published in the proceedings of the Conference on Human Computation and Crowdsourcing (AAAI HCOMP) and the ACM Digital Library. We found that pilot studies in crowdsourcing research (i.e., crowd pilot studies) are often under-reported in the literature. Important details, such as the number of workers and rewards to workers, are often not reported. On the basis of our findings, we reflect on the current state of practice and formulate a set of best practice guidelines for reporting crowd pilot studies in crowdsourcing research. We also provide implications for the design of crowdsourcing platforms and make practical suggestions for supporting crowd pilot study reporting.

Biased · Extensibility · Continuity · GPT-3.5 · Machine Translation ·

December 13, 2023

Evaluating Gender Bias in the Translation of Gender-Neutral Languages into English

Spencer Rarrick,Ranjita Naik,Sundar Poudel,Vishal Chowdhary

Machine Translation (MT) continues to improve in quality and adoption, yet the inadvertent perpetuation of gender bias remains a significant concern. Despite numerous studies into gender bias in translations from gender-neutral languages such as Turkish into more strongly gendered languages like English, there are no benchmarks for evaluating this phenomenon or for assessing mitigation strategies. To address this gap, we introduce GATE X-E, an extension to the GATE (Rarrick et al., 2023) corpus, that consists of human translations from Turkish, Hungarian, Finnish, and Persian into English. Each translation is accompanied by feminine, masculine, and neutral variants for each possible gender interpretation. The dataset, which contains between 1250 and 1850 instances for each of the four language pairs, features natural sentences with a wide range of sentence lengths and domains, challenging translation rewriters on various linguistic phenomena. Additionally, we present an English gender rewriting solution built on GPT-3.5 Turbo and use GATE X-E to evaluate it. We open source our contributions to encourage further research on gender debiasing.

INTERACT · Integration · Pair · Confidence · contrastive ·

December 11, 2023

Pedestrian and Passenger Interaction with Autonomous Vehicles: Field Study in a Crosswalk Scenario

Rubén Izquierdo,Javier Alonso,Ola Benderius,Miguel Ángel Sotelo,David Fernández Llorca

from arxiv, Submitted to the IEEE TIV; 13 pages, 13 figures, 7 tables. arXiv admin note: text overlap with arXiv:2307.12708

This study presents the outcomes of empirical investigations pertaining to human-vehicle interactions involving an autonomous vehicle equipped with both internal and external Human Machine Interfaces (HMIs) within a crosswalk scenario. The internal and external HMIs were integrated with implicit communication techniques, incorporating a combination of gentle and aggressive braking maneuvers within the crosswalk. Data were collected through a combination of questionnaires and quantifiable metrics, including pedestrian decision to cross related to the vehicle distance and speed. The questionnaire responses reveal that pedestrians experience enhanced safety perceptions when the external HMI and gentle braking maneuvers are used in tandem. In contrast, the measured variables demonstrate that the external HMI proves effective when complemented by the gentle braking maneuver. Furthermore, the questionnaire results highlight that the internal HMI enhances passenger confidence only when paired with the aggressive braking maneuver.

Automator · MP3 · Analysis · False negative · Mask ·

December 10, 2023

StegoHound: A Novel Multi-Approaches Method for Efficient and Effective Identification and Extraction of Digital Evidence Masked by Steganographic Techniques in WAV and MP3 Files

Mohamed C. Ghanem,Maider D. Uribarri,Ramzi Djemai,Dipo Dunsin,Istteffanny I. Araujo

from arxiv, Journal of Information Security and Cybercrimes Research- Post Review V3.1

Anti-forensics techniques, particularly steganography and cryptography, have become increasingly pressing issues that affect current digital forensics practice. This paper advances the automation of hidden evidence extraction in the context of audio files by proposing a novel multi-approach method which enables the correlation between unprocessed artefacts, indexed and live forensics analysis, and traditional steganographic and cryptographic detection techniques. In this work, we opted for an experimental research methodology in the form of a quantitative analysis of the efficiency of the proposed automation in detecting and extracting hidden artefacts in WAV and MP3 audio files, comparing it to standard industry systems. This work advances the current automation in extracting evidence hidden by cryptographic and steganographic techniques during forensics investigations. The proposed multi-approach method demonstrated a clear enhancement in terms of coverage and accuracy, notably on large audio files (MP3 and WAV), for which manual forensics analysis is complex, time-consuming and requires significant expertise. Nonetheless, the proposed multi-approach automation may occasionally produce false positives (detecting steganography where none exists) or false negatives (failing to detect steganography that is present), but overall it achieves a good balance between efficiently and effectively detecting hidden evidence and minimising false negatives, which validates its reliability.

Processing (programming language) · Taxonomy · Cognition · Attention · Distillation ·

September 27, 2023

A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future

Zheng Chu,Jingchang Chen,Qianglong Chen,Weijiang Yu,Tao He,Haotian Wang,Weihua Peng,Ming Liu,Bing Qin,Ting Liu

from arxiv, Resources are available at //github.com/zchuz/CoT-Reasoning-Survey

Chain-of-thought reasoning, a cognitive process fundamental to human intelligence, has garnered significant attention in the realm of artificial intelligence and natural language processing. However, a comprehensive survey of this area is still lacking. To this end, we take a first step and present a thorough and broad survey of this research field. We use X-of-Thought to refer to Chain-of-Thought in a broad sense. In detail, we systematically organize the current research according to the taxonomies of methods, including XoT construction, XoT structure variants, and enhanced XoT. Additionally, we describe XoT with frontier applications, covering planning, tool use, and distillation. Furthermore, we address challenges and discuss some future directions, including faithfulness, multi-modal, and theory. We hope this survey serves as a valuable resource for researchers seeking to innovate within the domain of chain-of-thought reasoning.

AIGC · ChatGPT · MoDELS · GaN · AI ·

March 7, 2023

A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT

Yihan Cao,Siyu Li,Yixin Liu,Zhiling Yan,Yutong Dai,Philip S. Yu,Lichao Sun

from arxiv, 44 pages, 15 figures

Recently, ChatGPT, along with DALL-E-2 and Codex, has been gaining significant attention from society. As a result, many individuals have become interested in related resources and are seeking to uncover the background and secrets behind its impressive performance. In fact, ChatGPT and other Generative AI (GAI) techniques belong to the category of Artificial Intelligence Generated Content (AIGC), which involves the creation of digital content, such as images, music, and natural language, through AI models. The goal of AIGC is to make the content creation process more efficient and accessible, allowing for the production of high-quality content at a faster pace. AIGC is achieved by extracting and understanding intent information from instructions provided by humans, and generating the content according to the model's knowledge and the intent information. In recent years, large-scale models have become increasingly important in AIGC as they provide better intent extraction and thus improved generation results. With the growth of data and the size of the models, the distribution that the model can learn becomes more comprehensive and closer to reality, leading to more realistic and high-quality content generation. This survey provides a comprehensive review of the history of generative models, their basic components, and recent advances in AIGC from unimodal interaction and multimodal interaction. From the perspective of unimodality, we introduce the generation tasks and related models of text and image. From the perspective of multimodality, we introduce the cross-application between the modalities mentioned above. Finally, we discuss the existing open problems and future challenges in AIGC.

Processing (programming language) · Inference · NLP · Computational Linguistics · Estimation/Estimator ·

September 2, 2021

Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond

Amir Feder,Katherine A. Keith,Emaad Manzoor,Reid Pryzant,Dhanya Sridhar,Zach Wood-Doughty,Jacob Eisenstein,Justin Grimmer,Roi Reichart,Margaret E. Roberts,Brandon M. Stewart,Victor Veitch,Diyi Yang

A fundamental goal of scientific research is to learn about causal relationships. However, despite its critical role in the life and social sciences, causality has not had the same importance in Natural Language Processing (NLP), which has traditionally placed more emphasis on predictive tasks. This distinction is beginning to fade, with an emerging area of interdisciplinary research at the convergence of causal inference and language processing. Still, research on causality in NLP remains scattered across domains without unified definitions, benchmark datasets and clear articulations of the remaining challenges. In this survey, we consolidate research across academic areas and situate it in the broader NLP landscape. We introduce the statistical challenge of estimating causal effects, encompassing settings where text is used as an outcome, treatment, or as a means to address confounding. In addition, we explore potential uses of causal inference to improve the performance, robustness, fairness, and interpretability of NLP models. We thus provide a unified overview of causal inference for the computational linguistics community.