日韩在线精品小视频,中文字幕日韩欧美爆乳在线不卡,91精品熟女老女系列,亚洲欧美日韩一区二区三区在线了

Many explainable AI (XAI) techniques strive for interpretability by providing concise salient information, such as sparse linear factors. However, users either only see inaccurate global explanations, or highly-varying local explanations. We propose to provide more detailed explanations by leveraging the human cognitive capacity to accumulate knowledge by incrementally receiving more details. Focusing on linear factor explanations (factors $\times$ values = outcome), we introduce Incremental XAI to automatically partition explanations for general and atypical instances by providing Base + Incremental factors to help users read and remember more faithful explanations. Memorability is improved by reusing base factors and reducing the number of factors shown in atypical cases. In modeling, formative, and summative user studies, we evaluated the faithfulness, memorability and understandability of Incremental XAI against baseline explanation methods. This work contributes towards more usable explanation that users can better ingrain to facilitate intuitive engagement with AI.

相關內容

分解的

關注 0

數學 · MoDELS · 可理解性 · 知識 (knowledge) · Nuance ·

2024 年 5 月 20 日

MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark

Hongwei Liu,Zilong Zheng,Yuxuan Qiao,Haodong Duan,Zhiwei Fei,Fengzhe Zhou,Wenwei Zhang,Songyang Zhang,Dahua Lin,Kai Chen

from arxiv, Project: //github.com/open-compass/MathBench

Recent advancements in large language models (LLMs) have showcased significant improvements in mathematics. However, traditional math benchmarks like GSM8k offer a unidimensional perspective, falling short in providing a holistic assessment of the LLMs' math capabilities. To address this gap, we introduce MathBench, a new benchmark that rigorously assesses the mathematical capabilities of large language models. MathBench spans a wide range of mathematical disciplines, offering a detailed evaluation of both theoretical understanding and practical problem-solving skills. The benchmark progresses through five distinct stages, from basic arithmetic to college mathematics, and is structured to evaluate models at various depths of knowledge. Each stage includes theoretical questions and application problems, allowing us to measure a model's mathematical proficiency and its ability to apply concepts in practical scenarios. MathBench aims to enhance the evaluation of LLMs' mathematical abilities, providing a nuanced view of their knowledge understanding levels and problem solving skills in a bilingual context. The project is released at //github.com/open-compass/MathBench .

文本分類 · 損失函數（機器學習） · 情感分析 · 語言模型化 · Processing（編程語言） ·

2024 年 5 月 20 日

Exploring Ordinality in Text Classification: A Comparative Study of Explicit and Implicit Techniques

Siva Rajesh Kasa,Aniket Goel,Karan Gupta,Sumegh Roychowdhury,Anish Bhanushali,Nikhil Pattisapu,Prasanna Srinivasa Murthy

from arxiv, Findings of ACL 2024

Ordinal Classification (OC) is a widely encountered challenge in Natural Language Processing (NLP), with applications in various domains such as sentiment analysis, rating prediction, and more. Previous approaches to tackle OC have primarily focused on modifying existing or creating novel loss functions that \textbf{explicitly} account for the ordinal nature of labels. However, with the advent of Pretrained Language Models (PLMs), it became possible to tackle ordinality through the \textbf{implicit} semantics of the labels as well. This paper provides a comprehensive theoretical and empirical examination of both these approaches. Furthermore, we also offer strategic recommendations regarding the most effective approach to adopt based on specific settings.

策略改進 · Processing（編程語言） · 有向 · 策略評估 · 評論員 ·

2024 年 5 月 14 日

vMFER: Von Mises-Fisher Experience Resampling Based on Uncertainty of Gradient Directions for Policy Improvement

Yiwen Zhu,Jinyi Liu,Wenya Wei,Qianyi Fu,Yujing Hu,Zhou Fang,Bo An,Jianye Hao,Tangjie Lv,Changjie Fan

from arxiv, Accepted by IJCAI 2024, with appendix

Reinforcement Learning (RL) is a widely employed technique in decision-making problems, encompassing two fundamental operations -- policy evaluation and policy improvement. Enhancing learning efficiency remains a key challenge in RL, with many efforts focused on using ensemble critics to boost policy evaluation efficiency. However, when using multiple critics, the actor in the policy improvement process can obtain different gradients. Previous studies have combined these gradients without considering their disagreements. Therefore, optimizing the policy improvement process is crucial to enhance learning efficiency. This study focuses on investigating the impact of gradient disagreements caused by ensemble critics on policy improvement. We introduce the concept of uncertainty of gradient directions as a means to measure the disagreement among gradients utilized in the policy improvement process. Through measuring the disagreement among gradients, we find that transitions with lower uncertainty of gradient directions are more reliable in the policy improvement process. Building on this analysis, we propose a method called von Mises-Fisher Experience Resampling (vMFER), which optimizes the policy improvement process by resampling transitions and assigning higher confidence to transitions with lower uncertainty of gradient directions. Our experiments demonstrate that vMFER significantly outperforms the benchmark and is particularly well-suited for ensemble structures in RL.

binary · TCS · Notability · 線性的 · 相關系數 ·

2024 年 5 月 14 日

A Survey on Complexity Measures of Pseudo-Random Sequences

Chunlei Li

Since the introduction of the Kolmogorov complexity of binary sequences in the 1960s, there have been significant advancements in the topic of complexity measures for randomness assessment, which are of fundamental importance in theoretical computer science and of practical interest in cryptography. This survey reviews notable research from the past four decades on the linear, quadratic and maximum-order complexities of pseudo-random sequences and their relations with Lempel-Ziv complexity, expansion complexity, 2-adic complexity, and correlation measures.

可交換的 · Continuity · 有偏 · 可辨認的 · 合一 ·

2024 年 5 月 14 日

A Unification of Exchangeability and Continuous Exposure and Confounder Measurement Errors: Probabilistic Exchangeability

Honghyok Kim

Exchangeability concerning a continuous exposure, X, implies no confounding bias when identifying average exposure effects of X, AEE(X). When X is measured with error (Xep), two challenges arise in identifying AEE(X). Firstly, exchangeability regarding Xep does not equal exchangeability regarding X. Secondly, the necessity of the non-differential error assumption (NDEA), overly stringent in practice, remains uncertain. To address them, this article proposes unifying exchangeability and exposure and confounder measurement errors with three novel concepts. The first, Probabilistic Exchangeability (PE), states that the outcomes of those with Xep=e are probabilistically exchangeable with the outcomes of those truly exposed to X=eT. The relationship between AEE(Xep) and AEE(X) in risk difference and ratio scales is mathematically expressed as a probabilistic certainty, termed exchangeability probability (Pe). Squared Pe (Pe.sq) quantifies the extent to which AEE(Xep) differs from AEE(X) due to exposure measurement error through mechanisms not akin to confounding mechanisms. The coefficient of determination (R.sq) in the regression of X against Xep may sometimes be sufficient to measure Pe.sq. The second concept, Emergent Pseudo Confounding (EPC), describes the bias introduced by exposure measurement error through mechanisms akin to confounding mechanisms. PE can hold when EPC is controlled for, which is weaker than NDEA. The third, Emergent Confounding, describes when bias due to confounder measurement error arises. Adjustment for E(P)C can be performed like confounding adjustment to ensure PE. This paper provides formal justifications for using AEE(Xep) and maximum insight into potential divergence of AEE(Xep) from AEE(X) and how to measure it.

INTERACT · HCI · 分解的 · 設計 · MoDELS ·

2024 年 5 月 14 日

Kawaii Computing: Scoping Out the Japanese Notion of Cute in User Experiences with Interactive Systems

Yijia Wang,Katie Seaborn

Kawaii computing is a new term for a steadily growing body of work on the Japanese notion of "cute" in human-computer interaction (HCI) research and practice. Kawaii is distinguished from general notions of cute by its experiential and culturally-sensitive nature. While it can be designed into the appearance and behaviour of interactive agents, interfaces, and systems, kawaii also refers to certain affective and cultural dimensions experienced by culturally Japanese users, i.e., kawaii user experiences (UX) and mental models of kawaii elicited by the socio-cultural context of Japan. In this scoping review, we map out the ways in which kawaii has been explored within HCI research and related fields as a factor of design and experience. We illuminate theoretical and methodological gaps and opportunities for future work on kawaii computing.

DOT · tuning · 參數空間 · Automator · 操作 ·

2024 年 5 月 12 日

Data Needs and Challenges of Quantum Dot Devices Automation: Workshop Report

Justyna P. Zwolak,Jacob M. Taylor,Reed Andrews,Jared Benson,Garnett Bryant,Donovan Buterakos,Anasua Chatterjee,Sankar Das Sarma,Mark A. Eriksson,Eli?ka Greplová,Michael J. Gullans,Fabian Hader,Tyler J. Kovach,Pranav S. Mundada,Mick Ramsey,Torbjoern Rasmussen,Brandon Severin,Anthony Sigillito,Brennan Undseth,Brian Weber

from arxiv, White paper/overview based on a workshop held at the National Institute of Standards and Technology, Gaithersburg, MD. 13 pages

Gate-defined quantum dots are a promising candidate system to realize scalable, coupled qubit systems and serve as a fundamental building block for quantum computers. However, present-day quantum dot devices suffer from imperfections that must be accounted for, which hinders the characterization, tuning, and operation process. Moreover, with an increasing number of quantum dot qubits, the relevant parameter space grows sufficiently to make heuristic control infeasible. Thus, it is imperative that reliable and scalable autonomous tuning approaches are developed. In this report, we outline current challenges in automating quantum dot device tuning and operation with a particular focus on datasets, benchmarking, and standardization. We also present ideas put forward by the quantum dot community on how to overcome them.

知識 (knowledge) · 圖 · 數學 · 表示 · 知識圖譜 ·

2022 年 11 月 7 日

Knowledge Graph Embedding: A Survey from the Perspective of Representation Spaces

Jiahang Cao,Jinyuan Fang,Zaiqiao Meng,Shangsong Liang

from arxiv, 32 pages, 6 figures

Knowledge graph embedding (KGE) is a increasingly popular technique that aims to represent entities and relations of knowledge graphs into low-dimensional semantic spaces for a wide spectrum of applications such as link prediction, knowledge reasoning and knowledge completion. In this paper, we provide a systematic review of existing KGE techniques based on representation spaces. Particularly, we build a fine-grained classification to categorise the models based on three mathematical perspectives of the representation spaces: (1) Algebraic perspective, (2) Geometric perspective, and (3) Analytical perspective. We introduce the rigorous definitions of fundamental mathematical spaces before diving into KGE models and their mathematical properties. We further discuss different KGE methods over the three categories, as well as summarise how spatial advantages work over different embedding needs. By collating the experimental results from downstream tasks, we also explore the advantages of mathematical space in different scenarios and the reasons behind them. We further state some promising research directions from a representation space perspective, with which we hope to inspire researchers to design their KGE models as well as their related applications with more consideration of their mathematical space properties.

學成 · 大數據 · 相同 · 人工智能 · 統計方法 ·

2020 年 5 月 5 日

A Survey of Learning Causality with Data: Problems and Methods

Ruocheng Guo,Lu Cheng,Jundong Li,P. Richard Hahn,Huan Liu

from arxiv, 35 pages, accepted by ACM CSUR

This work considers the question of how convenient access to copious data impacts our ability to learn causal effects and relations. In what ways is learning causality in the era of big data different from -- or the same as -- the traditional one? To answer this question, this survey provides a comprehensive and structured review of both traditional and frontier methods in learning causality and relations along with the connections between causality and machine learning. This work points out on a case-by-case basis how big data facilitates, complicates, or motivates each approach.

INFORMS · 圖 · 可約的 · 知識圖譜 · 可辨認的 ·

2018 年 8 月 29 日

Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction

Yi Luan,Luheng He,Mari Ostendorf,Hannaneh Hajishirzi

We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks and develop a unified framework called Scientific Information Extractor (SciIE) for with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.