While deep neural network models offer unmatched classification performance, they are prone to learning spurious correlations in the data. Such dependencies on confounding information can be difficult to detect using performance metrics if the test data comes from the same distribution as the training data. Interpretable ML methods such as post-hoc explanations or inherently interpretable classifiers promise to identify faulty model reasoning. However, there is mixed evidence as to whether many of these techniques are actually able to do so. In this paper, we propose a rigorous evaluation strategy to assess an explanation technique's ability to correctly identify spurious correlations. Using this strategy, we evaluate five post-hoc explanation techniques and one inherently interpretable method for their ability to detect three types of artificially added confounders in a chest X-ray diagnosis task. We find that the post-hoc technique SHAP and the inherently interpretable Attri-Net provide the best performance and can be used to reliably identify faulty model behavior.

Related content

ONCE · MoDELS · Decoding · Model evaluation · Performance

September 28, 2023

Can the Query-based Object Detector Be Designed with Fewer Stages?

Jialin Li, Weifu Fu, Yuhuan Lin, Qiang Nie, Yong Liu

Query-based object detectors have made significant advancements since the publication of DETR. However, most existing methods still rely on multi-stage encoders and decoders, or a combination of both. Despite achieving high accuracy, the multi-stage paradigm (typically consisting of 6 stages) suffers from issues such as heavy computational burden, prompting us to reconsider its necessity. In this paper, we explore multiple techniques to enhance query-based detectors and, based on these findings, propose a novel model called GOLO (Global Once and Local Once), which follows a two-stage decoding paradigm. Compared to other mainstream query-based models with multi-stage decoders, our model employs fewer decoder stages while still achieving considerable performance. Experimental results on the COCO dataset demonstrate the effectiveness of our approach.

Networking · Analysis · Weight · DSS · Lecture notes

September 28, 2023

Brand Network Booster: A New System for Improving Brand Connectivity

J. Cancellieri, W. Didimo, A. Fronzetti Colladon, F. Montecchiani

This paper presents a new decision support system offered for an in-depth analysis of semantic networks, which can provide insights for a better exploration of a brand's image and the improvement of its connectivity. In terms of network analysis, we show that this goal is achieved by solving an extended version of the Maximum Betweenness Improvement problem, which includes the possibility of considering adversarial nodes, constrained budgets, and weighted networks - where connectivity improvement can be obtained by adding links or increasing the weight of existing connections. We present this new system together with two case studies, also discussing its performance. Our tool and approach are useful both for network scholars and for supporting the strategic decision-making processes of marketing and communication managers.

Context window · Better · MoDELS · Flurry · Tokenizer

September 27, 2023

Scalable Multi-Robot Collaboration with Large Language Models: Centralized or Decentralized Systems?

Yongchao Chen, Jacob Arkin, Yang Zhang, Nicholas Roy, Chuchu Fan

from arXiv, 6 pages, 8 figures

A flurry of recent work has demonstrated that pre-trained large language models (LLMs) can be effective task planners for a variety of single-robot tasks. The planning performance of LLMs is significantly improved via prompting techniques, such as in-context learning or re-prompting with state feedback, placing new importance on the token budget for the context window. An under-explored but natural next direction is to investigate LLMs as multi-robot task planners. However, long-horizon, heterogeneous multi-robot planning introduces new challenges of coordination while also pushing up against the limits of context window length. It is therefore critical to find token-efficient LLM planning frameworks that are also able to reason about the complexities of multi-robot coordination. In this work, we compare the task success rate and token efficiency of four multi-agent communication frameworks (centralized, decentralized, and two hybrid) as applied to four coordination-dependent multi-agent 2D task scenarios for increasing numbers of agents. We find that a hybrid framework achieves better task success rates across all four tasks and scales better to more agents. We further demonstrate the hybrid frameworks in 3D simulations where the vision-to-text problem and dynamical errors are considered. See our project website https://yongchao98.github.io/MIT-REALM-Multi-Robot/ for prompts, videos, and code.

Speech recognition · Language modeling · MoDELS · Reducible · Benchmark

September 27, 2023

HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models

Chen Chen, Yuchen Hu, Chao-Han Huck Yang, Sabato Macro Siniscalchi, Pin-Yu Chen, Eng Siong Chng

from arXiv, Accepted to NeurIPS 2023, 24 pages. Datasets and Benchmarks Track

Advancements in deep neural networks have allowed automatic speech recognition (ASR) systems to attain human parity on several publicly available clean speech datasets. However, even state-of-the-art ASR systems experience performance degradation when confronted with adverse conditions, as a well-trained acoustic model is sensitive to variations in the speech domain, e.g., background noise. Intuitively, humans address this issue by relying on their linguistic knowledge: the meaning of ambiguous spoken terms is usually inferred from contextual cues, thereby reducing the dependency on the auditory system. Inspired by this observation, we introduce the first open-source benchmark to utilize external large language models (LLMs) for ASR error correction, where N-best decoding hypotheses provide informative elements for true transcription prediction. This approach is a paradigm shift from the traditional language model rescoring strategy that can only select one candidate hypothesis as the output transcription. The proposed benchmark contains a novel dataset, HyPoradise (HP), encompassing more than 334,000 pairs of N-best hypotheses and corresponding accurate transcriptions across prevalent speech domains. Given this dataset, we examine three types of error correction techniques based on LLMs with varying amounts of labeled hypotheses-transcription pairs, which yield a significant word error rate (WER) reduction. Experimental evidence demonstrates that the proposed technique achieves a breakthrough by surpassing the upper bound of traditional re-ranking based methods. More surprisingly, an LLM with a reasonable prompt and its generative capability can even correct tokens that are missing from the N-best list. We make our results publicly accessible for reproducible pipelines with released pre-trained models, thus providing a new evaluation paradigm for ASR error correction with LLMs.
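To make the re-ranking upper bound mentioned in this abstract concrete, here is a minimal sketch in plain Python. It computes word error rate (WER) as word-level edit distance divided by reference length, then compares the best hypothesis a re-ranker could possibly select from an N-best list against a corrected transcription. The function names, the example sentences, and the "corrected" string are all hypothetical and written by hand for illustration; they are not taken from the HyPoradise dataset or code, and no LLM is called.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def oracle_rerank_wer(reference: str, n_best: list[str]) -> float:
    """Lowest WER reachable by selecting a single hypothesis from the list."""
    return min(word_error_rate(reference, h) for h in n_best)


# Hypothetical example: the word "kitchen" never appears in the N-best list,
# so no re-ranker can recover it by selection alone.
reference = "turn on the kitchen lights"
n_best = [
    "turn on the chicken lights",
    "turn on the chicken light",
    "turn in the chicken lights",
]
# Hand-written stand-in for what a generative corrector might output.
corrected = "turn on the kitchen lights"

rerank_bound = oracle_rerank_wer(reference, n_best)
corrected_wer = word_error_rate(reference, corrected)
print(f"oracle re-ranking WER: {rerank_bound:.2f}")
print(f"corrected WER:         {corrected_wer:.2f}")
```

In this toy case selection can do no better than one wrong word out of five, whereas a generated transcription is free to contain a token that no hypothesis includes, which is the sense in which generative correction can go below the re-ranking bound.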

INFORMS · Identifiable · Twitter · Taxonomy · Guidance

September 26, 2023

A Tale of Two Cultures: Comparing Interpersonal Information Disclosure Norms on Twitter

Mainack Mondal, Anju Punuru, Tyng-Wen Scott Cheng, Kenneth Vargas, Chaz Gundry, Nathan S Driggs, Noah Schill, Nathaniel Carlson, Josh Bedwell, Jaden Q Lorenc, Isha Ghosh, Yao Li, Nancy Fulda, Xinru Page

from arXiv, This work will be presented at the 26th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW 2023). This paper will also be published in The Proceedings of the ACM on Human Computer Interaction

We present an exploration of cultural norms surrounding online disclosure of information about one's interpersonal relationships (such as information about family members, colleagues, friends, or lovers) on Twitter. The literature identifies the cultural dimension of individualism versus collectivism as being a major determinant of offline communication differences in terms of emotion, topic, and content disclosed. We decided to study whether such differences also occur online in the context of Twitter when comparing tweets posted in an individualistic (U.S.) versus a collectivist (India) society. We collected more than 2 million tweets posted in the U.S. and India over a 3-month period which contain interpersonal relationship keywords. A card-sort study was used to develop this culturally-sensitive saturated taxonomy of keywords that represent interpersonal relationships (e.g., ma, mom, mother). Then we developed a high-accuracy interpersonal disclosure detector based on dependency-parsing (F1-score: 86%) to identify when the words refer to a personal relationship of the poster (e.g., "my mom" as opposed to "a mom"). This allowed us to identify the 400K+ tweets in our data set which actually disclose information about the poster's interpersonal relationships. We used a mixed methods approach to analyze these tweets (e.g., comparing the amount of joy expressed about one's family) and found differences in emotion, topic, and content disclosed between tweets from the U.S. versus India. Our analysis also reveals how a combination of qualitative and quantitative methods is needed to uncover these differences; using just one or the other can be misleading. This study extends the prior literature on Multi-Party Privacy and provides guidance for researchers and designers of culturally-sensitive systems.

Continuity · Learning · Vision · Computer vision · Batch learning

September 23, 2021

Recent Advances of Continual Learning in Computer Vision: An Overview

Haoxuan Qu, Hossein Rahmani, Li Xu, Bryan Williams, Jun Liu

from arXiv, 21 pages, 5 figures

In contrast to batch learning where all training data is available at once, continual learning represents a family of methods that accumulate knowledge and learn continuously with data available in sequential order. Similar to the human learning process with the ability of learning, fusing, and accumulating new knowledge coming at different time steps, continual learning is considered to have high practical significance. Hence, continual learning has been studied in various artificial intelligence tasks. In this paper, we present a comprehensive review of the recent progress of continual learning in computer vision. In particular, the works are grouped by their representative techniques, including regularization, knowledge distillation, memory, generative replay, parameter isolation, and a combination of the above techniques. For each category of these techniques, both its characteristics and applications in computer vision are presented. At the end of this overview, several subareas, where continuous knowledge accumulation is potentially helpful while continual learning has not been well studied, are discussed.

DNN · Deep learning · Learning · MoDELS · Directed

September 13, 2021

Explainable Deep Learning: A Field Guide for the Uninitiated

Gabrielle Ras, Ning Xie, Marcel van Gerven, Derek Doran

from arXiv, Survey paper on Explainable Deep Learning, 70 pages including references, 13 figures, 5 tables

Deep neural networks (DNNs) have become a proven and indispensable machine learning tool. Because a DNN is a black-box model, it remains difficult to diagnose what aspects of the model's input drive its decisions. In countless real-world domains, from legislation and law enforcement to healthcare, such diagnosis is essential to ensure that DNN decisions are driven by aspects appropriate in the context of its use. The development of methods and studies enabling the explanation of a DNN's decisions has thus blossomed into an active, broad area of research. A practitioner wanting to study explainable deep learning may be intimidated by the plethora of orthogonal directions the field has taken. This complexity is further exacerbated by competing definitions of what it means "to explain" the actions of a DNN and to evaluate an approach's "ability to explain". This article offers a field guide to explore the space of explainable deep learning aimed at those uninitiated in the field. The field guide: i) introduces three simple dimensions defining the space of foundational methods that contribute to explainable deep learning, ii) discusses the evaluations for model explanations, iii) places explainability in the context of other related deep learning research areas, and iv) finally elaborates on user-oriented explanation design and potential future directions for explainable deep learning. We hope the guide is used as an easy-to-digest starting point for those just embarking on research in this field.

MoDELS · Transformer model · Transformation · Inference · Model evaluation

June 23, 2020

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Zhuohan Li, Eric Wallace, Sheng Shen, Kevin Lin, Kurt Keutzer, Dan Klein, Joseph E. Gonzalez

from arXiv, ICML 2020

Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference. We study the impact of model size in this setting, focusing on Transformer models for NLP tasks that are limited by compute: self-supervised pretraining and high-resource machine translation. We first show that even though smaller Transformer models execute faster per iteration, wider and deeper models converge in significantly fewer steps. Moreover, this acceleration in convergence typically outpaces the additional computational overhead of using larger models. Therefore, the most compute-efficient training strategy is to counterintuitively train extremely large models but stop after a small number of iterations. This leads to an apparent trade-off between the training efficiency of large Transformer models and the inference efficiency of small Transformer models. However, we show that large models are more robust to compression techniques such as quantization and pruning than small models. Consequently, one can get the best of both worlds: heavily compressed, large models achieve higher accuracy than lightly compressed, small models.
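For readers unfamiliar with the two compression techniques this abstract names, the following is a minimal sketch in plain Python of magnitude pruning and symmetric uniform quantization applied to a flat list of weights. It is a generic textbook-style illustration, not the authors' implementation; the function names, the per-list scaling choice, and the toy weight values are assumptions made for this example.

```python
def magnitude_prune(weights: list[float], sparsity: float) -> list[float]:
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    num_to_drop = int(len(weights) * sparsity)
    # Indices sorted from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    dropped = set(order[:num_to_drop])
    return [0.0 if i in dropped else w for i, w in enumerate(weights)]


def quantize_uniform(weights: list[float], num_bits: int) -> list[float]:
    """Round weights to a symmetric uniform grid and return the dequantized values.

    Assumes a single scale for the whole list, chosen so that the largest
    magnitude maps to the largest representable signed integer level.
    """
    if num_bits < 2:
        raise ValueError("num_bits must be at least 2 for a signed grid")
    max_level = 2 ** (num_bits - 1) - 1
    max_abs = max((abs(w) for w in weights), default=0.0)
    if max_abs == 0.0:
        return list(weights)
    scale = max_abs / max_level
    return [round(w / scale) * scale for w in weights]


# Toy weights chosen by hand for illustration.
weights = [0.5, -0.1, 0.3, -0.8]
pruned = magnitude_prune(weights, sparsity=0.5)
quantized = quantize_uniform(weights, num_bits=2)
print("pruned:   ", pruned)
print("quantized:", quantized)
```

Both functions trade precision for a smaller representation: pruning discards the weights assumed to matter least, while quantization keeps every weight but stores it at a coarser resolution.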

Text classification · Language modeling · BERT · state-of-the-art · MoDELS

May 14, 2019

How to Fine-Tune BERT for Text Classification?

Chi Sun, Xipeng Qiu, Yige Xu, Xuanjing Huang

Language model pre-training has proven to be useful in learning universal language representations. As a state-of-the-art language model pre-training model, BERT (Bidirectional Encoder Representations from Transformers) has achieved amazing results in many language understanding tasks. In this paper, we conduct exhaustive experiments to investigate different fine-tuning methods of BERT on the text classification task and provide a general solution for BERT fine-tuning. Finally, the proposed solution obtains new state-of-the-art results on eight widely-studied text classification datasets.

Unsupervised · MoDELS · Networking · Transformation · AIM

March 27, 2019

Small Data Challenges in Big Data Era: A Survey of Recent Progress on Unsupervised and Semi-Supervised Methods

Guo-Jun Qi, Jiebo Luo

Small data challenges have emerged in many learning problems, since the success of deep neural networks often relies on the availability of a huge amount of labeled data that is expensive to collect. To address this, many efforts have been made on training complex models with small data in an unsupervised and semi-supervised fashion. In this paper, we will review the recent progress on these two major categories of methods. A wide spectrum of small data models will be categorized in a big picture, where we will show how they interplay with each other to motivate explorations of new ideas. We will review the criteria of learning the transformation equivariant, disentangled, self-supervised and semi-supervised representations, which underpin the foundations of recent developments. Many instantiations of unsupervised and semi-supervised generative models have been developed on the basis of these criteria, greatly expanding the territory of existing autoencoders, generative adversarial nets (GANs) and other deep networks by exploring the distribution of unlabeled data for more powerful representations. While we focus on the unsupervised and semi-supervised methods, we will also provide a broader review of other emerging topics, from unsupervised and semi-supervised domain adaptation to the fundamental roles of transformation equivariance and invariance in training a wide spectrum of deep networks. It is impossible for us to write an exhaustive encyclopedia to include all related works. Instead, we aim at exploring the main ideas, principles and methods in this area to reveal where we are heading on the journey towards addressing the small data challenges in this big data era.