精品夜色国产国偷自产乱码,亚洲AV永久少妇精品一区在线

Context: Deep learning has achieved remarkable progress in various domains. However, like traditional software systems, deep learning systems contain bugs, which can have severe impacts, as evidenced by crashes involving autonomous vehicles. Despite substantial advancements in deep learning techniques, little research has focused on reproducing deep learning bugs, which hinders resolving them. Existing literature suggests that only 3% of deep learning bugs are reproducible, underscoring the need for further research. Objective: This paper examines the reproducibility of deep learning bugs. We identify edit actions and useful information that could improve deep learning bug reproducibility. Method: First, we construct a dataset of 668 deep learning bugs from Stack Overflow and Defects4ML across 3 frameworks and 22 architectures. Second, we select 102 bugs using stratified sampling and try to determine their reproducibility. While reproducing these bugs, we identify edit actions and useful information necessary for their reproduction. Third, we used the Apriori algorithm to identify useful information and edit actions required to reproduce specific bug types. Finally, we conduct a user study with 22 developers to assess the effectiveness of our findings in real-life settings. Results: We successfully reproduced 85 bugs and identified ten edit actions and five useful information categories that can help us reproduce deep learning bugs. Our findings improved bug reproducibility by 22.92% and reduced reproduction time by 24.35% based on our user study. Conclusions: Our research addresses the critical issue of deep learning bug reproducibility. Practitioners and researchers can leverage our findings to improve deep learning bug reproducibility.

相關內容

Learning

關注 12

大語言模型 · 語言模型化 · MoDELS · 輸出 · 可理解性 ·

2024 年 2 月 21 日

Beyond Probabilities: Unveiling the Misalignment in Evaluating Large Language Models

Chenyang Lyu,Minghao Wu,Alham Fikri Aji

Large Language Models (LLMs) have demonstrated remarkable capabilities across various applications, fundamentally reshaping the landscape of natural language processing (NLP) research. However, recent evaluation frameworks often rely on the output probabilities of LLMs for predictions, primarily due to computational constraints, diverging from real-world LLM usage scenarios. While widely employed, the efficacy of these probability-based evaluation strategies remains an open research question. This study aims to scrutinize the validity of such probability-based evaluation methods within the context of using LLMs for Multiple Choice Questions (MCQs), highlighting their inherent limitations. Our empirical investigation reveals that the prevalent probability-based evaluation method inadequately aligns with generation-based prediction. Furthermore, current evaluation frameworks typically assess LLMs through predictive tasks based on output probabilities rather than directly generating responses, owing to computational limitations. We illustrate that these probability-based approaches do not effectively correspond with generative predictions. The outcomes of our study can enhance the understanding of LLM evaluation methodologies and provide insights for future research in this domain.

INFORMS · 泛化理論 · Analysis · MoDELS · 圖 ·

2024 年 2 月 20 日

Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective

Tianyi Qiu,Fanzhi Zeng,Jiaming Ji,Dong Yan,Kaile Wang,Jiayi Zhou,Han Yang,Josef Dai,Xuehai Pan,Yaodong Yang

There is a trilemma in reinforcement learning from human feedback (RLHF): the incompatibility between highly diverse contexts, low labeling cost, and reliable alignment performance. Here we aim to mitigate such incompatibility through the design of dataset information structures during reward modeling, and meanwhile propose new, generalizable methods of analysis that have wider applications, including potentially shedding light on goal misgeneralization. Specifically, we first reexamine the RLHF process and propose a theoretical framework portraying it as an autoencoding process over text distributions. Our framework formalizes the RLHF objective of ensuring distributional consistency between human preference and large language model (LLM) behavior. Based on this framework, we introduce a new method to model generalization in the reward modeling stage of RLHF, the induced Bayesian network (IBN). Drawing from random graph theory and causal analysis, it enables empirically grounded derivation of generalization error bounds, a key improvement over classical methods of generalization analysis. An insight from our analysis is the superiority of the tree-based information structure in reward modeling, compared to chain-based baselines in conventional RLHF methods. We derive that in complex contexts with limited data, the tree-based reward model (RM) induces up to $\Theta(\log n/\log\log n)$ times less variance than chain-based RM where $n$ is the dataset size. As validation, we demonstrate that on three NLP tasks, the tree-based RM achieves 65% win rate on average against chain-based baselines. Looking ahead, we hope to extend the IBN analysis to help understand the phenomenon of goal misgeneralization.

詞表 · INFORMS · 后向 · Projection · 語言模型化 ·

2024 年 2 月 20 日

Backward Lens: Projecting Language Model Gradients into the Vocabulary Space

Shahar Katz,Yonatan Belinkov,Mor Geva,Lior Wolf

Understanding how Transformer-based Language Models (LMs) learn and recall information is a key goal of the deep learning community. Recent interpretability methods project weights and hidden states obtained from the forward pass to the models' vocabularies, helping to uncover how information flows within LMs. In this work, we extend this methodology to LMs' backward pass and gradients. We first prove that a gradient matrix can be cast as a low-rank linear combination of its forward and backward passes' inputs. We then develop methods to project these gradients into vocabulary items and explore the mechanics of how new information is stored in the LMs' neurons.

ML · Learning · 同態加密 · Backbone · Processing（編程語言） ·

2024 年 2 月 18 日

You Still See Me: How Data Protection Supports the Architecture of ML Surveillance

Rui-Jie Yew,Lucy Qin,Suresh Venkatasubramanian

from arxiv, A version of this work was accepted at the 2023 NeurIPS Workshop on Regulatable ML

Human data forms the backbone of machine learning. Data protection laws thus have strong bearing on how ML systems are governed. Given that most requirements in data protection laws accompany the processing of personal data, organizations have an incentive to keep their data out of legal scope. This makes the development and application of certain privacy-preserving techniques--data protection techniques--an important strategy for ML compliance. In this paper, we examine the impact of a rhetoric that deems data wrapped in these techniques as data that is "good-to-go". We show how their application in the development of ML systems--from private set intersection as part of dataset curation to homomorphic encryption and federated learning as part of model computation--can further support individual monitoring and data consolidation. With data accumulation at the core of how the ML pipeline is configured, we argue that data protection techniques are often instrumentalized in ways that support infrastructures of surveillance, rather than in ways that protect individuals associated with data. Finally, we propose technology and policy strategies to evaluate data protection techniques in light of the protections they actually confer. We conclude by highlighting the role that technologists might play in devising policies that combat surveillance ML technologies.

MoDELS · INFORMS · Learning · Taxonomy · Extensibility ·

2024 年 2 月 18 日

A Survey of What to Share in Federated Learning: Perspectives on Model Utility, Privacy Leakage, and Communication Efficiency

Jiawei Shao,Zijian Li,Wenqiang Sun,Tailin Zhou,Yuchang Sun,Lumin Liu,Zehong Lin,Yuyi Mao,Jun Zhang

Federated learning (FL) has emerged as a secure paradigm for collaborative training among clients. Without data centralization, FL allows clients to share local information in a privacy-preserving manner. This approach has gained considerable attention, promoting numerous surveys to summarize the related works. However, the majority of these surveys concentrate on FL methods that share model parameters during the training process, while overlooking the possibility of sharing local information in other forms. In this paper, we present a systematic survey from a new perspective of what to share in FL, with an emphasis on the model utility, privacy leakage, and communication efficiency. First, we present a new taxonomy of FL methods in terms of three sharing methods, which respectively share model, synthetic data, and knowledge. Second, we analyze the vulnerability of different sharing methods to privacy attacks and review the defense mechanisms. Third, we conduct extensive experiments to compare the learning performance and communication overhead of various sharing methods in FL. Besides, we assess the potential privacy leakage through model inversion and membership inference attacks, while comparing the effectiveness of various defense approaches. Finally, we identify future research directions and conclude the survey.

Engineering · MoDELS · ML · Processing（編程語言） · 機器學習建模 ·

2024 年 2 月 16 日

An Exploratory Study of V-Model in Building ML-Enabled Software: A Systems Engineering Perspective

Jie JW Wu

from arxiv, 11 pages, 2 figures, 2 tables. Accepted at CAIN 2024 (3rd International Conference on AI Engineering - Software Engineering for AI)

Machine learning (ML) components are being added to more and more critical and impactful software systems, but the software development process of real-world production systems from prototyped ML models remains challenging with additional complexity and interdisciplinary collaboration challenges. This poses difficulties in using traditional software lifecycle models such as waterfall, spiral, or agile models when building \textit{ML-enabled systems}. In this research, we apply a Systems Engineering lens to investigate the use of V-Model in addressing the interdisciplinary collaboration challenges when building ML-enabled systems. By interviewing practitioners from software companies, we established a set of 8 propositions for using V-Model to manage interdisciplinary collaborations when building products with ML components. Based on the propositions, we found that despite requiring additional efforts, the characteristics of V-Model align effectively with several collaboration challenges encountered by practitioners when building ML-enabled systems. We recommend future research to investigate new process models that leverage the characteristics of V-Model such as the system decomposition, clear system boundary, and consistency of Validation \& Verification (V\&V) for building ML-enabled systems.

知識 (knowledge) · Processing（編程語言） · 圖 · NLP · 知識圖譜 ·

2022 年 9 月 30 日

A Decade of Knowledge Graphs in Natural Language Processing: A Survey

Phillip Schneider,Tim Schopf,Juraj Vladika,Mikhail Galkin,Elena Simperl,Florian Matthes

from arxiv, Accepted to AACL-IJCNLP 2022

In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.

數據增強 · Processing（編程語言） · Better · 多樣性 · 測試數據 ·

2021 年 10 月 5 日

Data Augmentation Approaches in Natural Language Processing: A Survey

Bohan Li,Yutai Hou,Wanxiang Che

As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where deep learning techniques may fail. It is widely applied in computer vision then introduced to natural language processing and achieves improvements in many tasks. One of the main focuses of the DA methods is to improve the diversity of training data, thereby helping the model to better generalize to unseen testing data. In this survey, we frame DA methods into three categories based on the diversity of augmented data, including paraphrasing, noising, and sampling. Our paper sets out to analyze DA methods in detail according to the above categories. Further, we also introduce their applications in NLP tasks as well as the challenges.

DNN · 深度學習 · 學成 · MoDELS · 有向 ·

2021 年 9 月 13 日

Explainable Deep Learning: A Field Guide for the Uninitiated

Gabrielle Ras,Ning Xie,Marcel van Gerven,Derek Doran

from arxiv, Survey paper on Explainable Deep Learning, 70 pages including references, 13 figures, 5 tables

Deep neural networks (DNNs) have become a proven and indispensable machine learning tool. As a black-box model, it remains difficult to diagnose what aspects of the model's input drive the decisions of a DNN. In countless real-world domains, from legislation and law enforcement to healthcare, such diagnosis is essential to ensure that DNN decisions are driven by aspects appropriate in the context of its use. The development of methods and studies enabling the explanation of a DNN's decisions has thus blossomed into an active, broad area of research. A practitioner wanting to study explainable deep learning may be intimidated by the plethora of orthogonal directions the field has taken. This complexity is further exacerbated by competing definitions of what it means ``to explain'' the actions of a DNN and to evaluate an approach's ``ability to explain''. This article offers a field guide to explore the space of explainable deep learning aimed at those uninitiated in the field. The field guide: i) Introduces three simple dimensions defining the space of foundational methods that contribute to explainable deep learning, ii) discusses the evaluations for model explanations, iii) places explainability in the context of other related deep learning research areas, and iv) finally elaborates on user-oriented explanation designing and potential future directions on explainable deep learning. We hope the guide is used as an easy-to-digest starting point for those just embarking on research in this field.

MoDELS · 協同過濾 · INFORMS · Neural Networks · Networks ·

2021 年 4 月 27 日

A Survey on Neural Recommendation: From Collaborative Filtering to Content and Context Enriched Recommendation

Le Wu,Xiangnan He,Xiang Wang,Kun Zhang,Meng Wang

from arxiv, In submission

Influenced by the stunning success of deep learning in computer vision and language understanding, research in recommendation has shifted to inventing new recommender models based on neural networks. In recent years, we have witnessed significant progress in developing neural recommender models, which generalize and surpass traditional recommender models owing to the strong representation power of neural networks. In this survey paper, we conduct a systematic review on neural recommender models, aiming to summarize the field to facilitate future progress. Distinct from existing surveys that categorize existing methods based on the taxonomy of deep learning techniques, we instead summarize the field from the perspective of recommendation modeling, which could be more instructive to researchers and practitioners working on recommender systems. Specifically, we divide the work into three types based on the data they used for recommendation modeling: 1) collaborative filtering models, which leverage the key source of user-item interaction data; 2) content enriched models, which additionally utilize the side information associated with users and items, like user profile and item knowledge graph; and 3) context enriched models, which account for the contextual information associated with an interaction, such as time, location, and the past interactions. After reviewing representative works for each type, we finally discuss some promising directions in this field, including benchmarking recommender systems, graph reasoning based recommendation models, and explainable and fair recommendations for social good.