
Machine learning models are increasingly being used in critical sectors, but their black-box nature has raised concerns about accountability and trust. The field of explainable artificial intelligence (XAI) or explainable machine learning (XML) has emerged in response to the need for human understanding of these models. Evolutionary computing (EC), as a family of powerful optimization and learning tools, has significant potential to contribute to XAI/XML. In this chapter, we provide a brief introduction to XAI/XML and review various techniques in current use for explaining machine learning models. We then focus on how evolutionary computing can be used in XAI/XML, and review some approaches that incorporate EC techniques. We also discuss some open challenges in XAI/XML and opportunities for future research in this field using EC. Our aim is to demonstrate that evolutionary computing is well suited to addressing current problems in explainability, and to encourage further exploration of these methods to contribute to the development of more transparent, trustworthy and accountable machine learning models.
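As one concrete illustration of how EC can serve explainability (a hypothetical sketch, not the chapter's own method), the following toy uses a simple (1+lambda)-style evolution strategy to search for a counterfactual explanation: the smallest feature perturbation that flips a black-box classifier's prediction. The function name, toy classifier, and parameters sigma/lam/iters are illustrative assumptions.

```python
# Hypothetical sketch: counterfactual search with a (1+lambda)-style evolution
# strategy. Not the chapter's method; one way EC techniques can support XAI.
import numpy as np

def counterfactual_es(predict, x, target, sigma=0.1, lam=20, iters=200, seed=0):
    """Evolve a small perturbation of x so that predict() returns `target`."""
    rng = np.random.default_rng(seed)
    best, best_cost = None, np.inf
    parent = x.copy()
    for _ in range(iters):
        # Generate lam mutated offspring around the current parent.
        offspring = parent + sigma * rng.standard_normal((lam, x.size))
        for cand in offspring:
            flipped = predict(cand) == target
            cost = np.linalg.norm(cand - x)          # prefer small changes
            if flipped and cost < best_cost:
                best, best_cost = cand.copy(), cost
        # Keep searching around the best counterfactual found so far.
        if best is not None:
            parent = best
    return best  # None if no counterfactual was found

# Toy black box: class 1 iff the feature sum exceeds 1.
predict = lambda z: int(z.sum() > 1.0)
x = np.zeros(4)                                      # currently class 0
cf = counterfactual_es(predict, x, target=1)
print(cf, None if cf is None else predict(cf))
```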

Related content

Machine Learning is an international forum for research on computational approaches to learning. The journal publishes articles reporting substantive results from a wide range of learning methods applied to a variety of learning problems. It features papers that describe research on problems and methods, applications research, and issues of research methodology. Papers about learning problems or methods provide solid support through empirical studies, theoretical analysis, or comparison with psychological phenomena. Application papers show how learning methods can be applied to solve important application problems. Research methodology papers improve how machine learning research is conducted. All papers describe their supporting evidence in a way that other researchers can verify or replicate. Papers also detail the components of learning and discuss assumptions about knowledge representation and the performance task. Official website:

Single-parameter summaries of variable effects are desirable for ease of interpretation, but linear models, which would deliver these, may fit poorly to the data. A modern approach is to estimate the average partial effect -- the average slope of the regression function with respect to the predictor of interest -- using a doubly robust semiparametric procedure. Most existing work has focused on specific forms of nuisance function estimators. We extend the scope to arbitrary plug-in nuisance function estimation, allowing for the use of modern machine learning methods which in particular may deliver non-differentiable regression function estimates. Our procedure involves resmoothing a user-chosen first-stage regression estimator to produce a differentiable version, and modelling the conditional distribution of the predictors through a location-scale model. We show that our proposals lead to a semiparametric efficient estimator under relatively weak assumptions. Our theory makes use of a new result on the sub-Gaussianity of Lipschitz score functions that may be of independent interest. We demonstrate the attractive numerical performance of our approach in a variety of settings including ones with misspecification.
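For orientation, the target quantity and one standard doubly robust construction of this type can be written as below. This is a generic form, not the paper's exact estimator (which additionally involves resmoothing and a location-scale model for the predictors); the notation m, p, and rho is introduced here for illustration.

```latex
% The target: the average partial effect of the predictor of interest x_1,
% with regression function m(x) = E[Y | X = x]:
\[
  \theta \;=\; \mathbb{E}\!\left[ \frac{\partial m(X)}{\partial x_1} \right].
\]
% A standard doubly robust construction averages the efficient influence
% function with plug-in nuisance estimates \hat m and \hat\rho, where
% \rho(x) = -\partial \log p(x) / \partial x_1 is the negative score of the
% predictor density in the direction of interest:
\[
  \hat\theta \;=\; \frac{1}{n} \sum_{i=1}^{n}
    \left\{ \frac{\partial \hat m(X_i)}{\partial x_1}
            + \hat\rho(X_i)\,\bigl(Y_i - \hat m(X_i)\bigr) \right\}.
\]
```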

A primary challenge of physics-informed machine learning (PIML) is its generalization beyond the training domain, especially when dealing with complex physical problems represented by partial differential equations (PDEs). This paper aims to enhance the generalization capabilities of PIML, facilitating practical, real-world applications where accurate predictions in unexplored regions are crucial. We leverage the inherent causality and temporal sequential characteristics of PDE solutions to fuse PIML models with recurrent neural architectures based on systems of ordinary differential equations, referred to as neural oscillators. Through effectively capturing long-time dependencies and mitigating the exploding and vanishing gradient problem, neural oscillators foster improved generalization in PIML tasks. Extensive experimentation involving time-dependent nonlinear PDEs and biharmonic beam equations demonstrates the efficacy of the proposed approach. Incorporating neural oscillators outperforms existing state-of-the-art methods on benchmark problems across various metrics. Consequently, the proposed method improves the generalization capabilities of PIML, providing accurate solutions for extrapolation and prediction beyond the training data.
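A minimal sketch of what a neural-oscillator recurrent cell can look like follows: a system of second-order ODEs discretized in time. The weights, hidden size, and step size are hypothetical placeholders and this omits damping and gating terms, so it is not the paper's exact architecture.

```python
# Minimal sketch of a neural-oscillator recurrent cell: a system of second-order
# ODEs y'' = tanh(W y + V x + b), integrated with an explicit Euler-style step.
# Hypothetical shapes and weights; not the paper's exact architecture.
import numpy as np

def neural_oscillator(xs, hidden=16, dt=0.1, seed=0):
    """Run a sequence of inputs xs (T, d_in) through an oscillator cell."""
    rng = np.random.default_rng(seed)
    d_in = xs.shape[1]
    W = rng.standard_normal((hidden, hidden)) / np.sqrt(hidden)
    V = rng.standard_normal((hidden, d_in)) / np.sqrt(d_in)
    b = np.zeros(hidden)
    y = np.zeros(hidden)   # oscillator positions (the hidden state)
    v = np.zeros(hidden)   # oscillator velocities
    states = []
    for x in xs:
        v = v + dt * np.tanh(W @ y + V @ x + b)   # update velocities first
        y = y + dt * v                            # then positions
        states.append(y.copy())
    return np.stack(states)   # (T, hidden) trajectory, fed to a readout layer

states = neural_oscillator(np.sin(np.linspace(0, 6, 50))[:, None])
print(states.shape)  # (50, 16)
```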

The goal of inductive logic programming is to induce a logic program (a set of logical rules) that generalises training examples. Inducing programs with many rules and literals is a major challenge. To tackle this challenge, we introduce an approach where we learn small non-separable programs and combine them. We implement our approach in a constraint-driven ILP system. Our approach can learn optimal and recursive programs and perform predicate invention. Our experiments on multiple domains, including game playing and program synthesis, show that our approach can drastically outperform existing approaches in terms of predictive accuracies and learning times, sometimes reducing learning times from over an hour to a few seconds.
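To make the idea of combining small programs concrete, here is a hypothetical toy, not the paper's ILP system: two tiny Datalog-style sub-programs are joined into one, and naive forward chaining checks what the combined program derives from the background facts. The predicates, rule encoding, and helper functions are all illustrative assumptions.

```python
# Hypothetical toy illustrating "learn small programs and combine them".
# Rules are Datalog-style (head predicate, head args, body literals);
# uppercase strings are variables. Not the paper's actual ILP system.

facts = {("parent", ("ann", "bob")),
         ("parent", ("bob", "cat")),
         ("female", ("ann",))}

prog_a = [("grandparent", ("X", "Z"),
           [("parent", ("X", "Y")), ("parent", ("Y", "Z"))])]
prog_b = [("grandmother", ("X", "Z"),
           [("grandparent", ("X", "Z")), ("female", ("X",))])]
combined = prog_a + prog_b          # combine the separately learned sub-programs

def match(literal, fact, subst):
    """Extend substitution `subst` so that `literal` matches the ground `fact`."""
    pred, args = literal
    if pred != fact[0] or len(args) != len(fact[1]):
        return None
    s = dict(subst)
    for a, v in zip(args, fact[1]):
        if a[0].isupper():                   # variable
            if s.setdefault(a, v) != v:
                return None
        elif a != v:                         # constant mismatch
            return None
    return s

def forward_chain(rules, facts):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for head_pred, head_args, body in rules:
            substs = [{}]
            for lit in body:                 # enumerate substitutions for the body
                substs = [s2 for s in substs for f in facts
                          if (s2 := match(lit, f, s)) is not None]
            for s in substs:
                new = (head_pred, tuple(s[a] for a in head_args))
                if new not in facts:
                    facts.add(new)
                    changed = True
    return facts

derived = forward_chain(combined, facts)
print(("grandmother", ("ann", "cat")) in derived)   # True
```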

This paper studies model checking for general parametric regression models having no dimension reduction structure on the predictor vector. Using any U-statistic type test as an initial test, this paper combines the sample-splitting and conditional studentization approaches to construct a COnditionally Studentized Test (COST). Whether the initial test is global or local smoothing-based, and whether the dimension of the predictor vector and the number of parameters are fixed or diverge at a certain rate, the proposed test always has a normal weak limit under the null hypothesis. When the dimension of the predictor vector diverges to infinity at a faster rate than the number of parameters, or even than the sample size, these results still hold under some conditions. This shows the potential of our method to handle higher-dimensional problems. Further, the test can detect local alternatives distinct from the null hypothesis at the fastest possible rate of convergence in hypothesis testing. We also discuss the optimal sample splitting for power performance. The numerical studies offer information on its merits and limitations in finite-sample cases, including the setting where the dimension of the predictor vector equals the sample size. As a generic methodology, it could be applied to other testing problems.
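For intuition, the generic shape of a sample-split, conditionally studentized statistic is sketched below. This is schematic only and does not reproduce the paper's exact construction; the symbols D_1, D_2, and psi are introduced here for illustration.

```latex
% Schematic only. Split the sample into D_1 (used to estimate nuisance
% quantities) and D_2 of size n_2. With \hat\psi_i a centered, residual-type
% quantity for i \in D_2, computed conditionally on D_1, a conditionally
% studentized statistic has the form
\[
  T_n \;=\; \frac{n_2^{-1/2} \sum_{i \in D_2} \hat\psi_i}{\hat\sigma},
  \qquad
  \hat\sigma^2 \;=\; \frac{1}{n_2} \sum_{i \in D_2} \hat\psi_i^{\,2},
\]
% so that, conditionally on D_1, a central limit theorem gives T_n a standard
% normal weak limit under the null hypothesis.
```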

The prevalence of machine learning in biomedical research is rapidly growing, yet the trustworthiness of such research is often overlooked. While some previous works have investigated the ability of adversarial attacks to degrade model performance in medical imaging, the ability to falsely improve performance via recently developed "enhancement attacks" may be a greater threat to biomedical machine learning. In the spirit of developing attacks to better understand trustworthiness, we developed two techniques to drastically enhance prediction performance of classifiers with minimal changes to features: 1) general enhancement of prediction performance, and 2) enhancement of a particular method over another. Our enhancement framework falsely improved classifiers' accuracy from 50% to almost 100% while maintaining high feature similarities between original and enhanced data (Pearson's r > 0.99). Similarly, the method-specific enhancement framework was effective in falsely improving the performance of one method over another. For example, a simple neural network outperformed logistic regression by 17% on our enhanced dataset, although no performance differences were present in the original dataset. Crucially, the original and enhanced data were still similar (r = 0.99). Our results demonstrate the feasibility of minor data manipulations to achieve any desired prediction performance, which presents an interesting ethical challenge for the future of biomedical machine learning. These findings emphasize the need for more robust data provenance tracking and other precautionary measures to ensure the integrity of biomedical machine learning research.
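A minimal sketch of the flavor of a general enhancement attack follows (hypothetical and simplified, not the authors' procedure): add a small, class-dependent shift along one fixed direction so that a classifier looks accurate on data that is actually pure noise, while each enhanced sample remains almost perfectly correlated with the original. Sample sizes, dimensions, and the shift magnitude are illustrative choices.

```python
# Hypothetical, simplified sketch of a "general enhancement attack". Not the
# authors' procedure: shift each sample a small amount along a fixed direction,
# with the sign determined by its label, so classifier accuracy rises well above
# chance while per-sample feature similarity stays close to 1.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n, d = 200, 500
X = rng.standard_normal((n, d))              # noise features: true accuracy ~ 50%
y = rng.integers(0, 2, n)

u = rng.standard_normal(d)
u /= np.linalg.norm(u)                        # fixed unit direction
X_enh = X + np.where(y[:, None] == 1, 2.0, -2.0) * u   # small class-dependent shift

clf = LogisticRegression(max_iter=2000)
print("original CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
print("enhanced CV accuracy:", cross_val_score(clf, X_enh, y, cv=5).mean())
sims = [np.corrcoef(X[i], X_enh[i])[0, 1] for i in range(n)]
print("mean per-sample correlation:", np.mean(sims))   # remains close to 1
```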

The research in this article aims to find conditions of an algorithmic nature that are necessary and sufficient to transform any Boolean function in conjunctive normal form into a specific form that guarantees its satisfiability. To find such conditions, we use the concept of a special covering of a set introduced in [13], and investigate the connection between this concept and the notion of satisfiability of Boolean functions. As shown, the problem of the existence of a special covering for a set is equivalent to the Boolean satisfiability problem. An important result is therefore the proof of necessary and sufficient conditions for determining whether a special covering of the set exists under the special decomposition. This result allows us to formulate necessary and sufficient algorithmic conditions for Boolean satisfiability, considering a function in conjunctive normal form as a set of clauses. In parallel, as a result of the aforementioned algorithmic procedure, we obtain the values of the variables that ensure the satisfiability of this function. The terminology used, relating to graph theory, set theory, Boolean functions and complexity theory, is consistent with that of [1], [2], [3], [4]. The newly introduced terms are not used by other authors and do not contradict other terms.
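To fix notation, here is a small illustration, independent of the paper's special-covering machinery, of viewing a CNF function as a set of clauses and recovering variable values that ensure satisfiability. The encoding and the brute-force helper are illustrative, not the paper's algorithm.

```python
# Small illustration (not the paper's algorithm): a CNF formula as a set of
# clauses, each clause a set of literals (positive int = variable, negative
# int = its negation). Brute-force search returns a satisfying assignment.
from itertools import product

def satisfying_assignment(clauses, n_vars):
    for bits in product([False, True], repeat=n_vars):
        assign = {v + 1: bits[v] for v in range(n_vars)}
        if all(any(assign[abs(l)] == (l > 0) for l in clause) for clause in clauses):
            return assign
    return None   # unsatisfiable

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
clauses = [{1, -2}, {2, 3}, {-1, -3}]
print(satisfying_assignment(clauses, 3))   # {1: False, 2: False, 3: True}
```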

Graph-centric artificial intelligence (graph AI) has achieved remarkable success in modeling interacting systems prevalent in nature, from dynamical systems in biology to particle physics. The increasing heterogeneity of data calls for graph neural architectures that can combine multiple inductive biases. However, combining data from various sources is challenging because appropriate inductive bias may vary by data modality. Multimodal learning methods fuse multiple data modalities while leveraging cross-modal dependencies to address this challenge. Here, we survey 140 studies in graph-centric AI and realize that diverse data types are increasingly brought together using graphs and fed into sophisticated multimodal models. These models stratify into image-, language-, and knowledge-grounded multimodal learning. We put forward an algorithmic blueprint for multimodal graph learning based on this categorization. The blueprint serves as a way to group state-of-the-art architectures that treat multimodal data by choosing appropriately four different components. This effort can pave the way for standardizing the design of sophisticated multimodal architectures for highly complex real-world problems.
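As a schematic of the kind of pipeline such a blueprint organizes (the shapes, toy graph, and early-fusion choice here are hypothetical and are not the survey's four components): modality-specific node encodings are fused and then propagated over the graph.

```python
# Schematic sketch of multimodal graph learning (hypothetical shapes and fusion
# choice, not the survey's blueprint): fuse per-node image and text embeddings,
# then run one round of mean-aggregation message passing over the graph.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, d_img, d_txt, d_hid = 5, 8, 6, 4
img_feats = rng.standard_normal((n_nodes, d_img))    # e.g. from an image encoder
txt_feats = rng.standard_normal((n_nodes, d_txt))    # e.g. from a text encoder

edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]      # toy graph
A = np.eye(n_nodes)                                   # self-loops
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
A /= A.sum(axis=1, keepdims=True)                     # row-normalized mean aggregation

W_img = rng.standard_normal((d_img, d_hid)) / np.sqrt(d_img)
W_txt = rng.standard_normal((d_txt, d_hid)) / np.sqrt(d_txt)
H = np.tanh(img_feats @ W_img + txt_feats @ W_txt)    # early fusion of modalities
H = np.tanh(A @ H)                                    # one message-passing step
print(H.shape)                                        # (5, 4) multimodal node embeddings
```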

The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting. We survey recent theoretical progress that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behavior of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favorable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.
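A tiny numerical illustration of the interpolation and implicit-regularization theme (a generic sketch, not the survey's analysis): in overparametrized linear regression, the minimum-norm least-squares solution, which is the limit of gradient descent or gradient flow started at zero, fits noisy training data exactly and can still predict better than the trivial baseline when the true signal is low-dimensional. The dimensions and noise level are illustrative choices.

```python
# Generic illustration (not the survey's analysis): with more parameters than
# samples, the minimum-norm least-squares solution (the limit of gradient
# descent started at zero) interpolates the training data exactly and can
# still generalize when the true signal is sparse.
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, d = 50, 500, 100          # overparametrized: d > n_train
beta_true = np.zeros(d)
beta_true[:5] = 1.0                        # low-dimensional "simple component"

X = rng.standard_normal((n_train, d))
y = X @ beta_true + 0.1 * rng.standard_normal(n_train)
X_test = rng.standard_normal((n_test, d))
y_test = X_test @ beta_true

beta_hat = np.linalg.pinv(X) @ y           # minimum-norm interpolating solution
print("train MSE:", np.mean((X @ beta_hat - y) ** 2))            # ~ 0: interpolation
print("test MSE:", np.mean((X_test @ beta_hat - y_test) ** 2))   # below the baseline
print("baseline MSE (predict 0):", np.mean(y_test ** 2))
```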

Deep learning is usually described as an experiment-driven field and is under continuous criticism for lacking theoretical foundations. This problem has been partially addressed by a large volume of literature which has, so far, not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized into six groups: (1) complexity- and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamical systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drive the trajectories of the dynamical systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intense concerns about ethics and security and their relationship with generalizability.

Machine-learning models have demonstrated great success in learning complex patterns that enable them to make predictions about unobserved data. In addition to using models for prediction, the ability to interpret what a model has learned is receiving an increasing amount of attention. However, this increased focus has led to considerable confusion about the notion of interpretability. In particular, it is unclear how the wide array of proposed interpretation methods are related, and what common concepts can be used to evaluate them. We aim to address these concerns by defining interpretability in the context of machine learning and introducing the Predictive, Descriptive, Relevant (PDR) framework for discussing interpretations. The PDR framework provides three overarching desiderata for evaluation: predictive accuracy, descriptive accuracy and relevancy, with relevancy judged relative to a human audience. Moreover, to help manage the deluge of interpretation methods, we introduce a categorization of existing techniques into model-based and post-hoc categories, with sub-groups including sparsity, modularity and simulatability. To demonstrate how practitioners can use the PDR framework to evaluate and understand interpretations, we provide numerous real-world examples. These examples highlight the often under-appreciated role played by human audiences in discussions of interpretability. Finally, based on our framework, we discuss limitations of existing methods and directions for future work. We hope that this work will provide a common vocabulary that will make it easier for both practitioners and researchers to discuss and choose from the full range of interpretation methods.
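As a minimal, generic example of one post-hoc interpretation method of the kind this categorization covers (an illustration, not an implementation of the PDR framework itself): permutation feature importance computed for a fitted model on held-out data. The dataset and model choices are arbitrary placeholders.

```python
# Generic illustration of a post-hoc interpretation method (permutation feature
# importance), of the kind the model-based vs. post-hoc categorization covers.
# Not an implementation of the PDR framework itself.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=8, n_informative=3,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Predictive accuracy concerns the model's held-out performance; descriptive
# accuracy concerns how faithfully this summary reflects what the model learned.
result = permutation_importance(model, X_te, y_te, n_repeats=20, random_state=0)
for i in np.argsort(result.importances_mean)[::-1][:3]:
    print(f"feature {i}: importance {result.importances_mean[i]:.3f}")
```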
