The fruits of science are relationships made comprehensible, often by way of approximation. While deep learning is an extremely powerful way to find relationships in data, its use in science has been hindered by the difficulty of understanding the learned relationships. The Information Bottleneck (IB) is an information theoretic framework for understanding a relationship between an input and an output in terms of a trade-off between the fidelity and complexity of approximations to the relationship. Here we show that a crucial modification -- distributing bottlenecks across multiple components of the input -- opens fundamentally new avenues for interpretable deep learning in science. The Distributed Information Bottleneck throttles the downstream complexity of interactions between the components of the input, deconstructing a relationship into meaningful approximations found through deep learning without requiring custom-made datasets or neural network architectures. Applied to a complex system, the approximations illuminate aspects of the system's nature by restricting -- and monitoring -- the information about different components incorporated into the approximation. We demonstrate the Distributed IB's explanatory utility in systems drawn from applied mathematics and condensed matter physics. In the former, we deconstruct a Boolean circuit into approximations that isolate the most informative subsets of input components without requiring exhaustive search. In the latter, we localize information about future plastic rearrangement in the static structure of a sheared glass, and find the information to be more or less diffuse depending on the system's preparation. By way of a principled scheme of approximations, the Distributed IB brings much-needed interpretability to deep learning and enables unprecedented analysis of information flow through a system.
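
As a rough sketch of the trade-off described above, the objective can be written with one bottleneck per input component. The notation here is an assumption made for illustration and does not appear in the abstract: $X_1, \dots, X_N$ are the components of the input, $U_i$ is the compressed representation of $X_i$, $Y$ is the output, and $\beta \ge 0$ sets the strength of the bottleneck.

```latex
\mathcal{L}_{\text{Distributed IB}}
  = \beta \sum_{i=1}^{N} I(U_i; X_i) \;-\; I(U_1, \dots, U_N; Y)
```

Under this reading, minimizing the objective while sweeping $\beta$ traces out approximations of varying complexity, and the individual terms $I(U_i; X_i)$ are what can be monitored to see how much information about each component an approximation uses.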

Related content

INFORMS

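The INFORMS Journal on Computing publishes high-quality papers that expand the scope of operations research and computing. It seeks original research papers on theory, methods, experiments, systems, and applications; novel survey and tutorial papers; and papers describing new and useful software tools. Official website link:

Estimation/estimator · Analysis · Nuance · Lecture notes

June 5, 2022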

The Subtype-Free Average Causal Effect for Heterogeneous Disease Etiology

Amit Sasson,Molin Wang,Shuji Ogino,Daniel Nevo

Studies have shown that the effect an exposure may have on a disease can vary for different subtypes of the same disease. However, existing approaches to estimate and compare these effects largely overlook causality. In this paper, we study the effect smoking may have on having colorectal cancer subtypes defined by a trait known as microsatellite instability (MSI). We use principal stratification to propose an alternative causal estimand, the Subtype-Free Average Causal Effect (SF-ACE). The SF-ACE is the causal effect of the exposure among those who would be free from other disease subtypes under any exposure level. We study non-parametric identification of the SF-ACE, and discuss different monotonicity assumptions, which are more nuanced than in the standard setting. As is often the case with principal stratum effects, the assumptions underlying the identification of the SF-ACE from the data are untestable and can be too strong. Therefore, we also develop sensitivity analysis methods that relax these assumptions. We present three different estimators, including a doubly-robust estimator, for the SF-ACE. We implement our methodology for data from two large cohorts to study the heterogeneity in the causal effect of smoking on colorectal cancer with respect to MSI subtypes.
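
The verbal definition above can be written as a principal-stratum estimand. The following is only a sketch under assumptions not stated in the abstract: a binary exposure $a \in \{0, 1\}$, two disease subtypes, and $Y^{(k)}(a)$ denoting the potential indicator of developing subtype $k$ under exposure level $a$. The notation is illustrative.

```latex
\text{SF-ACE}^{(1)}
  = \mathbb{E}\!\left[\, Y^{(1)}(1) - Y^{(1)}(0) \;\middle|\; Y^{(2)}(0) = Y^{(2)}(1) = 0 \,\right]
```

The conditioning event selects individuals who would remain free of the other subtype under either exposure level, which is why identification relies on assumptions that cannot be tested from the observed data.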

Learning · Identifiable · Sparse · motivation · Round

June 4, 2022

Causal Discovery in Heterogeneous Environments Under the Sparse Mechanism Shift Hypothesis

Ronan Perry,Julius von Kügelgen,Bernhard Schölkopf

from arxiv, JvK and BS are shared last authors. 10 pages + references + appendix; 11 figures

Machine learning approaches commonly rely on the assumption of independent and identically distributed (i.i.d.) data. In reality, however, this assumption is almost always violated due to distribution shifts between environments. Although valuable learning signals can be provided by heterogeneous data from changing distributions, it is also known that learning under arbitrary (adversarial) changes is impossible. Causality provides a useful framework for modeling distribution shifts, since causal models encode both observational and interventional distributions. In this work, we explore the sparse mechanism shift hypothesis, which posits that distribution shifts occur due to a small number of changing causal conditionals. Motivated by this idea, we apply it to learning causal structure from heterogeneous environments, where i.i.d. data only allows for learning an equivalence class of graphs without restrictive assumptions. We propose the Mechanism Shift Score (MSS), a score-based approach amenable to various empirical estimators, which provably identifies the entire causal structure with high probability if the sparse mechanism shift hypothesis holds. Empirically, we verify behavior predicted by the theory and compare multiple estimators and score functions to identify the best approaches in practice. Compared to other methods, we show how MSS bridges a gap by both being nonparametric as well as explicitly leveraging sparse changes.

Periodic · Cell · Scenario · Identical · Discrete mathematics

June 3, 2022

The Structure of Configurations in One-Dimensional Majority Cellular Automata: From Cell Stability to Configuration Periodicity

Yonatan Nakar,Dana Ron

We study the dynamics of (synchronous) one-dimensional cellular automata with cyclical boundary conditions that evolve according to the majority rule with radius $ r $. We introduce a notion that we term cell stability with which we express the structure of the possible configurations that could emerge in this setting. Our main finding is that apart from the configurations of the form $ (0^{r+1}0^* + 1^{r+1}1^*)^* $, which are always fixed-points, the other configurations that the automata could possibly converge to, which are known to be either fixed-points or 2-cycles, have a particular spatially periodic structure. Namely, each of these configurations is of the form $ s^* $ where $ s $ consists of $ O(r^2) $ consecutive sequences of cells with the same state, each such sequence is of length at most $ r $, and the total length of $ s $ is $ O(r^2) $ as well. We show that an analogous result also holds for the minority rule.
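
The dynamics described above can be simulated directly. The following is a minimal sketch, not code from the paper: the function names `majority_step` and `classify` are hypothetical, cells are stored as a list of 0/1 integers, and the ring is assumed to have at least $2r + 1$ cells so that a neighborhood never wraps onto itself.

```python
def majority_step(cells, r):
    """One synchronous update of the radius-r majority rule on a ring.

    Each cell takes the majority state among the 2r + 1 cells centered on it
    (itself plus r neighbors on each side, with cyclic boundary conditions).
    The neighborhood size is odd, so ties cannot occur.
    """
    n = len(cells)
    if n < 2 * r + 1:
        raise ValueError("ring must contain at least 2r + 1 cells")
    nxt = []
    for i in range(n):
        ones = sum(cells[(i + d) % n] for d in range(-r, r + 1))
        nxt.append(1 if ones > r else 0)
    return nxt


def classify(cells, r):
    """Label a configuration as a fixed point, part of a 2-cycle, or transient."""
    once = majority_step(cells, r)
    if once == cells:
        return "fixed point"
    if majority_step(once, r) == cells:
        return "2-cycle"
    return "transient"


if __name__ == "__main__":
    # Blocks of length r + 1 = 3: a configuration of the always-fixed form.
    print(classify([0, 0, 0, 1, 1, 1], r=2))
    # Alternating states with r = 1 flip every step.
    print(classify([0, 1, 0, 1], r=1))
```

With $r = 2$, the configuration `000111` consists of blocks of length $r + 1$ and is left unchanged by the update, while with $r = 1$ the alternating configuration `0101` swaps every cell at each step and so lies on a 2-cycle.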

Learning · Bayesian network · Random variable · Continuity · Structure learning

June 3, 2022

Structure Learning for Hybrid Bayesian Networks

Wanchuang Zhu,Ngoc Lan Chi Nguyen,Sally Cripps

from arxiv, 45 pages, 4 figures, 6 tables

Bayesian networks have been used as a mechanism to represent the joint distribution of multiple random variables in a flexible yet interpretable manner. One major challenge in learning the structure of a network is how to model networks which include a mixture of continuous and discrete random variables, known as hybrid Bayesian networks. This paper reviews the literature on approaches to handle hybrid Bayesian networks. When working with hybrid Bayesian networks, typically one of two approaches is taken: either the data are considered to have a joint multivariate Gaussian distribution, irrespective of the true distribution, or continuous random variables are discretized, resulting in discrete Bayesian networks. In this paper, we show that a strategy to model all random variables as Gaussian outperforms the strategy which converts the continuous random variables to discrete. We demonstrate the superior performance of our strategy over the latter, theoretically and by simulation studies for various settings. Both strategies are also implemented on a childhood obesity data set. The two different strategies give rise to significant differences in the optimal graph structures, with the results of the simulation study suggesting that the inference from the strategy assuming all random variables are Gaussian is more reliable.

Parameter space · Reducible · Optimizer · Lecture notes · Subspace

June 2, 2022

A multi-fidelity approach coupling parameter space reduction and non-intrusive POD with application to structural optimization of passenger ship hulls

Marco Tezzele,Lorenzo Fabris,Matteo Sidari,Mauro Sicchiero,Gianluigi Rozza

Nowadays, the shipbuilding industry is facing a radical change towards solutions with a smaller environmental impact. This can be achieved with low emissions engines, optimized shape designs with lower wave resistance and noise generation, and by reducing the metal raw materials used during the manufacturing. This work focuses on the last aspect by presenting a complete structural optimization pipeline for modern passenger ship hulls which exploits advanced model order reduction techniques to reduce the dimensionality of both input parameters and outputs of interest. We introduce a novel approach which incorporates parameter space reduction through active subspaces into the proper orthogonal decomposition with interpolation method. This is done in a multi-fidelity setting. We test the whole framework on a simplified model of a midship section and on the full model of a passenger ship, controlled by 20 and 16 parameters, respectively. We present a comprehensive error analysis and show the capabilities and usefulness of the methods especially during the preliminary design phase, finding new unconsidered designs while handling high dimensional parameterizations.

MoDELS · Performer · Processing (programming language) · Learned · Robustness

September 3, 2021

Learning Neural Models for Natural Language Processing in the Face of Distributional Shift

Paul Michel

from arxiv, PhD thesis

The dominant NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction based question answering, or machine translation). However, it builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. Moreover, it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing, and propose benchmarks and evaluation metrics to measure its effect on current deep learning architectures. We then proceed to take steps to mitigate the effect of distributional shift on NLP models. To this end, we develop methods based on parametric reformulations of the distributionally robust optimization framework. Empirically, we demonstrate on a selection of realistic problems that these approaches yield more robust models. In the third and final part of this thesis, we explore ways of efficiently adapting existing models to new domains or tasks. Our contribution to this topic takes inspiration from information geometry to derive a new gradient update rule which alleviates catastrophic forgetting issues during adaptation.

INFORMS · Performer · Random variable · Optimizer · Generalization theory

December 22, 2020

Disentangled Information Bottleneck

Ziqi Pan,Li Niu,Jianfu Zhang,Liqing Zhang

from arxiv, Revised mathematical proof

The information bottleneck (IB) method is a technique for extracting information that is relevant for predicting the target random variable from the source random variable, which is typically implemented by optimizing the IB Lagrangian that balances the compression and prediction terms. However, the IB Lagrangian is hard to optimize, and multiple trials are required to tune the value of the Lagrangian multiplier. Moreover, we show that the prediction performance strictly decreases as the compression gets stronger while optimizing the IB Lagrangian. In this paper, we implement the IB method from the perspective of supervised disentangling. Specifically, we introduce the Disentangled Information Bottleneck (DisenIB), which is consistent in compressing the source maximally without loss of target prediction performance (maximum compression). Theoretical and experimental results demonstrate that our method is consistent on maximum compression, and performs well in terms of generalization, robustness to adversarial attack, out-of-distribution detection, and supervised disentangling.
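
For reference, one common way to write the IB Lagrangian mentioned above is sketched below. The symbols are assumptions made for illustration and are not taken from the abstract: $X$ is the source, $Y$ the target, $T$ the learned representation of $X$, and $\beta \ge 0$ the Lagrangian multiplier. Other conventions place $\beta$ on the prediction term instead.

```latex
\min_{p(t \mid x)} \; \mathcal{L}_{\text{IB}}
  = \beta \, I(X; T) \;-\; I(T; Y)
```

Here $I(X; T)$ is the compression term and $I(T; Y)$ the prediction term, so each choice of $\beta$ selects a different balance between the two, which is why tuning it typically takes several training runs.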

Machine Learning · Principle · Comprehensibility · Learned · Supervised

November 16, 2020

A Survey on the Explainability of Supervised Machine Learning

Nadia Burkart,Marco F. Huber

from arxiv, Accepted for publication at the Journal of Artificial Intelligence Research (JAIR)

Predictions obtained by, e.g., artificial neural networks have a high accuracy, but humans often perceive the models as black boxes. Insights about the decision making are mostly opaque for humans. Understanding the decision making is of paramount importance, particularly in highly sensitive areas such as healthcare or finance. The decision making behind the black boxes needs to be more transparent, accountable, and understandable for humans. This survey paper provides essential definitions and an overview of the different principles and methodologies of explainable Supervised Machine Learning (SML). We conduct a state-of-the-art survey that reviews past and recent explainable SML approaches and classifies them according to the introduced definitions. Finally, we illustrate principles by means of an explanatory case study and discuss important future directions.

Discretization · Graph · Graphics processing unit · Neural Networks · Networking

March 28, 2019

Learning Discrete Structures for Graph Neural Networks

Luca Franceschi,Mathias Niepert,Massimiliano Pontil,Xiao He

from arxiv, 18 pages

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.

Multimodal · Sentiment analysis · MoDELS · AIM · Tumblr

May 25, 2018

Multimodal Sentiment Analysis To Explore the Structure of Emotions

Anthony Hu,Seth Flaxman

from arxiv, Accepted as a conference paper at KDD 2018

We propose a novel approach to multimodal sentiment analysis using deep neural networks combining visual analysis and natural language processing. Our goal is different than the standard sentiment analysis goal of predicting whether a sentence expresses positive or negative sentiment; instead, we aim to infer the latent emotional state of the user. Thus, we focus on predicting the emotion word tags attached by users to their Tumblr posts, treating these as "self-reported emotions." We demonstrate that our multimodal model combining both text and image features outperforms separate models based solely on either images or text. Our model's results are interpretable, automatically yielding sensible word lists associated with emotions. We explore the structure of emotions implied by our model and compare it to what has been posited in the psychology literature, and validate our model on a set of images that have been used in psychology studies. Finally, our work also provides a useful tool for the growing academic study of images - both photographs and memes - on social networks.