唯美清纯另类亚洲一区二区-日日狠狠久久一区二区三区色综

In this short note we formulate a stabilizer formalism in the language of noncommutative graphs. The classes of noncommutative graphs we consider are obtained via unitary representations of compact groups, and suitably chosen operators on finite-dimensional Hilbert spaces. Furthermore, in this framework, we generalize previous results in this area for determining when such noncommutative graphs have anticliques.

相關內容

圖

關注 6

XAI · 查準率/準確率 · 相似度 · 顯著圖 · 泛化理論 ·

2022 年 5 月 17 日

A psychological theory of explainability

Scott Cheng-Hsin Yang,Tomas Folke,Patrick Shafto

from arxiv, 14 pages, 2 figures, ICML (accepted, pre camera-ready version)

The goal of explainable Artificial Intelligence (XAI) is to generate human-interpretable explanations, but there are no computationally precise theories of how humans interpret AI generated explanations. The lack of theory means that validation of XAI must be done empirically, on a case-by-case basis, which prevents systematic theory-building in XAI. We propose a psychological theory of how humans draw conclusions from saliency maps, the most common form of XAI explanation, which for the first time allows for precise prediction of explainee inference conditioned on explanation. Our theory posits that absent explanation humans expect the AI to make similar decisions to themselves, and that they interpret an explanation by comparison to the explanations they themselves would give. Comparison is formalized via Shepard's universal law of generalization in a similarity space, a classic theory from cognitive science. A pre-registered user study on AI image classifications with saliency map explanations demonstrate that our theory quantitatively matches participants' predictions of the AI.

泛化理論 · 黑盒 · 學成 · INFORMS · 監督學習算法 ·

2021 年 10 月 4 日

Information-theoretic generalization bounds for black-box learning algorithms

Hrayr Harutyunyan,Maxim Raginsky,Greg Ver Steeg,Aram Galstyan

from arxiv, NeurIPS 2021

We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.

圖片分類 · 前饋網絡 · INTERACT · Networking · 前饋 ·

2021 年 5 月 7 日

ResMLP: Feedforward networks for image classification with data-efficient training

Hugo Touvron,Piotr Bojanowski,Mathilde Caron,Matthieu Cord,Alaaeldin El-Nouby,Edouard Grave,Armand Joulin,Gabriel Synnaeve,Jakob Verbeek,Hervé Jégou

We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We will share our code based on the Timm library and pre-trained models.

Continuity · state-of-the-art · 學成 · Extensibility · Networking ·

2021 年 4 月 16 日

A continual learning survey: Defying forgetting in classification tasks

Matthias De Lange,Rahaf Aljundi,Marc Masana,Sarah Parisot,Xu Jia,Ales Leonardis,Gregory Slabaugh,Tinne Tuytelaars

from arxiv, Accepted TPAMI paper, including Appendix, code publicly available

Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in a catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny Imagenet and large-scale unbalanced iNaturalist and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage.

過擬合 · SimPLe · Principle · 模型評估 · 統計量 ·

2021 年 3 月 16 日

Deep learning: a statistical viewpoint

Peter L. Bartlett,Andrea Montanari,Alexander Rakhlin

The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting. We survey recent theoretical progress that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behavior of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favorable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.

學成 · 深度學習 · Continuity · 貝葉斯推斷 · Networking ·

2020 年 12 月 20 日

Recent advances in deep learning theory

Fengxiang He,Dacheng Tao

Deep learning is usually described as an experiment-driven field under continuous criticizes of lacking theoretical foundations. This problem has been partially fixed by a large volume of literature which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized in six groups: (1) complexity and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamic systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drives the trajectories of the dynamic systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns in ethics and security and their relationships with generalizability.

哈希學習 · 圖像檢索 · 卷積神經網絡 · 優化器 · 損失函數（機器學習） ·

2020 年 6 月 10 日

A survey on deep hashing for image retrieval

Xiaopeng Zhang

Hashing has been widely used in approximate nearest search for large-scale database retrieval for its computation and storage efficiency. Deep hashing, which devises convolutional neural network architecture to exploit and extract the semantic information or feature of images, has received increasing attention recently. In this survey, several deep supervised hashing methods for image retrieval are evaluated and I conclude three main different directions for deep supervised hashing methods. Several comments are made at the end. Moreover, to break through the bottleneck of the existing hashing methods, I propose a Shadow Recurrent Hashing(SRH) method as a try. Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function to encourage similar images projected close. To this end, I propose a concept: shadow of the CNN output. During optimization process, the CNN output and its shadow are guiding each other so as to achieve the optimal solution as much as possible. Several experiments on dataset CIFAR-10 show the satisfying performance of SRH.

Neural Networks · 優化器 · Networks · 局部極小 · Networking ·

2019 年 12 月 19 日

Optimization for deep learning: theory and algorithms

Ruoyu Sun

from arxiv, 38 pages of main body; 5 pages of appendix; 12 pages of references

When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of undesirable spectrum, and then discuss practical solutions including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods and distributed methods, and theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, lottery ticket hypothesis and infinite-width analysis.

圖形處理器 · 圖 · INTERACT · Performer · Neural Networks ·

2019 年 11 月 6 日

Hyper-SAGNN: a self-attention based graph neural network for hypergraphs

Ruochi Zhang,Yuesong Zou,Jian Ma

Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic for various learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention based graph neural network called Hyper-SAGNN applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving great performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.

FCN · 全卷積網絡 · 3D · 級聯 · MoDELS ·

2018 年 3 月 20 日

An application of cascaded 3D fully convolutional networks for medical image segmentation

Holger R. Roth,Hirohisa Oda,Xiangrong Zhou,Natsuki Shimizu,Ying Yang,Yuichiro Hayashi,Masahiro Oda,Michitaka Fujiwara,Kazunari Misawa,Kensaku Mori

from arxiv, Preprint accepted for publication in Computerized Medical Imaging and Graphics. Substantial extension of arXiv:1704.06382; Corrected references to figure numbers in this version

Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of several anatomical structures (ranging from the large organs to thin vessels) can achieve competitive segmentation results, while avoiding the need for handcrafting features or training class-specific models. To this end, we propose a two-stage, coarse-to-fine approach that will first use a 3D FCN to roughly define a candidate region, which will then be used as input to a second 3D FCN. This reduces the number of voxels the second FCN has to classify to ~10% and allows it to focus on more detailed segmentation of the organs and vessels. We utilize training and validation sets consisting of 331 clinical CT images and test our models on a completely unseen data collection acquired at a different hospital that includes 150 CT scans, targeting three anatomical organs (liver, spleen, and pancreas). In challenging organs such as the pancreas, our cascaded approach improves the mean Dice score from 68.5 to 82.2%, achieving the highest reported average score on this dataset. We compare with a 2D FCN method on a separate dataset of 240 CT scans with 18 classes and achieve a significantly higher performance in small organs and vessels. Furthermore, we explore fine-tuning our models to different datasets. Our experiments illustrate the promise and robustness of current 3D FCN based semantic segmentation of medical images, achieving state-of-the-art results. Our code and trained models are available for download: //github.com/holgerroth/3Dunet_abdomen_cascade.