
We study how neural networks trained by gradient descent extrapolate, i.e., what they learn outside the support of the training distribution. Previous works report mixed empirical results when extrapolating with neural networks: while feedforward neural networks, a.k.a. multilayer perceptrons (MLPs), do not extrapolate well in certain simple tasks, Graph Neural Networks (GNNs), structured networks with MLP modules, have shown some success in more complex tasks. Working towards a theoretical explanation, we identify conditions under which MLPs and GNNs extrapolate well. First, we quantify the observation that ReLU MLPs quickly converge to linear functions along any direction from the origin, which implies that ReLU MLPs do not extrapolate most nonlinear functions. However, they can provably learn a linear target function when the training distribution is sufficiently diverse. Second, in connection with analyzing the successes and limitations of GNNs, these results suggest a hypothesis for which we provide theoretical and empirical evidence: the success of GNNs in extrapolating algorithmic tasks to new data (e.g., larger graphs or edge weights) relies on encoding task-specific non-linearities in the architecture or features. Our theoretical analysis builds on a connection between over-parameterized networks and the neural tangent kernel. Empirically, our theory holds across different training settings.
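The claim that ReLU MLPs become linear along rays from the origin can be illustrated with a small sketch. This is not the paper's analysis, which concerns trained over-parameterized networks and the neural tangent kernel. It only shows the simpler fact that a one-hidden-layer ReLU network with random weights is piecewise linear, so past its last kink along a ray it is exactly affine. All names here (`relu_mlp`, `last_kink_along_ray`, the sizes `d` and `h`) are hypothetical and chosen for illustration.

```python
import random


def relu_mlp(x, W1, b1, w2):
    """One-hidden-layer ReLU MLP: f(x) = sum_j w2[j] * relu(W1[j] . x + b1[j])."""
    hidden = [
        max(sum(w * xi for w, xi in zip(row, x)) + b, 0.0)
        for row, b in zip(W1, b1)
    ]
    return sum(a * hj for a, hj in zip(w2, hidden))


def last_kink_along_ray(v, W1, b1):
    """Largest t > 0 where some hidden unit switches on or off along x = t * v."""
    kinks = [0.0]
    for row, b in zip(W1, b1):
        slope = sum(w * vi for w, vi in zip(row, v))
        if slope != 0.0 and -b / slope > 0.0:
            kinks.append(-b / slope)
    return max(kinks)


# Random (untrained) network and a random direction; sizes are arbitrary.
rng = random.Random(0)
d, h = 3, 16
W1 = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(h)]
b1 = [rng.gauss(0, 1) for _ in range(h)]
w2 = [rng.gauss(0, 1) for _ in range(h)]
v = [rng.gauss(0, 1) for _ in range(d)]

# Sample f at equally spaced points beyond the last kink on the ray.
t0 = last_kink_along_ray(v, W1, b1) + 1.0
ts = [t0 + k for k in range(5)]
vals = [relu_mlp([t * vi for vi in v], W1, b1, w2) for t in ts]

# Second differences of an affine function vanish (up to rounding error).
second_diffs = [vals[k + 2] - 2 * vals[k + 1] + vals[k] for k in range(3)]
print(second_diffs)
```

Because the network is affine in `t` beyond `t0`, it cannot follow a target such as a quadratic outside the training range, which is the intuition behind the abstract's statement about nonlinear functions.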

Related content

Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). Neural Networks provides a forum for developing and nurturing an international community of scholars and practitioners who are interested in all aspects of neural networks and related approaches to computational intelligence. Neural Networks welcomes submissions of high-quality papers that contribute to the full range of neural network research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analysis, to engineering and technological applications of systems that make significant use of neural network concepts and techniques. This unique and broad scope promotes the exchange of ideas between biological and technological studies and helps foster the development of an interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the Neural Networks editorial board represents expertise in fields including psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles are published in one of five sections: Cognitive Science, Neuroscience, Learning Systems, Mathematical and Computational Analysis, and Engineering and Applications. Official website:

Deep models trained in supervised mode have achieved remarkable success on a variety of tasks. When labeled samples are limited, self-supervised learning (SSL) is emerging as a new paradigm for making use of large amounts of unlabeled samples. SSL has achieved promising performance on natural language and image learning tasks. Recently, there is a trend to extend such success to graph data using graph neural networks (GNNs). In this survey, we provide a unified review of different ways of training GNNs using SSL. Specifically, we categorize SSL methods into contrastive and predictive models. In either category, we provide a unified framework for methods as well as how these methods differ in each component under the framework. Our unified treatment of SSL methods for GNNs sheds light on the similarities and differences of various methods, setting the stage for developing new methods and algorithms. We also summarize different SSL settings and the corresponding datasets used in each setting. To facilitate methodological development and empirical comparison, we develop a standardized testbed for SSL in GNNs, including implementations of common baseline methods, datasets, and evaluation metrics.

Self-training algorithms, which train a model to fit pseudolabels predicted by another previously-learned model, have been very successful for learning with unlabeled data using neural networks. However, the current theoretical understanding of self-training only applies to linear models. This work provides a unified theoretical analysis of self-training with deep networks for semi-supervised learning, unsupervised domain adaptation, and unsupervised learning. At the core of our analysis is a simple but realistic "expansion" assumption, which states that a low-probability subset of the data must expand to a neighborhood with large probability relative to the subset. We also assume that neighborhoods of examples in different classes have minimal overlap. We prove that under these assumptions, the minimizers of population objectives based on self-training and input-consistency regularization will achieve high accuracy with respect to ground-truth labels. By using off-the-shelf generalization bounds, we immediately convert this result to sample complexity guarantees for neural nets that are polynomial in the margin and Lipschitzness. Our results help explain the empirical successes of recently proposed self-training algorithms which use input consistency regularization.

Perturbations targeting the graph structure have proven to be extremely effective in reducing the performance of Graph Neural Networks (GNNs), and traditional defenses such as adversarial training do not seem to be able to improve robustness. This work is motivated by the observation that adversarially injected edges can effectively be viewed as additional samples to a node's neighborhood aggregation function, which results in distorted aggregations accumulating over the layers. Conventional GNN aggregation functions, such as a sum or mean, can be distorted arbitrarily by a single outlier. We propose a robust aggregation function motivated by the field of robust statistics. Our approach exhibits the largest possible breakdown point of 0.5, which means that the bias of the aggregation is bounded as long as the fraction of adversarial edges of a node is less than 50%. Our novel aggregation function, Soft Medoid, is a fully differentiable generalization of the Medoid and therefore lends itself well to end-to-end deep learning. Equipping a GNN with our aggregation improves the robustness with respect to structure perturbations on Cora ML by a factor of 3 (and 5.5 on Citeseer) and by a factor of 8 for low-degree nodes.
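The contrast between a mean and a medoid under a single outlier can be sketched as follows. This uses the plain (non-differentiable) medoid, i.e. the input point with the smallest total distance to all other points. It is not the Soft Medoid proposed in the abstract, and the toy neighbor features are made up for illustration.

```python
import math


def mean_point(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def medoid(points):
    """Input point minimizing the sum of Euclidean distances to all points."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))


# Hypothetical neighbor features of one node, before and after an
# adversarially inserted edge brings in a single far-away neighbor.
clean = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
attacked = clean + [(1000.0, 1000.0)]

print("mean   clean/attacked:", mean_point(clean), mean_point(attacked))
print("medoid clean/attacked:", medoid(clean), medoid(attacked))
```

In this toy case the mean is dragged far from the clean points by the one injected neighbor, while the medoid stays on a clean point.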

Graph Neural Networks (GNNs) have demonstrated superior performance in many challenging applications, including few-shot learning tasks. Despite their powerful capacity to learn and generalize from few samples, GNNs usually suffer from severe over-fitting and over-smoothing as the model becomes deep, which limits model scalability. In this work, we propose a novel Attentive GNN to tackle these challenges by incorporating a triple-attention mechanism, i.e., node self-attention, neighborhood attention, and layer memory attention. We explain why the proposed attentive modules can improve GNNs for few-shot learning with theoretical analysis and illustrations. Extensive experiments show that the proposed Attentive GNN outperforms the state-of-the-art GNN-based methods for few-shot learning on the mini-ImageNet and Tiered-ImageNet datasets, in both inductive and transductive settings.

Deep learning methods for graphs achieve remarkable performance on many node-level and graph-level prediction tasks. However, despite the proliferation of the methods and their success, prevailing Graph Neural Networks (GNNs) neglect subgraphs, rendering subgraph prediction tasks challenging to tackle in many impactful applications. Further, subgraph prediction tasks present several unique challenges, because subgraphs can have non-trivial internal topology, but also carry a notion of position and external connectivity information relative to the underlying graph in which they exist. Here, we introduce SUB-GNN, a subgraph neural network to learn disentangled subgraph representations. In particular, we propose a novel subgraph routing mechanism that propagates neural messages between the subgraph's components and randomly sampled anchor patches from the underlying graph, yielding highly accurate subgraph representations. SUB-GNN specifies three channels, each designed to capture a distinct aspect of subgraph structure, and we provide empirical evidence that the channels encode their intended properties. We design a series of new synthetic and real-world subgraph datasets. Empirical results for subgraph classification on eight datasets show that SUB-GNN achieves considerable performance gains, outperforming strong baseline methods, including node-level and graph-level GNNs, by 12.4% over the strongest baseline. SUB-GNN performs exceptionally well on challenging biomedical datasets when subgraphs have complex topology and even comprise multiple disconnected components.

Graph neural networks (GNNs) are typically applied to static graphs that are assumed to be known upfront. This static input structure is often informed purely by insight of the machine learning practitioner, and might not be optimal for the actual task the GNN is solving. In absence of reliable domain expertise, one might resort to inferring the latent graph structure, which is often difficult due to the vast search space of possible graphs. Here we introduce Pointer Graph Networks (PGNs) which augment sets or graphs with additional inferred edges for improved model expressivity. PGNs allow each node to dynamically point to another node, followed by message passing over these pointers. The sparsity of this adaptable graph structure makes learning tractable while still being sufficiently expressive to simulate complex algorithms. Critically, the pointing mechanism is directly supervised to model long-term sequences of operations on classical data structures, incorporating useful structural inductive biases from theoretical computer science. Qualitatively, we demonstrate that PGNs can learn parallelisable variants of pointer-based data structures, namely disjoint set unions and link/cut trees. PGNs generalise out-of-distribution to 5x larger test inputs on dynamic graph connectivity tasks, outperforming unrestricted GNNs and Deep Sets.

Graph neural networks (GNNs) are effective machine learning models for various graph learning problems. Despite their empirical successes, the theoretical limitations of GNNs have been revealed recently. Consequently, many GNN models have been proposed to overcome these limitations. In this survey, we provide a comprehensive overview of the expressive power of GNNs and provably powerful variants of GNNs.

Graph Neural Networks (GNNs) for representation learning of graphs broadly follow a neighborhood aggregation framework, where the representation vector of a node is computed by recursively aggregating and transforming feature vectors of its neighboring nodes. Many GNN variants have been proposed and have achieved state-of-the-art results on both node and graph classification tasks. However, despite GNNs revolutionizing graph representation learning, there is limited understanding of their representational properties and limitations. Here, we present a theoretical framework for analyzing the expressive power of GNNs in capturing different graph structures. Our results characterize the discriminative power of popular GNN variants, such as Graph Convolutional Networks and GraphSAGE, and show that they cannot learn to distinguish certain simple graph structures. We then develop a simple architecture that is provably the most expressive among the class of GNNs and is as powerful as the Weisfeiler-Lehman graph isomorphism test. We empirically validate our theoretical findings on a number of graph classification benchmarks, and demonstrate that our model achieves state-of-the-art performance.
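One way to see why some aggregators cannot separate simple structures is to compare them on two neighborhoods that differ only in size. The sketch below is an illustration under stated assumptions, not the paper's framework: `aggregate` is a hypothetical helper, and the two neighbor multisets are toy data in which every neighbor carries the same feature vector.

```python
def aggregate(neighbor_feats, how):
    """Aggregate a multiset of neighbor feature tuples coordinate-wise."""
    columns = list(zip(*neighbor_feats))
    if how == "sum":
        return tuple(sum(c) for c in columns)
    if how == "mean":
        return tuple(sum(c) / len(c) for c in columns)
    if how == "max":
        return tuple(max(c) for c in columns)
    raise ValueError(f"unknown aggregator: {how}")


# Node u has two neighbors and node v has four, all with identical features.
nbrs_u = [(1.0, 0.0)] * 2
nbrs_v = [(1.0, 0.0)] * 4

for how in ("sum", "mean", "max"):
    same = aggregate(nbrs_u, how) == aggregate(nbrs_v, how)
    print(f"{how:>4}: u and v look identical -> {same}")
```

Here mean and max map both neighborhoods to the same vector, so no later layer can tell the two nodes apart from this step alone, whereas sum keeps the neighbor count.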

A fundamental computation for statistical inference and accurate decision-making is to compute the marginal probabilities or most probable states of task-relevant variables. Probabilistic graphical models can efficiently represent the structure of such complex data, but performing these inferences is generally difficult. Message-passing algorithms, such as belief propagation, are a natural way to disseminate evidence amongst correlated variables while exploiting the graph structure, but these algorithms can struggle when the conditional dependency graphs contain loops. Here we use Graph Neural Networks (GNNs) to learn a message-passing algorithm that solves these inference tasks. We first show that the architecture of GNNs is well-matched to inference tasks. We then demonstrate the efficacy of this inference approach by training GNNs on a collection of graphical models and showing that they substantially outperform belief propagation on loopy graphs. Our message-passing algorithms generalize out of the training set to larger graphs and graphs with different structure.

The robust and efficient recognition of visual relations in images is a hallmark of biological vision. Here, we argue that, despite recent progress in visual recognition, modern machine vision algorithms are severely limited in their ability to learn visual relations. Through controlled experiments, we demonstrate that visual-relation problems strain convolutional neural networks (CNNs). The networks eventually break altogether when rote memorization becomes impossible such as when the intra-class variability exceeds their capacity. We further show that another type of feedforward network, called a relational network (RN), which was shown to successfully solve seemingly difficult visual question answering (VQA) problems on the CLEVR datasets, suffers similar limitations. Motivated by the comparable success of biological vision, we argue that feedback mechanisms including working memory and attention are the key computational components underlying abstract visual reasoning.
