
In this paper, we study random neural networks, which are single-hidden-layer feedforward neural networks whose weights and biases are randomly initialized. After this random initialization, only the linear readout needs to be trained, which can be performed efficiently, e.g., by the least squares method. By viewing random neural networks as Banach space-valued random variables, we prove their universal approximation properties within suitable Bochner spaces. Here, the corresponding Banach space can be more general than the space of continuous functions on a compact subset of a Euclidean space; it can be, for example, an $L^p$-space or a Sobolev space, where the latter also covers the approximation of derivatives. Moreover, we derive some approximation rates and develop an explicit algorithm to learn a deterministic function by a random neural network. In addition, we provide a full error analysis and study when random neural networks overcome the curse of dimensionality in the sense that the training costs scale at most polynomially in the input and output dimension. Furthermore, we show in two numerical examples the empirical advantages of random neural networks compared to fully trained deterministic neural networks.
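To make the training procedure described above concrete, the following is a minimal sketch (not the paper's implementation) of a single-hidden-layer random network: the hidden weights and biases are sampled once and frozen, and only the linear readout is obtained by least squares. The width, activation, and sampling distribution are illustrative assumptions.

```python
import numpy as np

def fit_random_network(X, Y, width=256, rng=None):
    """Fit a random network: hidden weights and biases are drawn at random and
    kept fixed; only the linear readout is solved by least squares."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    W = rng.standard_normal((d, width))   # random, untrained hidden weights
    b = rng.standard_normal(width)        # random, untrained biases
    H = np.tanh(X @ W + b)                # random features
    A, _, _, _ = np.linalg.lstsq(H, Y, rcond=None)   # linear readout
    return W, b, A

def predict(X, W, b, A):
    return np.tanh(X @ W + b) @ A

# Toy usage: approximate f(x) = sin(x) on [-pi, pi]
X = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
Y = np.sin(X)
W, b, A = fit_random_network(X, Y, width=256, rng=0)
print(np.max(np.abs(predict(X, W, b, A) - Y)))   # should be a small approximation error
```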

Related Content

Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). Neural Networks provides a forum for developing and nurturing an international community of scholars and practitioners interested in all aspects of neural networks and related approaches to computational intelligence. Neural Networks welcomes submissions of high-quality papers that contribute to the full range of neural networks research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analysis, to systems engineering and technological applications that make substantial use of neural network concepts and techniques. This unique and broad scope promotes the exchange of ideas between biological and technological research and helps foster the development of an interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the Neural Networks editorial board represents expertise in psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles are published in one of five sections: Cognitive Science, Neuroscience, Learning Systems, Mathematical and Computational Analysis, Engineering and Applications. Official website:

In its quest for approaches to taming uncertainty in self-adaptive systems (SAS), the research community has largely focused on solutions that adapt the SAS architecture or behaviour in response to uncertainty. By comparison, solutions that reduce the uncertainty affecting SAS (other than through the blanket monitoring of their components and environment) remain underexplored. Our paper proposes a more nuanced, adaptive approach to SAS uncertainty reduction. To that end, we introduce a SAS architecture comprising an uncertainty reduction controller that drives the adaptive acquisition of new information within the SAS adaptation loop, and a tool-supported method that uses probabilistic model checking to synthesise such controllers. The controllers generated by our method deliver optimal trade-offs between SAS uncertainty reduction benefits and new information acquisition costs. We illustrate the use and evaluate the effectiveness of our approach for mobile robot navigation and server infrastructure management SAS.

In this paper, we provide a theoretical study of noise geometry for minibatch stochastic gradient descent (SGD), a phenomenon where noise aligns favorably with the geometry of local landscape. We propose two metrics, derived from analyzing how noise influences the loss and subspace projection dynamics, to quantify the alignment strength. We show that for (over-parameterized) linear models and two-layer nonlinear networks, when measured by these metrics, the alignment can be provably guaranteed under conditions independent of the degree of over-parameterization. To showcase the utility of our noise geometry characterizations, we present a refined analysis of the mechanism by which SGD escapes from sharp minima. We reveal that unlike gradient descent (GD), which escapes along the sharpest directions, SGD tends to escape from flatter directions and cyclical learning rates can exploit this SGD characteristic to navigate more effectively towards flatter regions. Lastly, extensive experiments are provided to support our theoretical findings.

Convolutional neural networks and vision transformers have achieved outstanding performance in machine perception, particularly for image classification. Although these image classifiers excel at predicting image-level class labels, they may not discriminate missing or shifted parts within an object. As a result, they may fail to detect corrupted images that involve missing or disarrayed semantic information in the object composition. On the contrary, human perception easily distinguishes such corruptions. To mitigate this gap, we introduce the concept of "image grammar", consisting of "image semantics" and "image syntax", to denote the semantics of parts or patches of an image and the order in which these parts are arranged to create a meaningful object. To learn the image grammar relative to a class of visual objects/scenes, we propose a weakly supervised two-stage approach. In the first stage, we use a deep clustering framework that relies on iterative clustering and feature refinement to produce part-semantic segmentation. In the second stage, we incorporate a recurrent bi-LSTM module to process a sequence of semantic segmentation patches to capture the image syntax. Our framework is trained to reason over patch semantics and detect faulty syntax. We benchmark the performance of several grammar learning models in detecting patch corruptions. Finally, we verify the capabilities of our framework on the Celeb and SUNRGBD datasets and demonstrate that it can achieve a grammar validation accuracy of 70 to 90% in a wide variety of semantic and syntactic corruption scenarios.
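The second stage described above lends itself to a compact illustration. Below is a hedged sketch, assuming patch semantics have already been extracted as integer part labels, of a bi-LSTM module that scores whether a sequence of patch labels forms valid image syntax; the module name, vocabulary size, and dimensions are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class SyntaxValidator(nn.Module):
    """Toy stand-in for the second stage: a bi-LSTM reads the sequence of
    patch semantic labels and scores whether their arrangement is plausible."""
    def __init__(self, num_parts=32, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(num_parts, embed_dim)   # patch semantics -> vectors
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden_dim, 1)           # valid-vs-corrupted logit

    def forward(self, patch_labels):                       # (batch, seq_len) part labels
        h, _ = self.lstm(self.embed(patch_labels))
        return self.head(h[:, -1])                         # logit for "syntactically valid"

# Usage with dummy data: a 7x7 grid of patch labels per image
model = SyntaxValidator()
patches = torch.randint(0, 32, (4, 49))
print(model(patches).shape)   # torch.Size([4, 1])
```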

In this paper, we prove the universal consistency of wide and deep ReLU neural network classifiers trained on the logistic loss. We also give sufficient conditions for a class of probability measures for which classifiers based on neural networks achieve minimax optimal rates of convergence. The result applies to a wide range of known function classes. In particular, while most previous works impose explicit smoothness assumptions on the regression function, our framework encompasses more general settings. The proposed neural networks are either the minimizers of the logistic loss or the $0$-$1$ loss. In the former case, they are interpolating classifiers that exhibit a benign overfitting behavior.

This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio of the network controlling the deviations from the infinite-width Gaussian description. We explain how these effectively-deep networks learn nontrivial representations from training and more broadly analyze the mechanism of representation learning for nonlinear models. From a nearly-kernel-methods perspective, we find that the dependence of such models' predictions on the underlying learning algorithm can be expressed in a simple and universal way. To obtain these results, we develop the notion of representation group flow (RG flow) to characterize the propagation of signals through the network. By tuning networks to criticality, we give a practical solution to the exploding and vanishing gradient problem. We further explain how RG flow leads to near-universal behavior and lets us categorize networks built from different activation functions into universality classes. Altogether, we show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks. By using information-theoretic techniques, we estimate the optimal aspect ratio at which we expect the network to be practically most useful and show how residual connections can be used to push this scale to arbitrary depths. With these tools, we can learn in detail about the inductive bias of architectures, hyperparameters, and optimizers.

We consider the problem of explaining the predictions of graph neural networks (GNNs), which otherwise are considered as black boxes. Existing methods invariably focus on explaining the importance of graph nodes or edges but ignore the substructures of graphs, which are more intuitive and human-intelligible. In this work, we propose a novel method, known as SubgraphX, to explain GNNs by identifying important subgraphs. Given a trained GNN model and an input graph, our SubgraphX explains its predictions by efficiently exploring different subgraphs with Monte Carlo tree search. To make the tree search more effective, we propose to use Shapley values as a measure of subgraph importance, which can also capture the interactions among different subgraphs. To expedite computations, we propose efficient approximation schemes to compute Shapley values for graph data. Our work represents the first attempt to explain GNNs via identifying subgraphs explicitly and directly. Experimental results show that our SubgraphX achieves significantly improved explanations, while keeping computations at a reasonable level.
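To make the subgraph-scoring idea concrete, here is a hedged sketch of a generic Monte Carlo Shapley estimator for a set of nodes, treating the candidate subgraph as a single player and evaluating the model with the remaining nodes masked; the `score_fn` interface and the sampling scheme are illustrative assumptions, not SubgraphX's exact approximation scheme.

```python
import numpy as np

def shapley_subgraph(score_fn, subgraph_nodes, other_nodes, num_samples=200, rng=None):
    """Monte Carlo estimate of a Shapley-style importance score for a subgraph.
    score_fn(kept_nodes) returns the model output when only kept_nodes are
    retained (all other nodes masked)."""
    rng = np.random.default_rng(rng)
    others = list(other_nodes)
    total = 0.0
    for _ in range(num_samples):
        # Sample a coalition of the other nodes: size uniform in {0,...,n},
        # then a uniform subset of that size (permutation-sampling estimator).
        k = rng.integers(0, len(others) + 1)
        coalition = list(rng.choice(others, size=k, replace=False))
        # Marginal contribution of the subgraph to this coalition
        total += score_fn(coalition + list(subgraph_nodes)) - score_fn(coalition)
    return total / num_samples

# Toy usage: the "model" simply counts how many of nodes {0, 1, 2} are kept
score_fn = lambda kept: sum(n in kept for n in (0, 1, 2)) / 3.0
print(shapley_subgraph(score_fn, subgraph_nodes=[0, 1], other_nodes=[2, 3, 4], rng=0))
```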

The attention model has become an important concept in neural networks and has been studied across diverse application domains. This survey provides a structured and comprehensive overview of the developments in modeling attention. In particular, we propose a taxonomy which groups existing techniques into coherent categories. We review salient neural architectures in which attention has been incorporated, and discuss applications in which modeling attention has had a significant impact. Finally, we also describe how attention has been used to improve the interpretability of neural networks. We hope this survey will provide a succinct introduction to attention models and guide practitioners in developing approaches for their applications.
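As a concrete reference point, the following minimal sketch shows one widely used instantiation of attention, scaled dot-product attention over a set of key-value pairs; it is a generic illustration, not tied to any particular architecture covered in the survey, and the shapes are illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Queries attend to keys; the resulting weights mix the values.
    Shapes: Q (n_q, d), K (n_k, d), V (n_k, d_v)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])                  # compatibility scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)           # softmax over keys
    return weights @ V, weights

rng = np.random.default_rng(0)
out, attn = scaled_dot_product_attention(rng.normal(size=(2, 8)),
                                         rng.normal(size=(5, 8)),
                                         rng.normal(size=(5, 16)))
print(out.shape, attn.shape)   # (2, 16) (2, 5)
```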

Recently, graph neural networks (GNNs) have revolutionized the field of graph representation learning through effectively learned node embeddings, and achieved state-of-the-art results in tasks such as node classification and link prediction. However, current GNN methods are inherently flat and do not learn hierarchical representations of graphs---a limitation that is especially problematic for the task of graph classification, where the goal is to predict the label associated with an entire graph. Here we propose DiffPool, a differentiable graph pooling module that can generate hierarchical representations of graphs and can be combined with various graph neural network architectures in an end-to-end fashion. DiffPool learns a differentiable soft cluster assignment for nodes at each layer of a deep GNN, mapping nodes to a set of clusters, which then form the coarsened input for the next GNN layer. Our experimental results show that combining existing GNN methods with DiffPool yields an average improvement of 5-10% accuracy on graph classification benchmarks, compared to all existing pooling approaches, achieving a new state-of-the-art on four out of five benchmark data sets.
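The pooling algebra behind DiffPool can be sketched compactly. In the snippet below, which is an illustrative sketch rather than the authors' implementation, the embedding and assignment GNNs are replaced by random linear maps; one layer pools node features via X' = SᵀZ and the adjacency via A' = SᵀAS, where S is a row-wise softmax soft cluster assignment.

```python
import numpy as np

def diffpool_layer(A, X, num_clusters, rng=None):
    """Minimal sketch of one DiffPool step. A real implementation computes the
    embeddings Z and assignments S with trained GNNs; random linear maps are
    used here as placeholders to show the pooling algebra."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    Z = A @ X @ rng.standard_normal((d, d))                 # placeholder embedding GNN
    S = A @ X @ rng.standard_normal((d, num_clusters))      # placeholder assignment GNN
    S = np.exp(S - S.max(axis=1, keepdims=True))
    S /= S.sum(axis=1, keepdims=True)                       # row-wise softmax: soft clusters
    X_coarse = S.T @ Z                                       # pooled node features
    A_coarse = S.T @ A @ S                                   # pooled (coarsened) adjacency
    return A_coarse, X_coarse

# Toy usage: pool a 6-node graph down to 2 clusters
A = (np.random.default_rng(1).random((6, 6)) > 0.5).astype(float)
A = np.maximum(A, A.T)                                       # symmetrize the adjacency
A_c, X_c = diffpool_layer(A, np.eye(6), num_clusters=2, rng=0)
print(A_c.shape, X_c.shape)                                  # (2, 2) (2, 6)
```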

In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects. First, a reattention mechanism is proposed to refine current attentions by directly accessing past attentions that are temporally memorized in a multi-round alignment architecture, so as to avoid the problems of attention redundancy and attention deficiency. Second, a new optimization approach, called dynamic-critical reinforcement learning, is introduced to extend the standard supervised method. It always encourages the model to predict a more acceptable answer, addressing the convergence suppression problem that occurs in traditional reinforcement learning algorithms. Extensive experiments on the Stanford Question Answering Dataset (SQuAD) show that our model achieves state-of-the-art results. Meanwhile, our model outperforms previous systems by over 6% in terms of both Exact Match and F1 metrics on two adversarial SQuAD datasets.

This paper proposes a method to modify traditional convolutional neural networks (CNNs) into interpretable CNNs, in order to clarify knowledge representations in the high conv-layers of a CNN. In an interpretable CNN, each filter in a high conv-layer represents a specific object part. We do not need any annotations of object parts or textures to supervise the learning process. Instead, the interpretable CNN automatically assigns each filter in a high conv-layer to an object part during the learning process. Our method can be applied to different types of CNNs with different structures. The clear knowledge representation in an interpretable CNN can help people understand the logic inside a CNN, i.e., which patterns the CNN bases its decisions on. Experiments showed that filters in an interpretable CNN were more semantically meaningful than those in traditional CNNs.
