
In this paper, we study random neural networks, which are single-hidden-layer feedforward neural networks whose weights and biases are randomly initialized. After this random initialization, only the linear readout needs to be trained, which can be performed efficiently, e.g., by the least squares method. By viewing random neural networks as Banach space-valued random variables, we prove their universal approximation properties within suitable Bochner spaces. Here, the corresponding Banach space can be more general than the space of continuous functions on a compact subset of a Euclidean space; it can be, for example, an $L^p$-space or a Sobolev space, where the latter also covers the approximation of derivatives. Moreover, we derive some approximation rates and develop an explicit algorithm to learn a deterministic function by a random neural network. In addition, we provide a full error analysis and study when random neural networks overcome the curse of dimensionality in the sense that the training costs scale at most polynomially in the input and output dimension. Furthermore, we show in two numerical examples the empirical advantages of random neural networks compared to fully trained deterministic neural networks.
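To make the training procedure described above concrete, the following is a minimal sketch (not the paper's implementation) of a single-hidden-layer random network: the hidden weights and biases are sampled once and frozen, and only the linear readout is obtained by least squares. The width, activation, and sampling distribution are illustrative assumptions.

```python
import numpy as np

def fit_random_network(X, Y, width=256, rng=None):
    """Fit a random network: hidden weights and biases are drawn at random and
    kept fixed; only the linear readout is solved by least squares."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    W = rng.standard_normal((d, width))   # random, untrained hidden weights
    b = rng.standard_normal(width)        # random, untrained biases
    H = np.tanh(X @ W + b)                # random features
    A, _, _, _ = np.linalg.lstsq(H, Y, rcond=None)   # linear readout
    return W, b, A

def predict(X, W, b, A):
    return np.tanh(X @ W + b) @ A

# Toy usage: approximate f(x) = sin(x) on [-pi, pi]
X = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
Y = np.sin(X)
W, b, A = fit_random_network(X, Y, width=256, rng=0)
print(np.max(np.abs(predict(X, W, b, A) - Y)))   # should be a small approximation error
```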

Related Content

Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). Neural Networks provides a forum for developing and nurturing an international community of scholars and practitioners interested in all aspects of neural networks and related approaches to computational intelligence. Neural Networks welcomes submissions of high-quality papers that contribute to the full range of neural networks research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analysis, to systems engineering and technological applications that make substantial use of neural network concepts and techniques. This unique and broad scope promotes the exchange of ideas between biological and technological research and helps foster the development of an interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the Neural Networks editorial board represents expertise in psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles are published in one of five sections: Cognitive Science, Neuroscience, Learning Systems, Mathematical and Computational Analysis, Engineering and Applications. Official website:

In its quest for approaches to taming uncertainty in self-adaptive systems (SAS), the research community has largely focused on solutions that adapt the SAS architecture or behaviour in response to uncertainty. By comparison, solutions that reduce the uncertainty affecting SAS (other than through the blanket monitoring of their components and environment) remain underexplored. Our paper proposes a more nuanced, adaptive approach to SAS uncertainty reduction. To that end, we introduce a SAS architecture comprising an uncertainty reduction controller that drives the adaptive acquisition of new information within the SAS adaptation loop, and a tool-supported method that uses probabilistic model checking to synthesise such controllers. The controllers generated by our method deliver optimal trade-offs between SAS uncertainty reduction benefits and new information acquisition costs. We illustrate the use and evaluate the effectiveness of our approach for mobile robot navigation and server infrastructure management SAS.

In this paper, we provide a theoretical study of noise geometry for minibatch stochastic gradient descent (SGD), a phenomenon where noise aligns favorably with the geometry of local landscape. We propose two metrics, derived from analyzing how noise influences the loss and subspace projection dynamics, to quantify the alignment strength. We show that for (over-parameterized) linear models and two-layer nonlinear networks, when measured by these metrics, the alignment can be provably guaranteed under conditions independent of the degree of over-parameterization. To showcase the utility of our noise geometry characterizations, we present a refined analysis of the mechanism by which SGD escapes from sharp minima. We reveal that unlike gradient descent (GD), which escapes along the sharpest directions, SGD tends to escape from flatter directions and cyclical learning rates can exploit this SGD characteristic to navigate more effectively towards flatter regions. Lastly, extensive experiments are provided to support our theoretical findings.

Convolutional neural networks and vision transformers have achieved outstanding performance in machine perception, particularly for image classification. Although these image classifiers excel at predicting image-level class labels, they may not discriminate missing or shifted parts within an object. As a result, they may fail to detect corrupted images that involve missing or disarrayed semantic information in the object composition. On the contrary, human perception easily distinguishes such corruptions. To mitigate this gap, we introduce the concept of "image grammar", consisting of "image semantics" and "image syntax", to denote the semantics of parts or patches of an image and the order in which these parts are arranged to create a meaningful object. To learn the image grammar relative to a class of visual objects/scenes, we propose a weakly supervised two-stage approach. In the first stage, we use a deep clustering framework that relies on iterative clustering and feature refinement to produce part-semantic segmentation. In the second stage, we incorporate a recurrent bi-LSTM module to process a sequence of semantic segmentation patches to capture the image syntax. Our framework is trained to reason over patch semantics and detect faulty syntax. We benchmark the performance of several grammar learning models in detecting patch corruptions. Finally, we verify the capabilities of our framework on the Celeb and SUNRGBD datasets and demonstrate that it can achieve a grammar validation accuracy of 70 to 90% in a wide variety of semantic and syntactic corruption scenarios.
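The second stage described above lends itself to a compact illustration. Below is a hedged sketch, assuming patch semantics have already been extracted as integer part labels, of a bi-LSTM module that scores whether a sequence of patch labels forms valid image syntax; the module name, vocabulary size, and dimensions are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class SyntaxValidator(nn.Module):
    """Toy stand-in for the second stage: a bi-LSTM reads the sequence of
    patch semantic labels and scores whether their arrangement is plausible."""
    def __init__(self, num_parts=32, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(num_parts, embed_dim)   # patch semantics -> vectors
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden_dim, 1)           # valid-vs-corrupted logit

    def forward(self, patch_labels):                       # (batch, seq_len) part labels
        h, _ = self.lstm(self.embed(patch_labels))
        return self.head(h[:, -1])                         # logit for "syntactically valid"

# Usage with dummy data: a 7x7 grid of patch labels per image
model = SyntaxValidator()
patches = torch.randint(0, 32, (4, 49))
print(model(patches).shape)   # torch.Size([4, 1])
```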

In this paper, we prove the universal consistency of wide and deep ReLU neural network classifiers trained on the logistic loss. We also give sufficient conditions for a class of probability measures for which classifiers based on neural networks achieve minimax optimal rates of convergence. The result applies to a wide range of known function classes. In particular, while most previous works impose explicit smoothness assumptions on the regression function, our framework encompasses more general settings. The proposed neural networks are either the minimizers of the logistic loss or the $0$-$1$ loss. In the former case, they are interpolating classifiers that exhibit a benign overfitting behavior.

This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio of the network controlling the deviations from the infinite-width Gaussian description. We explain how these effectively-deep networks learn nontrivial representations from training and more broadly analyze the mechanism of representation learning for nonlinear models. From a nearly-kernel-methods perspective, we find that the dependence of such models' predictions on the underlying learning algorithm can be expressed in a simple and universal way. To obtain these results, we develop the notion of representation group flow (RG flow) to characterize the propagation of signals through the network. By tuning networks to criticality, we give a practical solution to the exploding and vanishing gradient problem. We further explain how RG flow leads to near-universal behavior and lets us categorize networks built from different activation functions into universality classes. Altogether, we show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks. By using information-theoretic techniques, we estimate the optimal aspect ratio at which we expect the network to be practically most useful and show how residual connections can be used to push this scale to arbitrary depths. With these tools, we can learn in detail about the inductive bias of architectures, hyperparameters, and optimizers.

We consider the problem of explaining the predictions of graph neural networks (GNNs), which otherwise are considered as black boxes. Existing methods invariably focus on explaining the importance of graph nodes or edges but ignore the substructures of graphs, which are more intuitive and human-intelligible. In this work, we propose a novel method, known as SubgraphX, to explain GNNs by identifying important subgraphs. Given a trained GNN model and an input graph, our SubgraphX explains its predictions by efficiently exploring different subgraphs with Monte Carlo tree search. To make the tree search more effective, we propose to use Shapley values as a measure of subgraph importance, which can also capture the interactions among different subgraphs. To expedite computations, we propose efficient approximation schemes to compute Shapley values for graph data. Our work represents the first attempt to explain GNNs via identifying subgraphs explicitly and directly. Experimental results show that our SubgraphX achieves significantly improved explanations, while keeping computations at a reasonable level.
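To make the subgraph-scoring idea concrete, here is a hedged sketch of a generic Monte Carlo Shapley estimator for a set of nodes, treating the candidate subgraph as a single player and evaluating the model with the remaining nodes masked; the `score_fn` interface and the sampling scheme are illustrative assumptions, not SubgraphX's exact approximation scheme.

```python
import numpy as np

def shapley_subgraph(score_fn, subgraph_nodes, other_nodes, num_samples=200, rng=None):
    """Monte Carlo estimate of a Shapley-style importance score for a subgraph.
    score_fn(kept_nodes) returns the model output when only kept_nodes are
    retained (all other nodes masked)."""
    rng = np.random.default_rng(rng)
    others = list(other_nodes)
    total = 0.0
    for _ in range(num_samples):
        # Sample a coalition of the other nodes: size uniform in {0,...,n},
        # then a uniform subset of that size (permutation-sampling estimator).
        k = rng.integers(0, len(others) + 1)
        coalition = list(rng.choice(others, size=k, replace=False))
        # Marginal contribution of the subgraph to this coalition
        total += score_fn(coalition + list(subgraph_nodes)) - score_fn(coalition)
    return total / num_samples

# Toy usage: the "model" simply counts how many of nodes {0, 1, 2} are kept
score_fn = lambda kept: sum(n in kept for n in (0, 1, 2)) / 3.0
print(shapley_subgraph(score_fn, subgraph_nodes=[0, 1], other_nodes=[2, 3, 4], rng=0))
```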

The attention model has become an important concept in neural networks and has been studied across diverse application domains. This survey provides a structured and comprehensive overview of the developments in modeling attention. In particular, we propose a taxonomy which groups existing techniques into coherent categories. We review salient neural architectures in which attention has been incorporated, and discuss applications in which modeling attention has had a significant impact. Finally, we also describe how attention has been used to improve the interpretability of neural networks. We hope this survey will provide a succinct introduction to attention models and guide practitioners in developing approaches for their applications.
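As a concrete reference point, the following minimal sketch shows one widely used instantiation of attention, scaled dot-product attention over a set of key-value pairs; it is a generic illustration, not tied to any particular architecture covered in the survey, and the shapes are illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Queries attend to keys; the resulting weights mix the values.
    Shapes: Q (n_q, d), K (n_k, d), V (n_k, d_v)."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])                  # compatibility scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)           # softmax over keys
    return weights @ V, weights

rng = np.random.default_rng(0)
out, attn = scaled_dot_product_attention(rng.normal(size=(2, 8)),
                                         rng.normal(size=(5, 8)),
                                         rng.normal(size=(5, 16)))
print(out.shape, attn.shape)   # (2, 16) (2, 5)
```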

Recently, graph neural networks (GNNs) have revolutionized the field of graph representation learning through effectively learned node embeddings, and achieved state-of-the-art results in tasks such as node classification and link prediction. However, current GNN methods are inherently flat and do not learn hierarchical representations of graphs---a limitation that is especially problematic for the task of graph classification, where the goal is to predict the label associated with an entire graph. Here we propose DiffPool, a differentiable graph pooling module that can generate hierarchical representations of graphs and can be combined with various graph neural network architectures in an end-to-end fashion. DiffPool learns a differentiable soft cluster assignment for nodes at each layer of a deep GNN, mapping nodes to a set of clusters, which then form the coarsened input for the next GNN layer. Our experimental results show that combining existing GNN methods with DiffPool yields an average improvement of 5-10% accuracy on graph classification benchmarks, compared to all existing pooling approaches, achieving a new state-of-the-art on four out of five benchmark data sets.
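The pooling algebra behind DiffPool can be sketched compactly. In the snippet below, which is an illustrative sketch rather than the authors' implementation, the embedding and assignment GNNs are replaced by random linear maps; one layer pools node features via X' = SᵀZ and the adjacency via A' = SᵀAS, where S is a row-wise softmax soft cluster assignment.

```python
import numpy as np

def diffpool_layer(A, X, num_clusters, rng=None):
    """Minimal sketch of one DiffPool step. A real implementation computes the
    embeddings Z and assignments S with trained GNNs; random linear maps are
    used here as placeholders to show the pooling algebra."""
    rng = np.random.default_rng(rng)
    n, d = X.shape
    Z = A @ X @ rng.standard_normal((d, d))                 # placeholder embedding GNN
    S = A @ X @ rng.standard_normal((d, num_clusters))      # placeholder assignment GNN
    S = np.exp(S - S.max(axis=1, keepdims=True))
    S /= S.sum(axis=1, keepdims=True)                       # row-wise softmax: soft clusters
    X_coarse = S.T @ Z                                       # pooled node features
    A_coarse = S.T @ A @ S                                   # pooled (coarsened) adjacency
    return A_coarse, X_coarse

# Toy usage: pool a 6-node graph down to 2 clusters
A = (np.random.default_rng(1).random((6, 6)) > 0.5).astype(float)
A = np.maximum(A, A.T)                                       # symmetrize the adjacency
A_c, X_c = diffpool_layer(A, np.eye(6), num_clusters=2, rng=0)
print(A_c.shape, X_c.shape)                                  # (2, 2) (2, 6)
```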

In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects. First, a reattention mechanism is proposed to refine current attentions by directly accessing past attentions that are temporally memorized in a multi-round alignment architecture, so as to avoid the problems of attention redundancy and attention deficiency. Second, a new optimization approach, called dynamic-critical reinforcement learning, is introduced to extend the standard supervised method. It always encourages the model to predict a more acceptable answer, addressing the convergence suppression problem that occurs in traditional reinforcement learning algorithms. Extensive experiments on the Stanford Question Answering Dataset (SQuAD) show that our model achieves state-of-the-art results. Meanwhile, our model outperforms previous systems by over 6% in terms of both Exact Match and F1 metrics on two adversarial SQuAD datasets.

This paper proposes a method to modify traditional convolutional neural networks (CNNs) into interpretable CNNs, in order to clarify knowledge representations in the high conv-layers of a CNN. In an interpretable CNN, each filter in a high conv-layer represents a specific object part. We do not need any annotations of object parts or textures to supervise the learning process. Instead, the interpretable CNN automatically assigns each filter in a high conv-layer to an object part during the learning process. Our method can be applied to different types of CNNs with different structures. The clear knowledge representation in an interpretable CNN can help people understand the logic inside a CNN, i.e., which patterns the CNN bases its decisions on. Experiments showed that filters in an interpretable CNN were more semantically meaningful than those in traditional CNNs.
