Spectral analysis is a powerful tool, decomposing any function into simpler parts. In machine learning, Mercer's theorem generalizes this idea, providing for any kernel and input distribution a natural basis of functions of increasing frequency. More recently, several works have extended this analysis to deep neural networks through the framework of the Neural Tangent Kernel (NTK). In this work, we analyze the layer-wise spectral bias of Deep Neural Networks and relate it to the contributions of different layers to the reduction of generalization error for a given target function. We utilize the properties of Hermite polynomials and spherical harmonics to prove that initial layers exhibit a larger bias towards high-frequency functions defined on the unit sphere. We further provide empirical results validating our theory for Deep Neural Networks on high-dimensional datasets.
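As a rough illustration of layer-wise spectral bias, the sketch below estimates each layer's empirical NTK Gram matrix for a small ReLU MLP on the unit circle and compares low- versus high-frequency energy in the resulting kernels. The network sizes, the choice of the circle as input domain, and the Fourier-energy ratio are illustrative assumptions, not the paper's construction.

```python
import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)

# Inputs on the unit circle, so "frequency" is just the Fourier mode index.
n = 128
theta = torch.linspace(0, 2 * np.pi, n + 1)[:-1]
X = torch.stack([torch.cos(theta), torch.sin(theta)], dim=1)

# Small ReLU MLP; widths and depth are illustrative choices, not the paper's setting.
net = nn.Sequential(nn.Linear(2, 256), nn.ReLU(),
                    nn.Linear(256, 256), nn.ReLU(),
                    nn.Linear(256, 1))
linear_layers = [m for m in net if isinstance(m, nn.Linear)]

def layer_gradients(x):
    """Gradient of the scalar output w.r.t. each layer's parameters, flattened."""
    out = net(x.unsqueeze(0)).squeeze()
    rows = []
    for m in linear_layers:
        g = torch.autograd.grad(out, list(m.parameters()), retain_graph=True)
        rows.append(torch.cat([t.reshape(-1) for t in g]))
    return rows

grads = [layer_gradients(x) for x in X]
for l in range(len(linear_layers)):
    J = torch.stack([grads[i][l] for i in range(n)])      # n x (#params in layer l)
    K = (J @ J.T).detach().numpy()                        # layer-l empirical NTK Gram matrix
    spec = np.abs(np.fft.rfft(K, axis=0)).mean(axis=1)    # frequency profile of the kernel
    ratio = spec[1:6].sum() / max(spec[6:32].sum(), 1e-12)
    print(f"layer {l}: low-frequency / high-frequency energy = {ratio:.2f}")
```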
In this work, we explore the approximation capability of deep Rectified Quadratic Unit (ReQU) neural networks for H\"older-regular functions, with respect to the uniform norm. We find that the attainable theoretical approximation rates depend heavily on the activation function selected for the network.
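For concreteness, here is a minimal sketch of the Rectified Quadratic Unit activation, assuming the standard definition ReQU(x) = max(0, x)^2, wired into a small fully connected network; the widths and depth are illustrative, not the construction analyzed above.

```python
import torch
import torch.nn as nn

class ReQU(nn.Module):
    """Rectified Quadratic Unit: x -> max(0, x)^2."""
    def forward(self, x):
        return torch.clamp(x, min=0.0) ** 2

def requ_mlp(d_in, width=64, depth=4):
    """A small deep ReQU network; width/depth are illustrative, not the paper's construction."""
    layers, d = [], d_in
    for _ in range(depth):
        layers += [nn.Linear(d, width), ReQU()]
        d = width
    layers.append(nn.Linear(d, 1))
    return nn.Sequential(*layers)

net = requ_mlp(d_in=3)
print(net(torch.randn(5, 3)).shape)  # torch.Size([5, 1])
```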
We study the problem of matrix completion in this paper. A spectral scaled Student prior is exploited to favour the underlying low-rank structure of the data matrix. We provide a thorough theoretical investigation of our approach through PAC-Bayesian bounds. More precisely, our PAC-Bayesian approach enjoys a minimax-optimal oracle inequality, which guarantees that our method works well under model misspecification and under a general sampling distribution. Interestingly, we also provide efficient gradient-based sampling implementations for our approach by using Langevin Monte Carlo. In particular, we show that our algorithms are significantly faster than the Gibbs sampler for this problem. To illustrate the attractive features of our inference strategy, some numerical simulations are conducted and an application to image inpainting is demonstrated.
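To make the sampling idea concrete, the sketch below runs an unadjusted Langevin sampler over low-rank factors on a toy completion problem. The isotropic Gaussian prior, the factorized parameterization, and the step size are placeholder assumptions; in particular, this is not the spectral scaled Student prior used above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy observed matrix: rank-2 ground truth with a random observation mask.
m, n, k = 50, 40, 2
M_true = rng.normal(size=(m, k)) @ rng.normal(size=(k, n))
mask = rng.random((m, n)) < 0.3
Y = np.where(mask, M_true + 0.1 * rng.normal(size=(m, n)), 0.0)

# Unadjusted Langevin over factors (U, V); the isotropic Gaussian prior here is a
# placeholder, not the spectral scaled Student prior of the paper.
r, sigma2, lam, step, iters = 5, 0.01, 1.0, 1e-4, 5000
U = 0.1 * rng.normal(size=(m, r))
V = 0.1 * rng.normal(size=(n, r))

for t in range(iters):
    R = mask * (U @ V.T - Y)                       # residual on observed entries
    gU = R @ V / sigma2 + lam * U                  # gradient of the negative log-posterior
    gV = R.T @ U / sigma2 + lam * V
    U += -step * gU + np.sqrt(2 * step) * rng.normal(size=U.shape)
    V += -step * gV + np.sqrt(2 * step) * rng.normal(size=V.shape)

rmse = np.sqrt(((U @ V.T - M_true)[~mask] ** 2).mean())
print("held-out RMSE of one posterior sample:", round(rmse, 3))
```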
In this article, we study curvature-like feature values of data sets in Euclidean spaces. First, we formulate such curvature functions with desirable properties under the manifold hypothesis. Then we devise a test, based on the law of large numbers, for the validity of a curvature function, and verify by numerical experiments that the function we construct passes it. These experiments also suggest the conjecture that the mean of the curvature of sample manifolds coincides with the curvature of the mean manifold. Our construction is based on dimension estimation by principal component analysis and on the Gaussian curvature of hypersurfaces. Our function depends on provisional parameters $\varepsilon, \delta$, and we suggest treating the result as a function of these parameters to gain some robustness. As an application, we propose a method to decompose data sets into parts reflecting local structure: we embed the data sets into a higher-dimensional Euclidean space using curvature values and cluster them in the embedding space. We also give computational experiments that support the effectiveness of our method.
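The sketch below illustrates the general pipeline with a crude stand-in: a local-PCA-based curvature-like score per point, appended as an extra coordinate before clustering. The score, the neighborhood radius, the weighting of the extra coordinate, and the toy data are all illustrative assumptions rather than the construction described above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy data: points on a unit sphere (curved) plus a flat patch placed away from it.
sphere = rng.normal(size=(500, 3)); sphere /= np.linalg.norm(sphere, axis=1, keepdims=True)
plane = np.c_[rng.uniform(-1, 1, (500, 2)), np.full(500, 3.0)]
X = np.vstack([sphere, plane])

def curvature_feature(X, eps=0.4):
    """Crude curvature-like score per point: within an eps-ball, fit a tangent plane
    by local PCA and measure how much variance escapes into the normal direction."""
    scores = np.zeros(len(X))
    for i, x in enumerate(X):
        nbrs = X[np.linalg.norm(X - x, axis=1) < eps]
        if len(nbrs) < 5:
            continue
        pca = PCA(n_components=3).fit(nbrs - nbrs.mean(axis=0))
        # Ratio of normal-direction variance to tangent variance: ~0 for flat patches.
        scores[i] = pca.explained_variance_[2] / pca.explained_variance_[:2].sum()
    return scores

# Embed with the curvature feature appended, then cluster in the augmented space.
feat = curvature_feature(X)
Z = np.c_[X, 10.0 * feat[:, None]]          # weight of the extra coordinate is arbitrary
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(Z)
print("cluster sizes:", np.bincount(labels))
```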
Spectral clustering has been one of the most widely used methods for community detection in networks. However, large-scale networks bring computational challenges to the eigenvalue decomposition therein. In this paper, we study spectral clustering using randomized sketching algorithms from a statistical perspective, where we assume the network data are generated from a stochastic block model that is not necessarily of full rank. To do this, we first use recently developed sketching algorithms to obtain two randomized spectral clustering algorithms, namely random projection-based and random sampling-based spectral clustering. Then we study the theoretical bounds of the resulting algorithms in terms of the approximation error for the population adjacency matrix, the misclassification error, and the estimation error for the link probability matrix. It turns out that, under mild conditions, the randomized spectral clustering algorithms lead to the same theoretical bounds as those of the original spectral clustering algorithm. We also extend the results to degree-corrected stochastic block models. Numerical experiments support our theoretical findings and show the efficiency of randomized methods. A new R package called Rclust is developed and made available to the public.
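A minimal sketch of the random projection variant on a simulated two-block stochastic block model is given below: sketch the adjacency matrix with a Gaussian test matrix, form a low-rank approximation via a randomized range finder, and run k-means on the leading eigenvectors. The sketch dimension, block probabilities, and accuracy check are illustrative assumptions; the random sampling variant and the R package Rclust are not reproduced here.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Sample a small stochastic block model with K = 2 communities.
n, K = 1000, 2
z = rng.integers(K, size=n)
B = np.array([[0.08, 0.02], [0.02, 0.08]])
P = B[z][:, z]
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1); A = A + A.T

# Random-projection-based spectral clustering: sketch A with a Gaussian test matrix,
# orthonormalize the range, and build a low-rank approximation before the eigen step.
s = 20                                    # sketch dimension (illustrative choice)
Omega = rng.normal(size=(n, s))
Q, _ = np.linalg.qr(A @ Omega)            # randomized range finder for A
A_approx = Q @ (Q.T @ A)                  # rank-s approximation of A

S = (A_approx + A_approx.T) / 2           # symmetrize before the eigen-decomposition
vals, vecs = np.linalg.eigh(S)
U = vecs[:, np.argsort(-np.abs(vals))[:K]]
labels = KMeans(n_clusters=K, n_init=10, random_state=0).fit_predict(U)
agree = max((labels == z).mean(), (labels != z).mean())
print("clustering accuracy vs. ground truth:", round(agree, 3))
```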
We analyze the Fisher information matrix (FIM) of one-hidden-layer networks with the ReLU activation function. Let $W$ denote the $d \times p$ weight matrix from the $d$-dimensional input to the hidden layer consisting of $p$ neurons, and $v$ the $p$-dimensional weight vector from the hidden layer to the scalar output. We focus on the FIM of $v$, which we denote by $I$. When $p$ is large, under certain conditions, the following approximately hold. 1) There are three major clusters in the eigenvalue distribution. 2) Since $I$ is entrywise non-negative owing to the ReLU, the largest eigenvalue is the Perron-Frobenius eigenvalue. 3) For the cluster of the next largest eigenvalues, the eigenspace is spanned by the row vectors of $W$. 4) The direct sum of the eigenspace of the first eigenvalue and that of the third cluster is spanned by the set of all vectors obtained as the Hadamard product of any pair of row vectors of $W$. We confirmed by numerical simulation that the above is approximately correct when the number of hidden nodes is about 10,000.
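The following sketch estimates $I = \mathbb{E}_x[\mathrm{relu}(W^\top x)\,\mathrm{relu}(W^\top x)^\top]$ by Monte Carlo for Gaussian inputs and prints eigenvalues at the indices where the three clusters described above should sit. The sizes are scaled down from the paper's simulations, and the Gaussian inputs and the normalization of $W$ are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# One-hidden-layer ReLU network f(x) = v^T relu(W^T x), with W of size d x p as above.
# Only W matters for the FIM of v, since df/dv = relu(W^T x). Sizes are scaled down
# from the ~10,000 hidden nodes used in the paper's simulations.
d, p, n_samples = 50, 1000, 20000
W = rng.normal(size=(d, p)) / np.sqrt(d)

# Monte Carlo estimate of I = E_x[ relu(W^T x) relu(W^T x)^T ] with x ~ N(0, I_d).
I = np.zeros((p, p))
for _ in range(n_samples // 1000):
    X = rng.normal(size=(1000, d))
    H = np.maximum(X @ W, 0.0)                  # batch x p hidden activations
    I += H.T @ H / n_samples

eigvals = np.linalg.eigvalsh(I)[::-1]
print("largest (Perron-Frobenius) eigenvalue:", round(eigvals[0], 3))
print("second cluster, indices 1..d:", round(eigvals[1], 3), "...", round(eigvals[d], 3))
print("bulk starts near index d+1:", round(eigvals[d + 1], 3))
```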
Continuous-depth neural networks, such as Neural Ordinary Differential Equations (ODEs), have attracted a great deal of interest from the machine learning and data science communities in recent years, as they bridge deep neural networks and dynamical systems. In this article, we introduce a new class of continuous-depth neural network, called Neural Piecewise-Constant Delay Differential Equations (PCDDEs). Unlike the recently proposed framework of Neural Delay Differential Equations (DDEs), we replace the single delay with piecewise-constant delay(s). The Neural PCDDEs with such a transformation, on the one hand, inherit the universal approximation capability of Neural DDEs. On the other hand, by leveraging the information from multiple previous time steps, they further improve the modeling capability without augmenting the network dimension. With this improvement, we show that Neural PCDDEs outperform several existing continuous-depth neural frameworks on one-dimensional piecewise-constant delay population dynamics and on real-world datasets, including MNIST, CIFAR10, and SVHN.
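To illustrate the piecewise-constant delay term, the sketch below integrates a toy Neural PCDDE with forward Euler, feeding the vector field both the current state x(t) and the state recorded at the last integer time x(⌊t⌋). The network, step size, and single-delay setup are illustrative assumptions rather than the authors' implementation or training procedure.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class PCDDEFunc(nn.Module):
    """Right-hand side f_theta(x(t), x(floor(t))) for a piecewise-constant delay system."""
    def __init__(self, dim=2, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, dim))
    def forward(self, x_now, x_delayed):
        return self.net(torch.cat([x_now, x_delayed], dim=-1))

def integrate_pcdde(func, x0, t_end=3.0, dt=0.01):
    """Forward Euler integration; x(floor(t)) is the state recorded at the last integer time."""
    x, x_floor, t = x0, x0.clone(), 0.0
    traj = [x0]
    while t < t_end:
        x = x + dt * func(x, x_floor)
        t += dt
        if int(t) > int(t - dt):          # crossed an integer time: refresh the delayed state
            x_floor = x.clone()
        traj.append(x)
    return torch.stack(traj)

func = PCDDEFunc()
traj = integrate_pcdde(func, torch.zeros(1, 2) + 0.1)
print(traj.shape)   # (num_steps + 1, 1, 2)
```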
Normalization techniques have become a basic component in modern convolutional neural networks (ConvNets). In particular, many recent works demonstrate that promoting the orthogonality of the weights helps train deep models and improve robustness. For ConvNets, most existing methods are based on penalizing or normalizing weight matrices derived from concatenating or flattening the convolutional kernels. These methods often destroy or ignore the benign convolutional structure of the kernels; therefore, they are often expensive or impractical for deep ConvNets. In contrast, we introduce a simple and efficient "Convolutional Normalization" (ConvNorm) method that fully exploits the convolutional structure in the Fourier domain and serves as a simple plug-and-play module that can be conveniently incorporated into any ConvNet. Our method is inspired by recent work on preconditioning methods for convolutional sparse coding and can effectively promote each layer's channel-wise isometry. Furthermore, we show that ConvNorm can reduce the layer-wise spectral norm of the weight matrices and hence improve the Lipschitzness of the network, leading to easier training and improved robustness for deep ConvNets. Applied to classification under noise corruptions and to generative adversarial networks (GANs), we show that ConvNorm improves the robustness of common ConvNets such as ResNet and the performance of GANs. We verify our findings via numerical experiments on CIFAR and ImageNet.
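As a minimal single-kernel illustration of normalization in the Fourier domain, the sketch below rescales a 2-D filter's frequency response to unit magnitude, which makes the corresponding circular convolution an isometry. The actual ConvNorm operates channel-wise on multi-channel convolutional layers inside a network; this single-channel, circular-padding version is only a simplified assumption.

```python
import torch

torch.manual_seed(0)

def fourier_normalize_conv(kernel, input_hw):
    """Rescale a single 2-D convolution kernel so that its circular convolution operator
    becomes an isometry: divide its frequency response by its magnitude."""
    H, W = input_hw
    K = torch.fft.fft2(kernel, s=(H, W))           # frequency response on the input grid
    K_norm = K / (K.abs() + 1e-8)                  # unit magnitude at every frequency
    return torch.fft.ifft2(K_norm).real            # back to an (H x W) spatial filter

def circ_conv2d(x, k):
    """Circular 2-D convolution via the FFT (same spatial size)."""
    return torch.fft.ifft2(torch.fft.fft2(x) * torch.fft.fft2(k, s=x.shape)).real

x = torch.randn(32, 32)
k = torch.randn(3, 3)
k_iso = fourier_normalize_conv(k, x.shape)
y = circ_conv2d(x, k_iso)
print("input norm:", x.norm().item(), "output norm:", y.norm().item())  # nearly equal
```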
Recently, Attention-Gated Convolutional Neural Networks (AGCNNs) have performed well on several essential sentence classification tasks and shown robust performance in practical applications. However, AGCNNs require setting many hyperparameters, and it is not known how sensitive the model's performance is to them. In this paper, we conduct a sensitivity analysis of the effects of different hyperparameters of AGCNNs, e.g., the kernel window size and the number of feature maps. We also investigate the effect of different combinations of hyperparameter settings on the model's performance, analyzing the extent to which different settings contribute to AGCNNs' performance, and we draw practical advice from a wide range of empirical results. Guided by the sensitivity analysis, we improve the hyperparameter settings of AGCNNs. Experiments show that our proposals achieve an average of 0.81% and 0.67% improvements on AGCNN-NLReLU-rand and AGCNN-SELU-rand, respectively, and an average of 0.47% and 0.45% improvements on AGCNN-NLReLU-static and AGCNN-SELU-static, respectively.
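A skeleton of the kind of sweep described above is sketched below: a grid over the kernel window size and the number of feature maps, with a one-factor-at-a-time readout. The train_and_eval hook is hypothetical and must be replaced by an actual AGCNN training and validation run; the grid values are illustrative.

```python
from itertools import product

# Hypothetical evaluation hook: in practice this would train an AGCNN with the given
# hyperparameters and return validation accuracy; the constant here is a stand-in.
def train_and_eval(kernel_window_size, num_feature_maps):
    return 0.0  # replace with a real AGCNN training/validation run

# Grid sweep over two of the hyperparameters discussed above (values are illustrative).
grid = {"kernel_window_size": [1, 3, 5, 7], "num_feature_maps": [50, 100, 200, 400]}
results = {(ks, nf): train_and_eval(ks, nf)
           for ks, nf in product(grid["kernel_window_size"], grid["num_feature_maps"])}

# One-factor-at-a-time view: how accuracy moves as a single hyperparameter varies.
for ks in grid["kernel_window_size"]:
    print("window", ks, "->", [results[(ks, nf)] for nf in grid["num_feature_maps"]])
```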
We show that the output of a (residual) convolutional neural network (CNN) with an appropriate prior over the weights and biases is a Gaussian process (GP) in the limit of infinitely many convolutional filters, extending similar results for dense networks. For a CNN, the equivalent kernel can be computed exactly and, unlike "deep kernels", has very few parameters: only the hyperparameters of the original CNN. Further, we show that this kernel has two properties that allow it to be computed efficiently; the cost of evaluating the kernel for a pair of images is similar to that of a single forward pass through the original CNN with only one filter per layer. The kernel equivalent to a 32-layer ResNet obtains 0.84% classification error on MNIST, a new record for GPs with a comparable number of parameters.
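The sketch below implements an NNGP-style kernel recursion for a much simpler setting than the paper's: a 1-D, infinitely wide ReLU CNN with circular padding, followed by global average pooling of the final pre-activations and a linear readout. The depth, filter size, weight-variance choices, and readout are illustrative assumptions, not the residual 2-D architecture evaluated above.

```python
import numpy as np

def relu_dual(c, v1, v2):
    """E[relu(u) relu(w)] for (u, w) jointly Gaussian with variances v1, v2 and covariance c."""
    s = np.sqrt(np.maximum(v1 * v2, 1e-12))
    theta = np.arccos(np.clip(c / s, -1.0, 1.0))
    return s * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)

def cnn_gp_kernel(x1, x2, depth=3, filt=3, sw2=2.0, sb2=0.0):
    """Kernel of a 1-D, infinitely wide ReLU CNN with circular padding; the readout is a
    linear layer applied to the global average of the last layer's pre-activations."""
    L = len(x1)
    offs = np.arange(filt) - filt // 2
    # Position-wise input covariances for the pairs (x1,x1), (x1,x2), (x2,x2).
    S11, S12, S22 = np.outer(x1, x1), np.outer(x1, x2), np.outer(x2, x2)
    for _ in range(depth):
        D11, D22 = np.diag(S11), np.diag(S22)
        V11 = relu_dual(S11, D11[:, None], D11[None, :])
        V12 = relu_dual(S12, D11[:, None], D22[None, :])
        V22 = relu_dual(S22, D22[:, None], D22[None, :])
        def conv(V):   # average the dual kernel over aligned filter offsets (circularly)
            return sb2 + sw2 * sum(np.roll(np.roll(V, -a, 0), -a, 1) for a in offs) / filt
        S11, S12, S22 = conv(V11), conv(V12), conv(V22)
    return S12.mean()   # global average pooling followed by a linear readout

x1 = np.cos(np.linspace(0, 2 * np.pi, 16, endpoint=False))
x2 = np.random.default_rng(0).normal(size=16)
print("k(x1, x2) =", cnn_gp_kernel(x1, x2))
```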
Deep learning is the mainstream technique for many machine learning tasks, including image recognition, machine translation, speech recognition, and so on. It has outperformed conventional methods in various fields and achieved great success. Unfortunately, how it works remains poorly understood, and laying down a theoretical foundation for deep learning is of central importance. In this work, we give a geometric view to understand deep learning: we show that the fundamental principle behind its success is the manifold structure in data, namely that natural high-dimensional data concentrate close to a low-dimensional manifold, and deep learning learns the manifold and the probability distribution on it. We further introduce the rectified linear complexity of a deep neural network, measuring its learning capability, and the rectified linear complexity of an embedding manifold, describing the difficulty of learning it. We then show that for any deep neural network with a fixed architecture, there exists a manifold that cannot be learned by the network. Finally, we propose to apply optimal mass transportation theory to control the probability distribution in the latent space.
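As a crude empirical companion to the rectified linear complexity idea, the sketch below counts the distinct ReLU activation patterns a small network realizes on sampled inputs, a Monte Carlo proxy for the number of linear pieces it expresses on that region. The network sizes, input domain, and sampling-based count are illustrative assumptions; the complexity above is defined combinatorially, not via sampling.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Small ReLU network; the count of distinct activation patterns over sampled inputs is a
# crude Monte Carlo proxy for how many linear pieces the network realizes on that region.
net = nn.Sequential(nn.Linear(2, 16), nn.ReLU(),
                    nn.Linear(16, 16), nn.ReLU(),
                    nn.Linear(16, 1))

def activation_patterns(x):
    """Binary on/off pattern of every ReLU unit for a batch of inputs."""
    patterns, h = [], x
    for layer in net:
        h = layer(h)
        if isinstance(layer, nn.ReLU):
            patterns.append((h > 0).to(torch.int8))
    return torch.cat(patterns, dim=-1)

X = torch.rand(20000, 2) * 2 - 1                 # uniform samples in [-1, 1]^2
with torch.no_grad():
    P = activation_patterns(X)
n_patterns = len({tuple(row.tolist()) for row in P})
print("distinct activation patterns (≈ linear pieces seen):", n_patterns)
```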