
Training spiking neural networks to approximate complex functions is essential for studying information processing in the brain and neuromorphic computing. Yet, the binary nature of spikes constitutes a challenge for direct gradient-based training. To sidestep this problem, surrogate gradients have proven empirically successful, but their theoretical foundation remains elusive. Here, we investigate the relation of surrogate gradients to two theoretically well-founded approaches. On the one hand, we consider smoothed probabilistic models, which, due to lack of support for automatic differentiation, are impractical for training deep spiking neural networks, yet provide gradients equivalent to surrogate gradients in single neurons. On the other hand, we examine stochastic automatic differentiation, which is compatible with discrete randomness but has never been applied to spiking neural network training. We find that the latter provides the missing theoretical basis for surrogate gradients in stochastic spiking neural networks. We further show that surrogate gradients in deterministic networks correspond to a particular asymptotic case and numerically confirm the effectiveness of surrogate gradients in stochastic multi-layer spiking neural networks. Finally, we illustrate that surrogate gradients are not conservative fields and, thus, not gradients of a surrogate loss. Our work provides the missing theoretical foundation for surrogate gradients and an analytically well-founded solution for end-to-end training of stochastic spiking neural networks.
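The single-neuron equivalence mentioned above can be illustrated with a minimal sketch. It assumes a sigmoidal escape-noise model with steepness `beta` and a threshold of 1.0; both choices, and all function names, are illustrative assumptions rather than the paper's setup. The derivative of the stochastic neuron's expected output matches the surrogate derivative that would replace the derivative of the Heaviside step in a deterministic neuron.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def spike(u, threshold=1.0):
    """Deterministic forward pass: Heaviside step on the membrane potential."""
    return 1.0 if u >= threshold else 0.0

def surrogate_grad(u, threshold=1.0, beta=5.0):
    """Surrogate derivative: slope of a sigmoid with steepness beta (assumed shape)."""
    s = sigmoid(beta * (u - threshold))
    return beta * s * (1.0 - s)

def expected_spike(u, threshold=1.0, beta=5.0):
    """Stochastic neuron: spike probability under sigmoidal escape noise (assumed)."""
    return sigmoid(beta * (u - threshold))

def numerical_derivative(f, u, eps=1e-6):
    """Central finite difference, used only to check the analytic expression."""
    return (f(u + eps) - f(u - eps)) / (2.0 * eps)

u = 0.8
exact = numerical_derivative(expected_spike, u)
surrogate = surrogate_grad(u)
```

Here `exact` and `surrogate` agree up to finite-difference error, whereas the true derivative of `spike` is zero almost everywhere.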

Related content

Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). Neural Networks provides a forum for developing and nurturing an international community of scholars and practitioners who are interested in all aspects of neural networks and related approaches to computational intelligence. The journal welcomes submissions of high-quality papers that contribute to the full range of neural networks research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analyses, to engineering and technological applications of systems that make significant use of neural network concepts and techniques. This unique and broad scope promotes the exchange of ideas between biological and technological studies and helps foster the development of an interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the fields of expertise represented on the Neural Networks editorial board include psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles are published in one of five sections: cognitive science; neuroscience; learning systems; mathematical and computational analysis; engineering and applications. Official website:

The delimitation of biological species, i.e., deciding which individuals belong to the same species and whether and how many different species are represented in a data set, is key to the conservation of biodiversity. Much existing work uses only genetic data for species delimitation, often employing some kind of cluster analysis. This can be misleading, because geographically distant groups of individuals can be genetically quite different even if they belong to the same species. We investigate the problem of testing whether two potentially separated groups of individuals can belong to a single species or not based on genetic and spatial data. Existing methods such as the partial Mantel test and jackknife-based distance-distance regression are considered. New approaches, i.e., an adaptation of a mixed effects model, a bootstrap approach, and a jackknife version of partial Mantel, are proposed. All these methods address the issue that distance data violate the independence assumption for standard inference regarding correlation and regression; a standard linear regression is also considered. The approaches are compared on simulated meta-populations generated with SLiM and GSpace - two software packages that can simulate spatially-explicit genetic data at an individual level. Simulations show that the new jackknife version of the partial Mantel test provides a good compromise between power and respecting the nominal type I error rate. Mixed-effects models have larger power than jackknife-based methods, but tend to display type I error rates slightly above the significance level. An application on brassy ringlets concludes the paper.
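As a point of reference for the distance-based tests discussed above, here is a minimal sketch of the simple (not partial) Mantel permutation test with NumPy. It is not the authors' jackknife variant; the function name, the toy coordinates, and the noise level are illustrative assumptions. Rows and columns of one distance matrix are permuted jointly, which is how the test deals with the dependence among pairwise distances.

```python
import numpy as np

def mantel_test(d_a, d_b, n_perm=999, seed=0):
    """Simple Mantel test: correlation between two distance matrices,
    with a one-sided permutation p-value."""
    d_a = np.asarray(d_a, dtype=float)
    d_b = np.asarray(d_b, dtype=float)
    n = d_a.shape[0]
    iu = np.triu_indices(n, k=1)          # each unordered pair once
    b_vec = d_b[iu]
    r_obs = np.corrcoef(d_a[iu], b_vec)[0, 1]
    rng = np.random.default_rng(seed)
    count = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Permute individuals, i.e. rows and columns together.
        r = np.corrcoef(d_a[np.ix_(p, p)][iu], b_vec)[0, 1]
        if r >= r_obs:
            count += 1
    return r_obs, (count + 1) / (n_perm + 1)

# Toy data (hypothetical): genetic distance tracks geographic distance.
rng = np.random.default_rng(1)
coords = rng.uniform(size=(10, 2))
d_geo = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
noise = rng.normal(scale=0.01, size=(10, 10))
d_gen = d_geo + (noise + noise.T) / 2.0   # keep the matrix symmetric
r_obs, p_value = mantel_test(d_gen, d_geo)
```

The partial Mantel test additionally controls for a third distance matrix (for example, group membership), which this sketch leaves out.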

Spiking neural networks play an important role in brain-like neuromorphic computations and in studying working mechanisms of neural circuits. One drawback of training a large-scale spiking neural network is that updating all weights is quite expensive. Furthermore, after training, all information related to the computational task is hidden in the weight matrix, which prevents a transparent understanding of circuit mechanisms. In this work, we address these challenges by proposing a spiking mode-based training protocol, where the recurrent weight matrix is expressed as a Hopfield-like product of three matrices: input modes, output modes, and a score matrix. The first advantage is that the weight is interpreted by input and output modes and their associated scores, which characterize the importance of each decomposition term. The number of modes is thus adjustable, allowing more degrees of freedom for modeling the experimental data. This significantly reduces the training cost because the space complexity of learning is much lower. Training spiking networks is thus carried out in the mode-score space. The second advantage is that one can project the high-dimensional neural activity (filtered spike train) in the state space onto the mode space, which is typically low-dimensional; e.g., a few modes are sufficient to capture the shape of the underlying neural manifolds. We successfully apply our framework to two computational tasks -- digit classification and selective sensory integration. Our method accelerates the training of spiking neural networks through a Hopfield-like decomposition, and moreover this training leads to low-dimensional attractor structures of high-dimensional neural dynamics.
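The three-matrix decomposition can be sketched as follows. The shapes, the diagonal form of the score matrix, the ordering of the factors, and all variable names are assumptions made for illustration; the abstract does not specify them. The sketch only shows why the number of trainable parameters drops when the number of modes is small.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_modes = 100, 3                 # hypothetical sizes

xi_in = rng.normal(size=(n_neurons, n_modes))    # input modes
xi_out = rng.normal(size=(n_neurons, n_modes))   # output modes
score = np.diag(rng.normal(size=n_modes))        # score matrix (assumed diagonal)

# Recurrent weights as a product of the three factors; rank is at most n_modes.
W = xi_in @ score @ xi_out.T

full_params = n_neurons * n_neurons               # training W directly
mode_params = 2 * n_neurons * n_modes + n_modes   # training modes and scores
```

With these sizes, the mode-score parameterization has 603 trainable values instead of 10,000.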

Approximation of solutions to partial differential equations (PDE) is an important problem in computational science and engineering. Using neural networks as an ansatz for the solution has proven a challenge in terms of training time and approximation accuracy. In this contribution, we discuss how sampling the hidden weights and biases of the ansatz network from data-agnostic and data-dependent probability distributions allows us to progress on both challenges. In most examples, the random sampling schemes outperform iterative, gradient-based optimization of physics-informed neural networks regarding training time and accuracy by several orders of magnitude. For time-dependent PDE, we construct neural basis functions only in the spatial domain and then solve the associated ordinary differential equation with classical methods from scientific computing over a long time horizon. This alleviates one of the greatest challenges for neural PDE solvers because it does not require us to parameterize the solution in time. For second-order elliptic PDE in Barron spaces, we prove the existence of sampled networks with $L^2$ convergence to the solution. We demonstrate our approach on several time-dependent and static PDEs. We also illustrate how sampled networks can effectively solve inverse problems in this setting. Benefits compared to common numerical schemes include spectral convergence and mesh-free construction of basis functions.
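The idea of sampling hidden parameters and then solving only a linear problem can be sketched on a toy case. The example below assumes a 1D Poisson problem, tanh activations, and a data-agnostic uniform sampling range; none of these are taken from the paper, and the sampling scheme is a plain stand-in for the authors' distributions. Hidden weights and biases are drawn once and never trained; the output coefficients come from a single least-squares solve over collocation and boundary rows.

```python
import numpy as np

rng = np.random.default_rng(0)
n_hidden = 100
w = rng.uniform(-4.0, 4.0, size=n_hidden)   # sampled hidden weights (assumed range)
b = rng.uniform(-4.0, 4.0, size=n_hidden)   # sampled hidden biases (assumed range)

def features(x):
    """Hidden-layer outputs tanh(w*x + b), shape (len(x), n_hidden)."""
    return np.tanh(np.outer(x, w) + b)

def features_xx(x):
    """Second x-derivative of each feature: w^2 * (-2 t (1 - t^2)), t = tanh(z)."""
    t = np.tanh(np.outer(x, w) + b)
    return -2.0 * t * (1.0 - t**2) * w**2

# Toy problem: -u''(x) = pi^2 sin(pi x) on [0, 1], u(0) = u(1) = 0,
# whose exact solution is u(x) = sin(pi x).
x_col = np.linspace(0.0, 1.0, 200)
f = np.pi**2 * np.sin(np.pi * x_col)
x_bc = np.array([0.0, 1.0])

# Stack PDE rows and (weighted) boundary rows into one linear system.
A = np.vstack([-features_xx(x_col), 10.0 * features(x_bc)])
rhs = np.concatenate([f, np.zeros(2)])
coef, *_ = np.linalg.lstsq(A, rhs, rcond=None)

x_test = np.linspace(0.0, 1.0, 500)
u_hat = features(x_test) @ coef
max_err = np.max(np.abs(u_hat - np.sin(np.pi * x_test)))
```

No gradient-based optimization is involved: the only fitted quantity is `coef`, obtained from one call to `np.linalg.lstsq`.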

Background: The aim of this study was to investigate the role of clinical, dosimetric and pretherapeutic magnetic resonance imaging (MRI) features for lesion-specific outcome prediction of stereotactic radiotherapy (SRT) in patients with brain metastases from malignant melanoma (MBM). Methods: In this multicenter, retrospective analysis, we reviewed 517 MBM from 130 patients treated with SRT (single fraction or hypofractionated). For each gross tumor volume (GTV) 1576 radiomic features (RF) were calculated (788 each for the GTV and for a 3 mm margin around the GTV). Clinical parameters, radiation dose and RF from pretherapeutic contrast-enhanced T1-weighted MRI from different institutions were evaluated with a feature processing and elimination pipeline in a nested cross-validation scheme. Results: Seventy-two (72) of 517 lesions (13.9%) showed a local failure (LF) after SRT. The processing pipeline showed clinical, dosimetric and radiomic features providing information for LF prediction. The most prominent ones were the correlation of the gray level co-occurrence matrix of the margin (hazard ratio (HR): 0.37, confidence interval (CI): 0.23-0.58) and systemic therapy before SRT (HR: 0.55, CI: 0.42-0.70). The majority of RF associated with LF was calculated in the margin around the GTV. Conclusions: Pretherapeutic MRI based RF connected with lesion-specific outcome after SRT could be identified, despite multicentric data and minor differences in imaging protocols. Image data analysis of the surrounding metastatic environment may provide therapy-relevant information with the potential to further individualize radiotherapy strategies.

This paper presents a novel approach for constructing graph neural networks equivariant to 2D rotations and translations and leveraging them as PDE surrogates on non-gridded domains. We show that aligning the representations with the principal axis allows us to sidestep many constraints while preserving SE(2) equivariance. By applying our model as a surrogate for fluid flow simulations and conducting thorough benchmarks against non-equivariant models, we demonstrate significant gains in terms of both data efficiency and accuracy.

Valid statistical inference is crucial for decision-making but difficult to obtain in supervised learning with multimodal data, e.g., combinations of clinical features, genomic data, and medical images. Multimodal data often warrants the use of black-box algorithms, for instance, random forests or neural networks, which impede the use of traditional variable significance tests. We address this problem by proposing the use of COvariance Measure Tests (COMETs), which are calibrated and powerful tests that can be combined with any sufficiently predictive supervised learning algorithm. We apply COMETs to several high-dimensional, multimodal data sets to illustrate (i) variable significance testing for finding relevant mutations modulating drug-activity, (ii) modality selection for predicting survival in liver cancer patients with multiomics data, and (iii) modality selection with clinical features and medical imaging data. In all applications, COMETs yield results consistent with domain knowledge without requiring data-driven pre-processing which may invalidate type I error control. These novel applications with high-dimensional multimodal data corroborate prior results on the power and robustness of COMETs for significance testing. COMETs are implemented in the comets R package available on CRAN and pycomets Python library available on GitHub. Source code for reproducing all results is available at //github.com/LucasKook/comets. All data sets used in this work are openly available.

We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain on the accuracy when the model has access to it in addition to another modality. We refer to this gain as the conditional utilization rate. In the experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
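The conditional utilization rate described above is a difference of accuracies, which a short sketch makes concrete. The accuracy values are hypothetical numbers chosen for illustration, not results from the paper, and the function and variable names are assumptions.

```python
def conditional_utilization(acc_both, acc_other_only):
    """Accuracy gained by adding one modality on top of the other."""
    return acc_both - acc_other_only

# Hypothetical accuracies for a two-modality model (not from the paper).
acc_rgb_only = 0.80
acc_depth_only = 0.55
acc_both = 0.82

u_depth_given_rgb = conditional_utilization(acc_both, acc_rgb_only)
u_rgb_given_depth = conditional_utilization(acc_both, acc_depth_only)
imbalance = u_rgb_given_depth - u_depth_given_rgb
```

In this made-up case the model gains far more from RGB than from depth, which is the kind of imbalance the abstract attributes to greedy learning.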

We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.

The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as well, if not better than, the original dense networks. Sparsity can reduce the memory footprint of regular networks to fit mobile devices, as well as shorten training time for ever growing networks. In this paper, we survey prior work on sparsity in deep learning and provide an extensive tutorial of sparsification for both inference and training. We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice. Our work distills ideas from more than 300 research papers and provides guidance to practitioners who wish to utilize sparsity today, as well as to researchers whose goal is to push the frontier forward. We include the necessary background on mathematical methods in sparsification, describe phenomena such as early structure adaptation, the intricate relations between sparsity and the training process, and show techniques for achieving acceleration on real hardware. We also define a metric of pruned parameter efficiency that could serve as a baseline for comparison of different sparse networks. We close by speculating on how sparsity can improve future workloads and outline major open problems in the field.
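One common way to remove elements of a network is global magnitude pruning, sketched below with NumPy. This is a generic illustration and not necessarily a scheme the survey recommends; the function name and the toy weight matrix are assumptions.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of entries with the smallest magnitude.

    Returns the pruned weights and the boolean keep-mask. Entries tied with
    the threshold are pruned as well, so ties can remove slightly more.
    """
    weights = np.asarray(weights, dtype=float)
    k = int(round(sparsity * weights.size))   # number of entries to remove
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    flat = np.abs(weights).ravel()
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    mask = np.abs(weights) > threshold
    return np.where(mask, weights, 0.0), mask

w = np.array([[0.1, -2.0, 0.3],
              [1.5, -0.05, 0.7]])
pruned, mask = magnitude_prune(w, sparsity=0.5)
```

The mask can be stored and reapplied after each update if the sparsity pattern should stay fixed during further training.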

Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic for various learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention based graph neural network called Hyper-SAGNN applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving great performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.
