
We use fixed point theory to analyze nonnegative neural networks, which we define as neural networks that map nonnegative vectors to nonnegative vectors. We first show that nonnegative neural networks with nonnegative weights and biases can be recognized as monotonic and (weakly) scalable functions within the framework of nonlinear Perron-Frobenius theory. This fact enables us to provide conditions for the existence of fixed points of nonnegative neural networks having inputs and outputs of the same dimension, and these conditions are weaker than those recently obtained using arguments in convex analysis. Furthermore, we prove that the shape of the fixed point set of nonnegative neural networks with nonnegative weights and biases is an interval, which under mild conditions degenerates to a point. These results are then used to obtain the existence of fixed points of more general nonnegative neural networks. From a practical perspective, our results contribute to the understanding of the behavior of autoencoders, and the main theoretical results are verified in numerical simulations using the Modified National Institute of Standards and Technology (MNIST) dataset.
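
To make the central claim concrete, here is a minimal sketch (not the paper's construction) of Picard iteration on a one-hidden-layer network with nonnegative weights and biases; the dimensions and weight scales are hypothetical, chosen so the map is a contraction and the iteration visibly converges to a fixed point:

```python
import numpy as np

rng = np.random.default_rng(0)
n, h = 5, 8
W1 = rng.uniform(0.0, 0.2, size=(h, n))   # nonnegative weights
b1 = rng.uniform(0.0, 0.1, size=h)        # nonnegative biases
W2 = rng.uniform(0.0, 0.2, size=(n, h))
b2 = rng.uniform(0.0, 0.1, size=n)

def T(x):
    # Nonnegative weights, biases, and inputs keep every layer nonnegative,
    # so T maps the nonnegative orthant into itself (ReLU is redundant here).
    return np.maximum(W2 @ np.maximum(W1 @ x + b1, 0.0) + b2, 0.0)

x = np.zeros(n)
for _ in range(200):                      # Picard iteration x <- T(x)
    x = T(x)
print("approximate fixed point:", x)
print("residual:", np.linalg.norm(T(x) - x))
```

With small weights the iteration converges regardless; the paper's point is that monotonicity and (weak) scalability guarantee existence of fixed points under much weaker conditions than contractivity.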

Related Content

Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). Neural Networks provides a forum for developing and nurturing an international community of scholars and practitioners interested in all aspects of neural networks and related approaches to computational intelligence. Neural Networks welcomes submissions of high-quality papers that contribute to the full range of neural networks research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analysis, to systems engineering and technological applications that make substantial use of neural network concepts and techniques. This unique and broad scope promotes the exchange of ideas between biological and technological research and helps foster the development of an interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the Neural Networks editorial board represents expertise in psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles appear in one of five sections: Cognitive Science, Neuroscience, Learning Systems, Mathematical and Computational Analysis, or Engineering and Applications. Official website:

Threshold selection is a fundamental problem in any threshold-based extreme value analysis. While models are asymptotically motivated, selecting an appropriate threshold for finite samples can be difficult through standard methods. Inference can also be highly sensitive to the choice of threshold. Too low a threshold choice leads to bias in the fit of the extreme value model, while too high a choice leads to unnecessary additional uncertainty in the estimation of model parameters. In this paper, we develop a novel methodology for automated threshold selection that directly tackles this bias-variance trade-off. We also develop a method to account for the uncertainty in this threshold choice and propagate this uncertainty through to high quantile inference. Through a simulation study, we demonstrate the effectiveness of our method for threshold selection and subsequent extreme quantile estimation. We apply our method to the well-known, troublesome example of the River Nidd dataset.
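
As background for the bias-variance trade-off described above, the classical diagnostic that such automated methods replace is parameter stability across candidate thresholds. The sketch below (synthetic data, not the paper's selector) fits a generalized Pareto distribution to exceedances over a grid of thresholds with SciPy:

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(1)
data = rng.gumbel(loc=10.0, scale=2.0, size=5000)  # hypothetical sample

for u in np.quantile(data, [0.80, 0.85, 0.90, 0.95]):
    exc = data[data > u] - u                        # exceedances over threshold u
    shape, _, scale = genpareto.fit(exc, floc=0.0)  # fix GPD location at 0
    print(f"threshold={u:.2f}  n_exc={exc.size:4d}  shape={shape:+.3f}  scale={scale:.3f}")
```

Too low a threshold biases the shape estimate (the asymptotic GPD approximation fails); too high a threshold leaves few exceedances, inflating the variance, which is precisely the trade-off the proposed method optimizes.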

The paper's goal is to provide a simple, unified approach to sensitivity analysis with physics-informed neural networks (PINNs). The main idea is to add a new term to the loss function that regularizes the solution in a small neighborhood of the nominal value of the parameter of interest. The added term is the derivative of the loss function with respect to that parameter. The result of this modification is a solution to the problem along with the derivative of the solution with respect to the parameter of interest (the sensitivity). We call the new technique SA-PINN. We show its effectiveness on three examples: a simple 1D advection-diffusion problem to illustrate the methodology, a 2D Poisson problem with 9 parameters of interest, and a transient two-phase flow in porous media problem.
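
A minimal sketch of the loss augmentation just described, under the assumption that the added term is the derivative of the loss with respect to the parameter of interest at its nominal value; all names (`loss_fn`, `mu0`, `lam`) are illustrative, and a central finite difference stands in for the automatic differentiation a real PINN framework would use:

```python
import numpy as np

def augmented_loss(loss_fn, theta, mu0, lam=1.0, eps=1e-4):
    """loss_fn(theta, mu): standard PINN residual loss at parameter value mu."""
    base = loss_fn(theta, mu0)
    # Added term: d(loss)/d(mu) at the nominal value mu0 (finite difference).
    dloss_dmu = (loss_fn(theta, mu0 + eps) - loss_fn(theta, mu0 - eps)) / (2 * eps)
    return base + lam * dloss_dmu

# Toy usage with a quadratic stand-in for the PINN loss:
toy = lambda theta, mu: float(np.sum((theta - mu) ** 2))
print(augmented_loss(toy, np.array([0.5, 1.5]), mu0=1.0))
```

Training on the augmented objective pushes the network toward solutions that remain valid in a neighborhood of `mu0`, which is what yields the sensitivity alongside the solution.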

Recent advances in deep learning have given us some very promising results on the generalization ability of deep neural networks; however, the literature still lacks a comprehensive theory explaining why heavily over-parameterized models are able to generalize well while fitting the training data. In this paper, we propose a PAC-type bound on the generalization error of feedforward ReLU networks by estimating the Rademacher complexity of the set of networks reachable from an initial parameter vector via gradient descent. The key idea is to bound the sensitivity of the network's gradient to perturbations of the input data along the optimization trajectory. The obtained bound does not explicitly depend on the depth of the network. Our results are experimentally verified on the MNIST and CIFAR-10 datasets.
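
For background, the textbook Rademacher-complexity bound that PAC-type results of this kind refine is sketched below; the paper's contribution lies in bounding the Rademacher term for the specific set of networks reachable by gradient descent, which is not reproduced here:

```latex
% Background (standard Rademacher bound, not the paper's theorem): with
% probability at least 1 - \delta over an i.i.d. sample of size n, for all
% f in the class \mathcal{F}, with loss \ell bounded in [0, 1],
\[
  \mathbb{E}\,\ell(f(X), Y)
  \;\le\;
  \frac{1}{n}\sum_{i=1}^{n} \ell(f(x_i), y_i)
  \;+\; 2\,\mathfrak{R}_n(\ell \circ \mathcal{F})
  \;+\; \sqrt{\frac{\log(1/\delta)}{2n}} .
\]
```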

The success of over-parameterized neural networks trained to near-zero training error has caused great interest in the phenomenon of benign overfitting, where estimators are statistically consistent even though they interpolate noisy training data. While benign overfitting in fixed dimension has been established for some learning methods, current literature suggests that for regression with typical kernel methods and wide neural networks, benign overfitting requires a high-dimensional setting where the dimension grows with the sample size. In this paper, we show that the smoothness of the estimators, and not the dimension, is the key: benign overfitting is possible if and only if the estimator's derivatives are large enough. We generalize existing inconsistency results to non-interpolating models and more kernels to show that benign overfitting with moderate derivatives is impossible in fixed dimension. Conversely, we show that rate-optimal benign overfitting is possible for regression with a sequence of spiky-smooth kernels with large derivatives. Using neural tangent kernels, we translate our results to wide neural networks. We prove that while infinite-width networks do not overfit benignly with the ReLU activation, this can be fixed by adding small high-frequency fluctuations to the activation function. Our experiments verify that such neural networks, while overfitting, can indeed generalize well even on low-dimensional data sets.
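
The "spiky-smooth" mechanism admits a very small illustration: a smooth Gaussian kernel plus a tiny, very narrow Gaussian spike, so the interpolant fits the noise through near-invisible spikes while the smooth component does the regression. The bandwidths and amplitude below are arbitrary illustrative choices, not the paper's:

```python
import numpy as np

def spiky_smooth_kernel(x, y, eps=0.05, gamma=1e-3):
    d2 = (x[:, None] - y[None, :]) ** 2
    return np.exp(-d2) + eps * np.exp(-d2 / gamma**2)  # smooth part + narrow spike

rng = np.random.default_rng(2)
x_train = np.sort(rng.uniform(-1, 1, 30))
y_train = np.sin(3 * x_train) + 0.3 * rng.standard_normal(30)  # noisy targets

K = spiky_smooth_kernel(x_train, x_train)
alpha = np.linalg.solve(K, y_train)          # "ridgeless" kernel interpolation
x_test = np.linspace(-1, 1, 5)
print(spiky_smooth_kernel(x_test, x_train) @ alpha)
```

Because the spike is nearly zero off the diagonal, it acts like a small ridge that absorbs the noise pointwise, yet the predictor still interpolates the training data exactly.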

When multiple self-adaptive systems share the same environment and have common goals, they may coordinate their adaptations at runtime to avoid conflicts and to satisfy their goals. There are two approaches to coordination. (1) Logically centralized, where a supervisor has complete control over the individual self-adaptive systems. Such an approach is infeasible when the systems have different owners or administrative domains. (2) Logically decentralized, where coordination is achieved through direct interactions. Because the individual systems have control over the information they share, decentralized coordination accommodates multiple administrative domains. However, existing techniques do not account simultaneously for both local concerns, e.g., preferences, and shared concerns, e.g., conflicts, which may lead to goals not being achieved as expected. Our idea to address this shortcoming is to express both types of concerns within the same constraint optimization problem. We propose CoADAPT, a decentralized coordination technique introducing two types of constraints: preference constraints, expressing local concerns, and consistency constraints, expressing shared concerns. At runtime, the problem is solved in a decentralized way using distributed constraint optimization algorithms implemented by each self-adaptive system. As a first step in realizing CoADAPT, we focus in this work on the coordination of adaptation planning strategies, traditionally addressed only with centralized techniques. We show the feasibility of CoADAPT in an exemplar from cloud computing and analyze experimentally its scalability, as sketched below.
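
A toy sketch of the two constraint types (the systems, strategies, and costs are invented, and a centralized brute force stands in for the distributed constraint optimization algorithms the paper uses): each system picks a planning strategy, preference constraints score local choices, and consistency constraints penalize conflicting pairs:

```python
from itertools import product

strategies = ["scale_out", "scale_up"]
preference = {                        # local concern: per-system strategy cost
    ("A", "scale_out"): 1, ("A", "scale_up"): 3,
    ("B", "scale_out"): 2, ("B", "scale_up"): 1,
}
def consistency(a, b):                # shared concern: both scaling out conflicts
    return 5 if a == b == "scale_out" else 0

best = min(product(strategies, repeat=2),
           key=lambda ab: preference[("A", ab[0])] + preference[("B", ab[1])]
                          + consistency(*ab))
print("coordinated assignment:", best)   # ('scale_out', 'scale_up')
```

Expressing both concerns in one objective is what lets the coordination respect local preferences without producing conflicting adaptations.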

We study the training dynamics of shallow neural networks, in a two-timescale regime in which the stepsizes for the inner layer are much smaller than those for the outer layer. In this regime, we prove convergence of the gradient flow to a global optimum of the non-convex optimization problem in a simple univariate setting. The number of neurons need not be asymptotically large for our result to hold, distinguishing our result from popular recent approaches such as the neural tangent kernel or mean-field regimes. Experimental illustration is provided, showing that stochastic gradient descent behaves according to our description of the gradient flow and thus converges to a global optimum in the two-timescale regime, but can fail outside of it.
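
The two-timescale regime is easy to reproduce in a sketch: train a shallow ReLU network on a univariate target with a much smaller stepsize for the inner weights than for the outer weights. The target function, network width, and stepsizes below are illustrative, not the paper's experimental setup:

```python
import numpy as np

rng = np.random.default_rng(3)
m = 10                                    # number of neurons (need not be large)
w = rng.standard_normal(m)                # inner-layer weights (slow timescale)
a = np.zeros(m)                           # outer-layer weights (fast timescale)
x = rng.uniform(-1, 1, 200)
y = np.maximum(2 * x - 0.5, 0.0)          # hypothetical univariate target

lr_outer, lr_inner = 1e-1, 1e-4           # two timescales: lr_inner << lr_outer
for _ in range(2000):
    pre = np.outer(x, w)                  # (200, m) pre-activations
    H = np.maximum(pre, 0.0)              # ReLU features
    r = H @ a - y                         # residuals
    grad_a = H.T @ r / x.size
    grad_w = ((r[:, None] * (pre > 0) * x[:, None]) * a).mean(axis=0)
    a -= lr_outer * grad_a
    w -= lr_inner * grad_w
print("final MSE:", np.mean((np.maximum(np.outer(x, w), 0.0) @ a - y) ** 2))
```

Intuitively, the fast outer layer tracks the optimal linear fit for the current features, so the slow inner-layer dynamics see an effectively simplified landscape.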

The prevailing statistical approach to analyzing persistence diagrams is concerned with filtering out topological noise. In this paper, we adopt a different viewpoint and aim at estimating the actual distribution of a random persistence diagram, which captures both topological signal and noise. To that end, Chazal and Divol (2019) proved that, under general conditions, the expected value of a random persistence diagram is a measure admitting a Lebesgue density, called the persistence intensity function. In this paper, we are concerned with estimating the persistence intensity function and a novel, normalized version of it -- called the persistence density function. We present a class of kernel-based estimators based on an i.i.d. sample of persistence diagrams and derive estimation rates in the supremum norm. As a direct corollary, we obtain uniform consistency rates for estimating linear representations of persistence diagrams, including Betti numbers and persistence surfaces. Interestingly, the persistence density function delivers stronger statistical guarantees.
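
A minimal sketch of a kernel-based intensity estimator of the kind described: average a 2D Gaussian kernel over the points of an i.i.d. sample of diagrams. The diagrams here are synthetic (birth, death) point sets and the bandwidth is an arbitrary illustrative choice, not the paper's tuned rate:

```python
import numpy as np

rng = np.random.default_rng(4)
# 50 diagrams, each a set of (birth, death) points with death >= birth
diagrams = [np.sort(rng.uniform(0, 1, (rng.integers(5, 15), 2)), axis=1)
            for _ in range(50)]

def intensity_estimate(query, diagrams, h=0.05):
    total = 0.0
    for D in diagrams:
        d2 = np.sum((D - query) ** 2, axis=1)          # squared distances to query
        total += np.sum(np.exp(-d2 / (2 * h**2)) / (2 * np.pi * h**2))
    return total / len(diagrams)   # expected kernel mass per diagram at `query`

print(intensity_estimate(np.array([0.3, 0.6]), diagrams))
```

Averaging over diagrams (rather than pooling all points into one density) is what makes this an estimator of the expected diagram's intensity; normalizing it by the expected number of points would give the persistence density function.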

Feature attribution is a fundamental task in both machine learning and data analysis, which involves determining the contribution of individual features or variables to a model's output. This process helps identify the most important features for predicting an outcome. The history of feature attribution methods can be traced back to Generalized Additive Models (GAMs), which extend linear regression models by incorporating non-linear relationships between dependent and independent variables. In recent years, gradient-based methods and surrogate models have been applied to unravel complex Artificial Intelligence (AI) systems, but these methods have limitations. GAMs tend to achieve lower accuracy, gradient-based methods can be difficult to interpret, and surrogate models often suffer from stability and fidelity issues. Furthermore, most existing methods do not consider users' contexts, which can significantly influence their preferences. To address these limitations and advance the current state of the art, we define a novel feature attribution framework called Context-Aware Feature Attribution Through Argumentation (CA-FATA). Our framework harnesses the power of argumentation by treating each feature as an argument that can either support, attack, or neutralize a prediction. Additionally, CA-FATA formulates feature attribution as an argumentation procedure, and each computation has explicit semantics, which makes it inherently interpretable. CA-FATA also easily integrates side information, such as users' contexts, resulting in more accurate predictions.
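
A toy sketch of the argumentation view (the feature names, roles, and strengths are invented, and this simple additive scoring is only a stand-in for CA-FATA's actual semantics): each feature is an argument that supports, attacks, or is neutral toward a prediction, and the attribution is the traced contribution of each argument:

```python
ROLES = {"supports": +1, "attacks": -1, "neutral": 0}

def argue(features):
    """features: list of (name, role, strength) arguments for a prediction."""
    trace = [(name, role, ROLES[role] * strength)
             for name, role, strength in features]
    return sum(score for _, _, score in trace), trace

score, trace = argue([
    ("genre_match", "supports", 0.8),
    ("price", "attacks", 0.5),
    ("evening_context", "supports", 0.3),   # user context enters as an argument
])
print(score)   # 0.6: the supporting arguments outweigh the attack
for name, role, contribution in trace:
    print(f"{name:16s} {role:9s} {contribution:+.1f}")
```

The point of the argumentative reading is that every term in the trace has an explicit role, so the attribution explains itself rather than requiring post-hoc interpretation.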

We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain in accuracy when the model has access to that modality in addition to another. We refer to this gain as the conditional utilization rate. In the experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
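
A sketch of the conditional utilization rate as defined above; `evaluate` is a hypothetical accuracy callable, a missing modality is simulated by zeroing its input, and the synthetic data below are purely illustrative:

```python
import numpy as np

def conditional_utilization(evaluate, x1, x2, y):
    """u(m1 | m2): accuracy gain from having modality 1 in addition to 2."""
    acc_both = evaluate(x1, x2, y)
    acc_m2_only = evaluate(np.zeros_like(x1), x2, y)   # simulate dropping m1
    return acc_both - acc_m2_only

rng = np.random.default_rng(5)
y = rng.integers(0, 2, 2000)
x1 = 0.7 * y + 0.5 * rng.standard_normal(2000)   # strong modality
x2 = 0.4 * y + 0.5 * rng.standard_normal(2000)   # weak modality
evaluate = lambda u, v, t: np.mean(((u + v) > 0.55) == t)  # toy symmetric "model"
print("u(m1 | m2):", conditional_utilization(evaluate, x1, x2, y))
# The toy model is symmetric in its inputs, so swapping arguments gives u(m2 | m1):
print("u(m2 | m1):", conditional_utilization(evaluate, x2, x1, y))
```

A large gap between u(m1 | m2) and u(m2 | m1) is the imbalance the paper reports; the conditional learning speed serves as a training-time proxy for this quantity.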

Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of several anatomical structures (ranging from the large organs to thin vessels) can achieve competitive segmentation results, while avoiding the need for handcrafting features or training class-specific models. To this end, we propose a two-stage, coarse-to-fine approach that first uses a 3D FCN to roughly define a candidate region, which is then used as input to a second 3D FCN. This reduces the number of voxels the second FCN has to classify to ~10% and allows it to focus on more detailed segmentation of the organs and vessels. We utilize training and validation sets consisting of 331 clinical CT images and test our models on a completely unseen data collection acquired at a different hospital that includes 150 CT scans, targeting three anatomical organs (liver, spleen, and pancreas). In challenging organs such as the pancreas, our cascaded approach improves the mean Dice score from 68.5% to 82.2%, achieving the highest reported average score on this dataset. We compare with a 2D FCN method on a separate dataset of 240 CT scans with 18 classes and achieve significantly higher performance in small organs and vessels. Furthermore, we explore fine-tuning our models on different datasets. Our experiments illustrate the promise and robustness of current 3D FCN based semantic segmentation of medical images, achieving state-of-the-art results. Our code and trained models are available for download: https://github.com/holgerroth/3Dunet_abdomen_cascade.
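
The coarse-to-fine cascade reduces to a simple cropping step between the two networks. The sketch below shows only that glue logic on a synthetic volume (the stage-1 and stage-2 networks are stand-ins; shapes and the margin are illustrative):

```python
import numpy as np

def crop_to_candidate(volume, coarse_mask, margin=8):
    idx = np.argwhere(coarse_mask)                    # voxels flagged by stage 1
    lo = np.maximum(idx.min(axis=0) - margin, 0)
    hi = np.minimum(idx.max(axis=0) + margin + 1, volume.shape)
    sl = tuple(slice(l, h) for l, h in zip(lo, hi))
    return volume[sl], sl                             # ROI for stage 2, and its location

volume = np.zeros((64, 64, 64)); volume[20:40, 25:35, 30:50] = 1.0
coarse_mask = volume > 0.5                            # pretend stage-1 output
roi, sl = crop_to_candidate(volume, coarse_mask)
print("ROI shape:", roi.shape, "-> fraction of voxels:", roi.size / volume.size)
```

Cropping to the candidate bounding box is what brings the second FCN's workload down to roughly a tenth of the voxels, letting it spend its capacity on fine boundaries.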
