免费在线黄色电影,欧美亚州视频一区二区三区,亚洲AV第二区国产精品,在线不卡一区视频免费观看

Neural network with quadratic decision functions have been introduced as alternatives to standard neural networks with affine linear one. They are advantageous when the objects to be identified are of compact basic geometries like circles, ellipsis etc. In this paper we investigate the use of such ansatz functions for classification. In particular we test and compare the algorithm on the MNIST dataset for classification of handwritten digits and for classification of subspecies. We also show, that the implementation can be based on the neural network structure in the software Tensorflow and Keras, respectively.

相關內容

Networking

關注 22

Networking：IFIP International Conferences on Networking。 Explanation：國際網絡會議。 Publisher：IFIP。 SIT：

SGD · 線性的 · Networking · 廣義線性模型 · 線性模型 ·

2024 年 3 月 1 日

Escaping mediocrity: how two-layer networks learn hard generalized linear models with SGD

Luca Arnaboldi,Florent Krzakala,Bruno Loureiro,Ludovic Stephan

This study explores the sample complexity for two-layer neural networks to learn a generalized linear target function under Stochastic Gradient Descent (SGD), focusing on the challenging regime where many flat directions are present at initialization. It is well-established that in this scenario $n=O(d \log d)$ samples are typically needed. However, we provide precise results concerning the pre-factors in high-dimensional contexts and for varying widths. Notably, our findings suggest that overparameterization can only enhance convergence by a constant factor within this problem class. These insights are grounded in the reduction of SGD dynamics to a stochastic process in lower dimensions, where escaping mediocrity equates to calculating an exit time. Yet, we demonstrate that a deterministic approximation of this process adequately represents the escape time, implying that the role of stochasticity may be minimal in this scenario.

Networking · INTERACT · 樣本 · MoDELS · 簇 ·

2024 年 3 月 1 日

Disentangling the structure of ecological bipartite networks from observation processes

Emre Anakok,Pierre Barbillon,Colin Fontaine,Elisa Thebault

The structure of a bipartite interaction network can be described by providing a clustering for each of the two types of nodes. Such clusterings are outputted by fitting a Latent Block Model (LBM) on an observed network that comes from a sampling of species interactions in the field. However, the sampling is limited and possibly uneven. This may jeopardize the fit of the LBM and then the description of the structure of the network by detecting structures which result from the sampling and not from actual underlying ecological phenomena. If the observed interaction network consists of a weighted bipartite network where the number of observed interactions between two species is available, the sampling efforts for all species can be estimated and used to correct the LBM fit. We propose to combine an observation model that accounts for sampling and an LBM for describing the structure of underlying possible ecological interactions. We develop an original inference procedure for this model, the efficiency of which is demonstrated on simulation studies. The pratical interest in ecology of our model is highlighted on a large dataset of plant-pollinator network.

均值 · 方差 · 塑造 · 統計理論 ·

2024 年 3 月 1 日

Testing mean and variance by e-processes

Yixuan Fan,Zhanyi Jiao,Ruodu Wang

from arxiv, 26 pages, 4 figures, 3 tables

We address the problem of testing conditional mean and conditional variance for non-stationary data. We build e-values and p-values for four types of non-parametric composite hypotheses with specified mean and variance as well as other conditions on the shape of the data-generating distribution. These shape conditions include symmetry, unimodality, and their combination. Using the obtained e-values and p-values, we construct tests via e-processes, also known as testing by betting, as well as some tests based on combining p-values for comparison. Although we mainly focus on one-sided tests, the two-sided test for the mean is also studied. Simulation and empirical studies are conducted under a few settings, and they illustrate features of the methods based on e-processes.

Performer · Networking · Neural Networks · 監督 · Less ·

2024 年 3 月 1 日

Improving the performance of weak supervision searches using transfer and meta-learning

Hugues Beauchesne,Zong-En Chen,Cheng-Wei Chiang

from arxiv, 20 pages, 7 figures, matches the published version

Weak supervision searches have in principle the advantages of both being able to train on experimental data and being able to learn distinctive signal properties. However, the practical applicability of such searches is limited by the fact that successfully training a neural network via weak supervision can require a large amount of signal. In this work, we seek to create neural networks that can learn from less experimental signal by using transfer and meta-learning. The general idea is to first train a neural network on simulations, thereby learning concepts that can be reused or becoming a more efficient learner. The neural network would then be trained on experimental data and should require less signal because of its previous training. We find that transfer and meta-learning can substantially improve the performance of weak supervision searches.

Networking · Neural Networks · 近似 · 無偏估計 · 估計/估計量 ·

2024 年 3 月 1 日

Approximating intractable short ratemodel distribution with neural network

Anna Knezevic,Nikolai Dokuchaev

from arxiv, Mistake in methodology

We propose an algorithm which predicts each subsequent time step relative to the previous timestep of intractable short rate model (when adjusted for drift and overall distribution of previous percentile result) and show that the method achieves superior outcomes to the unbiased estimate both on the trained dataset and different validation data.

Networking · 優化器 · 向量化 · Weight · 邊 ·

2024 年 2 月 29 日

More algorithmic results for problems of spread of influence in edge-weighted graphs with and without incentives

Siavash Askari,Manouchehr Zaker

Many phenomena in real world social networks are interpreted as spread of influence between activated and non-activated network elements. These phenomena are formulated by combinatorial graphs, where vertices represent the elements and edges represent social ties between elements. A main problem is to study important subsets of elements (target sets or dynamic monopolies) such that their activation spreads to the entire network. In edge-weighted networks the influence between two adjacent vertices depends on the weight of their edge. In models with incentives, the main problem is to minimize total amount of incentives (called optimal target vectors) which can be offered to vertices such that some vertices are activated and their activation spreads to the whole network. Algorithmic study of target sets and vectors is a hot research field. We prove an inapproximability result for optimal target sets in edge weighted networks even for complete graphs. Some other hardness and polynomial time results are presented for optimal target vectors and degenerate threshold assignments in edge-weighted networks.

Neural Networks · 統計量 · Networking · 泛化誤差 · Learning ·

2024 年 2 月 29 日

The committee machine: Computational to statistical gaps in learning a two-layers neural network

Benjamin Aubin,Antoine Maillard,Jean Barbier,Florent Krzakala,Nicolas Macris,Lenka Zdeborová

from arxiv, 18 pages + supplementary material, 3 figures. (v2: update to match the published version ; v3: clarification of the caption of Fig. 3)

Heuristic tools from statistical physics have been used in the past to locate the phase transitions and compute the optimal learning and generalization errors in the teacher-student scenario in multi-layer neural networks. In this contribution, we provide a rigorous justification of these approaches for a two-layers neural network model called the committee machine. We also introduce a version of the approximate message passing (AMP) algorithm for the committee machine that allows to perform optimal learning in polynomial time for a large set of parameters. We find that there are regimes in which a low generalization error is information-theoretically achievable while the AMP algorithm fails to deliver it, strongly suggesting that no efficient algorithm exists for those cases, and unveiling a large computational gap.

線性的 · 優化器 · 穩健性 · 約束 · 噪聲 ·

2024 年 2 月 28 日

Linear shrinkage for optimization in high dimensions

Naqi Huang,Nestor Parolya,Thereisa van Essen

In large-scale, data-driven applications, parameters are often only known approximately due to noise and limited data samples. In this paper, we focus on high-dimensional optimization problems with linear constraints under uncertain conditions. To find high quality solutions for which the violation of the true constraints is limited, we develop a linear shrinkage method that blends random matrix theory and robust optimization principles. It aims to minimize the Frobenius distance between the estimated and the true parameter matrix, especially when dealing with a large and comparable number of constraints and variables. This data-driven method excels in simulations, showing superior noise resilience and more stable performance in both obtaining high quality solutions and adhering to the true constraints compared to traditional robust optimization. Our findings highlight the effectiveness of our method in improving the robustness and reliability of optimization in high-dimensional, data-driven scenarios.

貪心 · 模態 · MoDELS · 學成 · 泛化理論 ·

2022 年 2 月 10 日

Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks

Nan Wu,Stanis?aw Jastrz?bski,Kyunghyun Cho,Krzysztof J. Geras

We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain on the accuracy when the model has access to it in addition to another modality. We refer to this gain as the conditional utilization rate. In the experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.

FCN · 全卷積網絡 · 3D · 級聯 · MoDELS ·

2018 年 3 月 20 日

An application of cascaded 3D fully convolutional networks for medical image segmentation

Holger R. Roth,Hirohisa Oda,Xiangrong Zhou,Natsuki Shimizu,Ying Yang,Yuichiro Hayashi,Masahiro Oda,Michitaka Fujiwara,Kazunari Misawa,Kensaku Mori

from arxiv, Preprint accepted for publication in Computerized Medical Imaging and Graphics. Substantial extension of arXiv:1704.06382; Corrected references to figure numbers in this version

Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of several anatomical structures (ranging from the large organs to thin vessels) can achieve competitive segmentation results, while avoiding the need for handcrafting features or training class-specific models. To this end, we propose a two-stage, coarse-to-fine approach that will first use a 3D FCN to roughly define a candidate region, which will then be used as input to a second 3D FCN. This reduces the number of voxels the second FCN has to classify to ~10% and allows it to focus on more detailed segmentation of the organs and vessels. We utilize training and validation sets consisting of 331 clinical CT images and test our models on a completely unseen data collection acquired at a different hospital that includes 150 CT scans, targeting three anatomical organs (liver, spleen, and pancreas). In challenging organs such as the pancreas, our cascaded approach improves the mean Dice score from 68.5 to 82.2%, achieving the highest reported average score on this dataset. We compare with a 2D FCN method on a separate dataset of 240 CT scans with 18 classes and achieve a significantly higher performance in small organs and vessels. Furthermore, we explore fine-tuning our models to different datasets. Our experiments illustrate the promise and robustness of current 3D FCN based semantic segmentation of medical images, achieving state-of-the-art results. Our code and trained models are available for download: //github.com/holgerroth/3Dunet_abdomen_cascade.