亚洲黄色网站不卡免费_黄色网站欧美黄色大片_亚洲国产极品专区在线观看_国产的又大又硬又爽又黄的视频_天堂在线无码在线_黄色视频免费观看在线_精品一久久香蕉国产线看播放

from arxiv, This paper is submitted to Phys. Rev. X. It's a physicist's study that focus on a new paradigm for deep learning networks. We would have liked to choose other keywords for arXiv to reach a wider community, but don't have the rights to do so

We design and analyze a new paradigm for building supervised learning networks, driven only by local optimization rules without relying on a global error function. Traditional neural networks with a fixed topology are made up of identical nodes and derive their expressiveness from an appropriate adjustment of connection weights. In contrast, our network stores new knowledge in the nodes accurately and instantaneously, in the form of a lookup table. Only then is some of this information structured and incorporated into the network geometry. The training error is initially zero by construction and remains so throughout the network topology transformation phase. The latter involves a small number of local topological transformations, such as splitting or merging of nodes and adding binary connections between them. The choice of operations to be carried out is only driven by optimization of expressivity at the local scale. What we are primarily looking for in a learning network is its ability to generalize, i.e. its capacity to correctly answer questions for which it has never learned the answers. We show on numerous examples of classification tasks that the networks generated by our algorithm systematically reach such a state of perfect generalization when the number of learned examples becomes sufficiently large. We report on the dynamics of the change of state and show that it is abrupt and has the distinctive characteristics of a first order phase transition, a phenomenon already observed for traditional learning networks and known as grokking. In addition to proposing a non-potential approach for the construction of learning networks, our algorithm makes it possible to rethink the grokking transition in a new light, under which acquisition of training data and topological structuring of data are completely decoupled phenomena.

相關內容

Networking

關注 22

Networking：IFIP International Conferences on Networking。 Explanation：國際網絡會議。 Publisher：IFIP。 SIT：

Learning · 深度學習 · 目標函數 · 泛函 · 全局優化 ·

2024 年 11 月 11 日

Permutative redundancy and uncertainty of the objective in deep learning

Vacslav Glukhov

from arxiv, 22 pages, 3 figures

Implications of uncertain objective functions and permutative symmetry of traditional deep learning architectures are discussed. It is shown that traditional architectures are polluted by an astronomical number of equivalent global and local optima. Uncertainty of the objective makes local optima unattainable, and, as the size of the network grows, the global optimization landscape likely becomes a tangled web of valleys and ridges. Some remedies which reduce or eliminate ghost optima are discussed including forced pre-pruning, re-ordering, ortho-polynomial activations, and modular bio-inspired architectures.

Networking · binary · Performer · 神經元 · 模型評估 ·

2024 年 11 月 9 日

Polariton lattices as binarized neuromorphic networks

Evgeny Sedov,Alexey Kavokin

We introduce a novel neuromorphic network architecture based on a lattice of exciton-polariton condensates, intricately interconnected and energized through non-resonant optical pumping. The network employs a binary framework, where each neuron, facilitated by the spatial coherence of pairwise coupled condensates, performs binary operations. This coherence, emerging from the ballistic propagation of polaritons, ensures efficient, network-wide communication. The binary neuron switching mechanism, driven by the nonlinear repulsion through the excitonic component of polaritons, offers computational efficiency and scalability advantages over continuous weight neural networks. Our network enables parallel processing, enhancing computational speed compared to sequential or pulse-coded binary systems. The system's performance was evaluated using diverse datasets, including the MNIST dataset for image recognition and the Speech Commands dataset for voice recognition tasks. In both scenarios, the proposed system demonstrates the potential to outperform existing polaritonic neuromorphic systems. For image recognition, this is evidenced by an impressive predicted classification accuracy of up to 97.5%. In voice recognition, the system achieved a classification accuracy of about 68\% for the ten-class subset, surpassing the performance of conventional benchmark, the Hidden Markov Model with Gaussian Mixture Model.

GROUP · 優化器 · Facebook AI Research · Learning · Pivotal（公司） ·

2024 年 11 月 8 日

Group-blind optimal transport to group parity and its constrained variants

Quan Zhou,Jakub Marecek

Fairness holds a pivotal role in the realm of machine learning, particularly when it comes to addressing groups categorised by protected attributes, e.g., gender, race. Prevailing algorithms in fair learning predominantly hinge on accessibility or estimations of these protected attributes, at least in the training process. We design a single group-blind projection map that aligns the feature distributions of both groups in the source data, achieving (demographic) group parity, without requiring values of the protected attribute for individual samples in the computation of the map, as well as its use. Instead, our approach utilises the feature distributions of the privileged and unprivileged groups in a boarder population and the essential assumption that the source data are unbiased representation of the population. We present numerical results on synthetic data and real data.

Neural Networks · Networking · 求逆 · 樣本 · 樣本復雜度 ·

2024 年 11 月 8 日

The sampling complexity of learning invertible residual neural networks

Yuanyuan Li,Philipp Grohs,Philipp Petersen

In recent work it has been shown that determining a feedforward ReLU neural network to within high uniform accuracy from point samples suffers from the curse of dimensionality in terms of the number of samples needed. As a consequence, feedforward ReLU neural networks are of limited use for applications where guaranteed high uniform accuracy is required. We consider the question of whether the sampling complexity can be improved by restricting the specific neural network architecture. To this end, we investigate invertible residual neural networks which are foundational architectures in deep learning and are widely employed in models that power modern generative methods. Our main result shows that the residual neural network architecture and invertibility do not help overcome the complexity barriers encountered with simpler feedforward architectures. Specifically, we demonstrate that the computational complexity of approximating invertible residual neural networks from point samples in the uniform norm suffers from the curse of dimensionality. Similar results are established for invertible convolutional Residual neural networks.

分離的 · 簇 · 泛函 · Better · 代價 ·

2024 年 11 月 7 日

On the cohesion and separability of average-link for hierarchical agglomerative clustering

Eduardo Sany Laber,Miguel Bastista

from arxiv, Accepted to Neurips 2024

Average-link is widely recognized as one of the most popular and effective methods for building hierarchical agglomerative clustering. The available theoretical analyses show that this method has a much better approximation than other popular heuristics, as single-linkage and complete-linkage, regarding variants of Dasgupta's cost function [STOC 2016]. However, these analyses do not separate average-link from a random hierarchy and they are not appealing for metric spaces since every hierarchical clustering has a 1/2 approximation with regard to the variant of Dasgupta's function that is employed for dissimilarity measures [Moseley and Yang 2020]. In this paper, we present a comprehensive study of the performance of average-link in metric spaces, regarding several natural criteria that capture separability and cohesion and are more interpretable than Dasgupta's cost function and its variants. We also present experimental results with real datasets that, together with our theoretical analyses, suggest that average-link is a better choice than other related methods when both cohesion and separability are important goals.

估計/估計量 · 核化 · 損失 · 損失函數（機器學習） · 泛函 ·

2024 年 11 月 7 日

An efficient likelihood-free Bayesian inference method based on sequential neural posterior estimation

Yifei Xiong,Xiliang Yang,Sanguo Zhang,Zhijian He

from arxiv, 28 pages, 9 figures

Sequential neural posterior estimation (SNPE) techniques have been recently proposed for dealing with simulation-based models with intractable likelihoods. Unlike approximate Bayesian computation, SNPE techniques learn the posterior from sequential simulation using neural network-based conditional density estimators by minimizing a specific loss function. The SNPE method proposed by Lueckmann et al. (2017) used a calibration kernel to boost the sample weights around the observed data, resulting in a concentrated loss function. However, the use of calibration kernels may increase the variances of both the empirical loss and its gradient, making the training inefficient. To improve the stability of SNPE, this paper proposes to use an adaptive calibration kernel and several variance reduction techniques. The proposed method greatly speeds up the process of training and provides a better approximation of the posterior than the original SNPE method and some existing competitors as confirmed by numerical experiments. We also manage to demonstrate the superiority of the proposed method for a high-dimensional model with real-world dataset.

Networking · Neural Networks · ReLU · SimPLe · 情景 ·

2024 年 11 月 7 日

Topological obstruction to the training of shallow ReLU neural networks

Marco Nurisso,Pierrick Leroy,Francesco Vaccarino

from arxiv, 23 pages, 5 figures, Conference on Neural Information Processing Systems (NeurIPS 2024)

Studying the interplay between the geometry of the loss landscape and the optimization trajectories of simple neural networks is a fundamental step for understanding their behavior in more complex settings. This paper reveals the presence of topological obstruction in the loss landscape of shallow ReLU neural networks trained using gradient flow. We discuss how the homogeneous nature of the ReLU activation function constrains the training trajectories to lie on a product of quadric hypersurfaces whose shape depends on the particular initialization of the network's parameters. When the neural network's output is a single scalar, we prove that these quadrics can have multiple connected components, limiting the set of reachable parameters during training. We analytically compute the number of these components and discuss the possibility of mapping one to the other through neuron rescaling and permutation. In this simple setting, we find that the non-connectedness results in a topological obstruction, which, depending on the initialization, can make the global optimum unreachable. We validate this result with numerical experiments.

穩健性 · 約束 · 統計量 · 估計/估計量 · Learning ·

2024 年 11 月 7 日

Distributionally robust risk evaluation with an isotonic constraint

Yu Gui,Rina Foygel Barber,Cong Ma

Statistical learning under distribution shift is challenging when neither prior knowledge nor fully accessible data from the target distribution is available. Distributionally robust learning (DRL) aims to control the worst-case statistical performance within an uncertainty set of candidate distributions, but how to properly specify the set remains challenging. To enable distributional robustness without being overly conservative, in this paper, we propose a shape-constrained approach to DRL, which incorporates prior information about the way in which the unknown target distribution differs from its estimate. More specifically, we assume the unknown density ratio between the target distribution and its estimate is isotonic with respect to some partial order. At the population level, we provide a solution to the shape-constrained optimization problem that does not involve the isotonic constraint. At the sample level, we provide consistency results for an empirical estimator of the target in a range of different settings. Empirical studies on both synthetic and real data examples demonstrate the improved accuracy of the proposed shape-constrained approach.

Networking · Neural Networks · 模型評估 · 試驗 · 內積 ·

2024 年 11 月 7 日

Compatible finite element interpolated neural networks

Santiago Badia,Wei Li,Alberto F. Martín

We extend the finite element interpolated neural network (FEINN) framework from partial differential equations (PDEs) with weak solutions in $H^1$ to PDEs with weak solutions in $H(\textbf{curl})$ or $H(\textbf{div})$. To this end, we consider interpolation trial spaces that satisfy the de Rham Hilbert subcomplex, providing stable and structure-preserving neural network discretisations for a wide variety of PDEs. This approach, coined compatible FEINNs, has been used to accurately approximate the $H(\textbf{curl})$ inner product. We numerically observe that the trained network outperforms finite element solutions by several orders of magnitude for smooth analytical solutions. Furthermore, to showcase the versatility of the method, we demonstrate that compatible FEINNs achieve high accuracy in solving surface PDEs such as the Darcy equation on a sphere. Additionally, the framework can integrate adaptive mesh refinements to effectively solve problems with localised features. We use an adaptive training strategy to train the network on a sequence of progressively adapted meshes. Finally, we compare compatible FEINNs with the adjoint neural network method for solving inverse problems. We consider a one-loop algorithm that trains the neural networks for unknowns and missing parameters using a loss function that includes PDE residual and data misfit terms. The algorithm is applied to identify space-varying physical parameters for the $H(\textbf{curl})$ model problem from partial or noisy observations. We find that compatible FEINNs achieve accuracy and robustness comparable to, if not exceeding, the adjoint method in these scenarios.

優化器 · 資源管理 · 非線性規劃 · Extensibility · Performer ·

2024 年 11 月 7 日

Joint wireless and computing resource management with optimal slice selection in in-network-edge metaverse system

Sulaiman Muhammad Rashid,Ibrahim Aliyu,Abubakar Isah,Jihoon Lee,Sangwon Oh,Minsoo Hahn,Jinsul Kim

This paper presents an approach to joint wireless and computing resource management in slice-enabled metaverse networks, addressing the challenges of inter-slice and intra-slice resource allocation in the presence of in-network computing. We formulate the problem as a mixed-integer nonlinear programming (MINLP) problem and derive an optimal solution using standard optimization techniques. Through extensive simulations, we demonstrate that our proposed method significantly improves system performance by effectively balancing the allocation of radio and computing resources across multiple slices. Our approach outperforms existing benchmarks, particularly in scenarios with high user demand and varying computational tasks.