Unraveling the emergence of collective learning in systems of coupled artificial neural networks points to broader implications for machine learning, neuroscience, and society. Here we introduce a minimal model that condenses several recent decentralized algorithms by considering a competition between two terms: the local learning dynamics in the parameters of each neural network unit, and a diffusive coupling among units that tends to homogenize the parameters of the ensemble. We derive an effective theory for linear networks to show that the coarse-grained behavior of our system is equivalent to a deformed Ginzburg-Landau model with quenched disorder. This framework predicts depth-dependent disorder-order-disorder phase transitions in the parameters' solutions that reveal a depth-delayed onset of a collective learning phase and a low-rank microscopic learning path. We validate the theory in coupled ensembles of realistic neural networks trained on the MNIST dataset under privacy constraints. Interestingly, experiments confirm that individual networks -- trained on private data -- can fully generalize to unseen data classes when the collective learning phase emerges. Our work establishes the physics of collective learning and contributes to the mechanistic interpretability of deep learning in decentralized settings.
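The competition between the two terms described above can be illustrated with a toy simulation. This is only a minimal sketch under loud assumptions, not the paper's model: each unit is a linear regressor (not a deep network), the coupling is all-to-all toward the ensemble mean, and the names `simulate`, `spread`, `coupling`, and all numerical settings are hypothetical choices made for illustration.

```python
import numpy as np

def simulate(coupling, n_units=8, dim=5, steps=500, lr=0.05, seed=0):
    """Train n_units linear regressors on private data with diffusive coupling."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=dim)
    # Each unit only sees its own small, noisy private data set (assumption).
    X = rng.normal(size=(n_units, 20, dim))
    y = X @ w_true + 0.5 * rng.normal(size=(n_units, 20))
    W = rng.normal(size=(n_units, dim))  # one parameter vector per unit
    for _ in range(steps):
        # Local learning term: gradient of each unit's mean-squared error.
        resid = np.einsum("uij,uj->ui", X, W) - y
        grad = np.einsum("uij,ui->uj", X, resid) / X.shape[1]
        # Diffusive coupling term: pull every unit toward the ensemble mean.
        diffusion = W.mean(axis=0, keepdims=True) - W
        W = W + lr * (-grad + coupling * diffusion)
    return W, w_true

def spread(W):
    """Mean distance of the units' parameters from the ensemble average."""
    return float(np.linalg.norm(W - W.mean(axis=0), axis=1).mean())

W_free, _ = simulate(coupling=0.0)
W_coupled, _ = simulate(coupling=5.0)
print("spread without coupling:", spread(W_free))
print("spread with coupling:   ", spread(W_coupled))
```

In this sketch, a larger `coupling` homogenizes the parameters across units, while `coupling=0.0` leaves each unit at the solution of its own private data.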

Related content

Learning

Analysis · Approximation · Functional · Lecture notes · Estimation/Estimator

December 31, 2023

Convergence analysis of Laguerre approximations for analytic functions

Haiyong Wang

from arxiv, Math. Comp., to appear

Laguerre spectral approximations play an important role in the development of efficient algorithms for problems in unbounded domains. In this paper, we present a comprehensive convergence rate analysis of Laguerre spectral approximations for analytic functions. By exploiting contour integral techniques from complex analysis, we prove that Laguerre projection and interpolation methods of degree $n$ converge at the root-exponential rate $O(\exp(-2\rho\sqrt{n}))$ with $\rho>0$ when the underlying function is analytic inside and on a parabola with focus at the origin and vertex at $z=-\rho^2$. As far as we know, this is the first rigorous proof of root-exponential convergence of Laguerre approximations for analytic functions. Several important applications of our analysis are also discussed, including Laguerre spectral differentiations, Gauss-Laguerre quadrature rules, the scaling factor and the Weeks method for the inversion of Laplace transform, and some sharp convergence rate estimates are derived. Numerical experiments are presented to verify the theoretical results.
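Gauss-Laguerre quadrature, one of the applications mentioned above, can be tried numerically with NumPy's `numpy.polynomial.laguerre.laggauss`. The sketch below is illustrative only and does not reproduce the paper's estimates. The test function f(x) = 1/(1+x) is an assumption chosen because its only singularity is a pole at z = -1, and the reference value is the Euler-Gompertz constant; the helper name `gauss_laguerre` is hypothetical.

```python
import numpy as np
from numpy.polynomial.laguerre import laggauss

def gauss_laguerre(f, n):
    """Approximate the integral of exp(-x) * f(x) over [0, inf) with n nodes."""
    nodes, weights = laggauss(n)
    return float(np.sum(weights * f(nodes)))

def f(x):
    # Analytic everywhere except for a pole at x = -1.
    return 1.0 / (1.0 + x)

# Integral of exp(-x) / (1 + x) over [0, inf): the Euler-Gompertz constant.
REFERENCE = 0.596347362323194

for n in (5, 10, 20, 40):
    error = abs(gauss_laguerre(f, n) - REFERENCE)
    print(f"n = {n:2d}  abs error = {error:.3e}")
```

Plotting the logarithm of the error against the square root of n is one way to compare the observed decay with a root-exponential rate.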

Learning · Networking · Neural Networks · Stochastic dynamical systems · Discretization

December 30, 2023

Learning effective dynamics from data-driven stochastic systems

Lingyu Feng,Ting Gao,Min Dai,Jinqiao Duan

Multiscale stochastic dynamical systems have been widely applied to a variety of scientific and engineering problems because they can describe complex phenomena in many real-world applications. This work investigates the effective dynamics of slow-fast stochastic dynamical systems. Given observation data over a short time period from an unknown slow-fast stochastic system, we propose a novel algorithm, including a neural network called Auto-SDE, to learn the invariant slow manifold. Our approach captures the evolutionary nature of a series of time-dependent autoencoder neural networks with the loss constructed from a discretized stochastic differential equation. Numerical experiments under various evaluation metrics also show the algorithm to be accurate, stable and effective.

SimPLe · Discrete mathematics

December 29, 2023

On the complexity of a maintenance problem for hierarchical systems

Andreas S. Schulz,Claudio Telha

We prove that a maintenance problem on frequency-constrained maintenance jobs with a hierarchical structure is integer-factorization hard. This result holds even on simple systems with just two components to maintain. As a corollary, we provide a first hardness result for Levi et al.'s modular maintenance scheduling problem (Naval Research Logistics 61, 472-488, 2014).

Reducible · Rank · Empirical risk · Optimizer · Regularization term

December 28, 2023

A randomized algorithm to solve reduced rank operator regression

Giacomo Turri,Vladimir Kostic,Pietro Novelli,Massimiliano Pontil

from arxiv, 19 pages, 3 figures, 1 table

We present and analyze an algorithm designed for addressing vector-valued regression problems involving possibly infinite-dimensional input and output spaces. The algorithm is a randomized adaptation of reduced rank regression, a technique to optimally learn a low-rank vector-valued function (i.e. an operator) between sampled data via regularized empirical risk minimization with rank constraints. We propose Gaussian sketching techniques both for the primal and dual optimization objectives, yielding Randomized Reduced Rank Regression (R4) estimators that are efficient and accurate. For each of our R4 algorithms we prove that the resulting regularized empirical risk is, in expectation w.r.t. randomness of a sketch, arbitrarily close to the optimal value when hyper-parameters are properly tuned. Numerical experiments illustrate the tightness of our bounds and show advantages in two distinct scenarios: (i) solving a vector-valued regression problem using synthetic and large-scale neuroscience datasets, and (ii) regressing the Koopman operator of a nonlinear stochastic dynamical system.
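To make the idea of combining reduced rank regression with a Gaussian sketch concrete, here is a small finite-dimensional sketch. It is not the R4 estimator from the paper: it assumes unregularized least squares, uses the classical fact that the rank-constrained solution is the ordinary least-squares solution projected onto the leading right singular vectors of the fitted values, and approximates those vectors with a standard randomized range finder. The function name `reduced_rank_regression` and the `oversample` parameter are hypothetical.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank, oversample=5, seed=0):
    """Rank-constrained least squares using a Gaussian sketch (illustrative)."""
    rng = np.random.default_rng(seed)
    # Unconstrained least-squares coefficients, shape (p, q).
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    fitted = X @ B_ols  # fitted values, shape (n, q)
    # Gaussian sketch: random combinations of the rows of the fitted values.
    omega = rng.normal(size=(fitted.shape[0], rank + oversample))
    Q, _ = np.linalg.qr(fitted.T @ omega)  # orthonormal basis in output space
    # Small SVD inside the sketched subspace.
    _, _, Vt = np.linalg.svd(fitted @ Q, full_matrices=False)
    V_r = Q @ Vt[:rank].T  # approximate leading right singular vectors, (q, rank)
    # Project the least-squares solution onto the rank-r output subspace.
    return B_ols @ V_r @ V_r.T

# Usage on synthetic, noise-free data with a rank-2 coefficient matrix.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
B_true = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 4))
Y = X @ B_true
B_hat = reduced_rank_regression(X, Y, rank=2)
print("max abs error:", np.abs(B_hat - B_true).max())
```

The sketch only needs products with the fitted-value matrix and an SVD of a thin matrix, which is the usual motivation for randomized variants when the output dimension is large.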

Class · Approximation · Kernelization · MoDELS · Lecture notes

December 28, 2023

Systems of nonlocal balance laws for dense multilane vehicular traffic

Aekta Aggarwal,Helge Holden,Ganesh Vaidya

We discuss a class of coupled systems of nonlocal balance laws modeling multilane traffic, with the nonlocality present in both convective and source terms. The uniqueness and existence of the entropy solution are proven via doubling-of-variables arguments and convergent finite volume approximations, respectively. The numerical approximations are proven to converge to the unique entropy solution of the system at the rate $\sqrt{\Delta t}$. The applicability of the proven theory to a general class of systems of nonlocal balance laws, coupled strongly through the convective part and weakly through the source part, is also indicated. Numerical simulations illustrating the theory and the behavior of the entropy solution as the support of the kernel goes to zero (nonlocal to local limit) are shown.

MoDELS · Learning · Sparse · Machine Learning · Machine learning models

December 28, 2023

Towards provably efficient quantum algorithms for large-scale machine-learning models

Junyu Liu,Minzhao Liu,Jin-Peng Liu,Ziyu Ye,Yunfei Wang,Yuri Alexeev,Jens Eisert,Liang Jiang

from arxiv, 7+40 pages, 3+10 figures, replaced with final version providing substantial detail

Large machine learning models are revolutionary technologies of artificial intelligence whose bottlenecks include huge computational expenses, power, and time used both in the pre-training and fine-tuning process. In this work, we show that fault-tolerant quantum computing could possibly provide provably efficient resolutions for generic (stochastic) gradient descent algorithms, scaling as O(T^2 polylog(n)), where n is the size of the models and T is the number of iterations in the training, as long as the models are both sufficiently dissipative and sparse, with small learning rates. Based on earlier efficient quantum algorithms for dissipative differential equations, we find and prove that similar algorithms work for (stochastic) gradient descent, the primary algorithm for machine learning. In practice, we benchmark instances of large machine learning models from 7 million to 103 million parameters. We find that, in the context of sparse training, a quantum enhancement is possible at the early stage of learning after model pruning, motivating a sparse parameter download and re-upload scheme. Our work shows solidly that fault-tolerant quantum algorithms could potentially contribute to most state-of-the-art, large-scale machine-learning problems.

GROUP · PCA · Functional · MoDELS · Cluster

December 27, 2023

A Bayesian functional PCA model with multilevel partition priors for group studies in neuroscience

Nicolò Margaritella,Vanda Inácio,Ruth King

The statistical analysis of group studies in neuroscience is particularly challenging due to the complex spatio-temporal nature of the data, its multiple levels and the inter-individual variability in brain responses. In this respect, traditional ANOVA-based studies and linear mixed effects models typically provide only limited exploration of the dynamics of the group brain activity and the variability of the individual responses, potentially leading to overly simplistic conclusions and/or missing more intricate patterns. In this study we propose a novel method based on functional Principal Components Analysis and Bayesian model-based clustering to simultaneously assess group effects and individual deviations over the most important temporal features in the data. This method provides a thorough exploration of group differences and individual deviations in neuroscientific group studies without compromising on the spatio-temporal nature of the data. By means of a simulation study we demonstrate that the proposed model returns correct classification in different clustering scenarios under low and high noise levels in the data. Finally, we consider a case study using Electroencephalogram data recorded during an object recognition task, where our approach provides new insights into the underlying brain mechanisms generating the data and their variability.

Generalization theory · Black box · Learned · INFORMS · Supervised learning algorithms

October 4, 2021

Information-theoretic generalization bounds for black-box learning algorithms

Hrayr Harutyunyan,Maxim Raginsky,Greg Ver Steeg,Aram Galstyan

from arxiv, NeurIPS 2021

We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.

Learned · Deep learning · Continuity · Bayesian inference · Networking

December 20, 2020

Recent advances in deep learning theory

Fengxiang He,Dacheng Tao

Deep learning is usually described as an experiment-driven field under continuous criticism for lacking theoretical foundations. This problem has been partially addressed by a large volume of literature which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized in six groups: (1) complexity and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamic systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drive the trajectories of the dynamic systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns in ethics and security and their relationships with generalizability.

Neural Networks · Optimizer · Networks · Local minimum · Networking

December 19, 2019

Optimization for deep learning: theory and algorithms

Ruoyu Sun

from arxiv, 38 pages of main body; 5 pages of appendix; 12 pages of references

When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of undesirable spectrum, and then discuss practical solutions including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods and distributed methods, and theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, lottery ticket hypothesis and infinite-width analysis.