
Nonlinear parametric systems are widely used to model nonlinear dynamics in science and engineering. Bifurcation analysis of these nonlinear systems over the parameter space is commonly used to study the solution structure, such as the number of solutions and their stability. In this paper, we develop a new machine learning approach to compute bifurcations via so-called equation-driven neural networks (EDNNs). The EDNNs consist of a two-step optimization: the first step approximates the solution function of the parameter by training on empirical solution data; the second step computes bifurcations using the approximated neural network obtained in the first step. Both a theoretical convergence analysis and numerical implementations on several examples are presented to demonstrate the feasibility of the proposed method.
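
To make the two-step idea concrete, here is a minimal, hypothetical sketch on the toy system f(x, p) = x^2 - p (fold bifurcation at p = 0), using a small PyTorch network as the surrogate solution function; it illustrates the workflow only and is not the paper's implementation.

```python
# Minimal EDNN-style sketch on a hypothetical toy problem, not the paper's code.
# Step 1: fit a network x(p) to empirical solutions of f(x, p) = x^2 - p = 0.
# Step 2: scan the surrogate for parameters where df/dx becomes singular
#         (here df/dx = 2x), signalling a candidate fold bifurcation (true location: p = 0).
import torch
import torch.nn as nn

# Step 1: train on empirical solution data from one branch, x(p) = sqrt(p).
p_train = torch.linspace(0.01, 1.0, 256).unsqueeze(1)
x_train = torch.sqrt(p_train)
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    loss = ((net(p_train) - x_train) ** 2).mean()
    loss.backward()
    opt.step()

# Step 2: locate parameters where the Jacobian of f in x vanishes along the learned branch.
p_grid = torch.linspace(0.0, 1.0, 1001).unsqueeze(1)
with torch.no_grad():
    x_grid = net(p_grid)
f_x = 2.0 * x_grid                         # df/dx for the toy system
p_star = p_grid[f_x.abs().argmin()]
print(f"candidate bifurcation near p = {p_star.item():.3f} (exact value: 0.0)")
```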

Related Content

Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). Neural Networks provides a forum to develop and nurture an international community of scholars and practitioners interested in all aspects of neural networks and related approaches to computational intelligence. Neural Networks welcomes submissions of high-quality papers that contribute to the full range of neural networks research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analysis, to systems engineering and technological applications that make substantial use of neural network concepts and techniques. This unique and broad scope promotes the exchange of ideas between biological and technological research and helps foster the development of an interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the Neural Networks editorial board represents expertise in psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles are published in one of five sections: cognitive science, neuroscience, learning systems, mathematical and computational analysis, and engineering and applications.

Given a single trajectory of a dynamical system, we analyze the performance of the nonparametric least squares estimator (LSE). More precisely, we give nonasymptotic expected $l^2$-distance bounds between the LSE and the true regression function, where the expectation is evaluated on a fresh, counterfactual trajectory. We leverage recently developed information-theoretic methods to establish the optimality of the LSE for nonparametric hypothesis classes in terms of supremum norm metric entropy and a subgaussian parameter. Next, we relate this subgaussian parameter to the stability of the underlying process using notions from dynamical systems theory. When combined, these developments lead to rate-optimal error bounds that scale as $T^{-1/(2+q)}$ for suitably stable processes and hypothesis classes with metric entropy growth of order $\delta^{-q}$. Here, $T$ is the length of the observed trajectory, $\delta \in \mathbb{R}_+$ is the packing granularity, and $q\in (0,2)$ is a complexity term. Finally, we specialize our results to a number of scenarios of practical interest, such as Lipschitz dynamics, generalized linear models, and dynamics described by functions in certain classes of Reproducing Kernel Hilbert Spaces (RKHS).
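
As a concrete illustration (not the paper's setting), the following sketch fits a least squares estimator over a simple polynomial dictionary from a single trajectory of a toy stable scalar system and evaluates its error on a fresh trajectory; the paper's analysis covers much richer hypothesis classes such as RKHS balls.

```python
# Toy sketch: LSE from one trajectory of a stable scalar system x_{t+1} = f*(x_t) + noise.
import numpy as np

rng = np.random.default_rng(0)
T = 2000
f_star = lambda u: 0.8 * np.tanh(u)                 # stable ground-truth dynamics

x = np.zeros(T + 1)                                 # a single observed trajectory
for t in range(T):
    x[t + 1] = f_star(x[t]) + 0.1 * rng.standard_normal()

def phi(u, degree=5):
    # Polynomial dictionary standing in for a richer hypothesis class.
    return np.vander(u, degree + 1, increasing=True)

# LSE: minimize sum_t (x_{t+1} - <theta, phi(x_t)>)^2.
theta, *_ = np.linalg.lstsq(phi(x[:-1]), x[1:], rcond=None)

# Evaluate on a fresh, counterfactual trajectory, as in the expected l^2 risk.
x_new = np.zeros(T + 1)
for t in range(T):
    x_new[t + 1] = f_star(x_new[t]) + 0.1 * rng.standard_normal()
err = np.mean((phi(x_new[:-1]) @ theta - f_star(x_new[:-1])) ** 2)
print(f"squared error of the LSE against f* on a fresh trajectory: {err:.5f}")
```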

As they have a vital effect on social decision-making, AI algorithms should be not only accurate but also fair. Among the various approaches to fair AI, learning fair representations (LFR), whose goal is to find a representation that is fair with respect to sensitive variables such as gender and race, has received much attention. For LFR, an adversarial training scheme is typically employed, as in generative adversarial network (GAN)-type algorithms. The choice of discriminator, however, is made heuristically and without justification. In this paper, we propose a new adversarial training scheme for LFR in which the integral probability metric (IPM) with a specific parametric family of discriminators is used. The most notable result for the proposed LFR algorithm is a theoretical guarantee on the fairness of the final prediction model, which has not been established before. That is, we derive theoretical relations between the fairness of the representation and the fairness of the prediction model built on top of the representation (i.e., using the representation as the input). Moreover, numerical experiments show that our proposed LFR algorithm is computationally lighter and more stable, and that the final prediction model is competitive with or superior to those of other LFR algorithms that use more complex discriminators.
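
The following sketch illustrates IPM-regularized fair representation learning in the simplest hypothetical case where the discriminator family consists of linear functions with unit norm, for which the IPM between the two sensitive groups reduces to the distance between their mean representations; the discriminator family and guarantees in the paper are more general.

```python
# Toy sketch of IPM-regularized fair representation learning (unit-norm linear critics).
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(512, 10)                                  # features
y = (X[:, 0] > 0).float().unsqueeze(1)                    # labels
s = (X[:, 1] + 0.3 * torch.randn(512)) > 0                # binary sensitive attribute

encoder = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 8))
classifier = nn.Linear(8, 1)
opt = torch.optim.Adam(list(encoder.parameters()) + list(classifier.parameters()), lr=1e-3)

lam = 1.0                                                 # fairness/accuracy trade-off
for _ in range(500):
    opt.zero_grad()
    z = encoder(X)
    pred_loss = nn.functional.binary_cross_entropy_with_logits(classifier(z), y)
    # IPM over unit-norm linear discriminators = distance between group means of z.
    ipm = (z[s].mean(dim=0) - z[~s].mean(dim=0)).norm()
    (pred_loss + lam * ipm).backward()
    opt.step()
```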

Despite their overwhelming capacity to overfit, deep neural networks trained by specific optimization algorithms tend to generalize well to unseen data. Recently, researchers have explained this by investigating the implicit regularization effect of optimization algorithms. A notable advance is the work of Lyu & Li (2019), which proves that gradient descent (GD) maximizes the margin of homogeneous deep neural networks. Beyond GD, adaptive algorithms such as AdaGrad, RMSProp, and Adam are popular owing to their rapid training, yet theoretical guarantees for the generalization of adaptive optimization algorithms are still lacking. In this paper, we study the implicit regularization of adaptive optimization algorithms when they optimize the logistic loss on homogeneous deep neural networks. We prove that adaptive algorithms that adopt an exponential moving average strategy in the conditioner (such as Adam and RMSProp) can maximize the margin of the neural network, while AdaGrad, which directly sums historical squared gradients in the conditioner, cannot. This indicates an advantage in generalization of the exponential moving average strategy in the design of the conditioner. Technically, we provide a unified framework for analyzing the convergent direction of adaptive optimization algorithms by constructing a novel adaptive gradient flow and surrogate margin. Our experiments support the theoretical findings on the convergent direction of adaptive optimization algorithms.
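
A small illustration of the quantity at stake: for a bias-free ReLU network (positively homogeneous of order L = 2 in its parameters) trained with Adam on the logistic loss, one can monitor the normalized margin min_i y_i f(x_i) / ||θ||^L over training. This toy sketch is not the paper's experiment, only a way to make the notion concrete.

```python
# Toy sketch: normalized margin of a bias-free (homogeneous) ReLU net trained with Adam.
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(128, 2)
y = torch.where(X[:, 0] * X[:, 1] > 0, 1.0, -1.0)          # +/-1 labels

net = nn.Sequential(nn.Linear(2, 64, bias=False), nn.ReLU(), nn.Linear(64, 1, bias=False))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
L = 2                                                      # order of homogeneity (two layers)

for step in range(5001):
    opt.zero_grad()
    out = net(X).squeeze(1)
    loss = nn.functional.softplus(-y * out).mean()         # logistic loss
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            norm = torch.sqrt(sum((p ** 2).sum() for p in net.parameters()))
            margin = (y * net(X).squeeze(1)).min() / norm ** L
        print(f"step {step:5d}   normalized margin {margin.item():.4f}")
```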

Since deep neural networks were developed, they have made huge contributions to everyday life, and machine learning can now offer more consistent advice than humans in many aspects of daily life. Despite this achievement, however, the design and training of neural networks remain challenging and unpredictable procedures. To lower the technical threshold for common users, automated hyper-parameter optimization (HPO) has become a popular topic in both academia and industry. This paper provides a review of the most essential topics in HPO. The first section introduces the key hyper-parameters related to model training and structure, and discusses their importance and methods for defining their value ranges. The review then focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy, especially for deep learning networks. It next reviews major services and toolkits for HPO, comparing their support for state-of-the-art search algorithms, compatibility with major deep learning frameworks, and extensibility for new modules designed by users. The paper concludes with the problems that arise when HPO is applied to deep learning, a comparison of optimization algorithms, and prominent approaches for model evaluation with limited computational resources.
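
For readers new to the topic, the sketch below shows the simplest HPO strategy, random search over a small hypothetical search space with a placeholder objective; the surveyed toolkits implement far more sophisticated searchers (Bayesian optimization, Hyperband, and so on).

```python
# Random search over a small hypothetical search space with a placeholder objective.
import random

random.seed(0)
search_space = {
    "lr": lambda: 10 ** random.uniform(-5, -1),            # log-uniform learning rate
    "batch_size": lambda: random.choice([32, 64, 128, 256]),
    "num_layers": lambda: random.randint(2, 6),
}

def train_and_evaluate(config):
    # Placeholder standing in for a real training run; returns a validation score.
    return -abs(config["lr"] - 1e-3) - 0.01 * config["num_layers"]

best_score, best_config = float("-inf"), None
for _ in range(50):                                        # trial budget
    config = {name: sample() for name, sample in search_space.items()}
    score = train_and_evaluate(config)
    if score > best_score:
        best_score, best_config = score, config
print(best_config)
```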

When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of undesirable spectrum, and then discuss practical solutions including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods and distributed methods, and theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, lottery ticket hypothesis and infinite-width analysis.
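
As a small illustration of the initialization issue discussed above (not taken from the article), the following sketch compares activations after many ReLU layers under a naive 1/n weight variance versus the 2/n (He) scaling; the former shrinks the signal toward zero while the latter keeps its scale roughly constant.

```python
# Signal propagation through deep ReLU stacks under two initialization scales.
import numpy as np

rng = np.random.default_rng(0)
width, depth = 512, 30
h_naive = rng.standard_normal((256, width))
h_he = h_naive.copy()

for _ in range(depth):
    W_naive = rng.standard_normal((width, width)) / np.sqrt(width)      # Var = 1/n
    W_he = rng.standard_normal((width, width)) * np.sqrt(2.0 / width)   # Var = 2/n (He)
    h_naive = np.maximum(h_naive @ W_naive, 0.0)
    h_he = np.maximum(h_he @ W_he, 0.0)

print(f"activation std after {depth} ReLU layers: naive {h_naive.std():.2e}, He {h_he.std():.2e}")
```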

Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph structure is available. In practice, however, real-world graphs are often noisy and incomplete, or might not be available at all. In this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.
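
The sketch below conveys the core idea in a simplified, hypothetical form: a Bernoulli distribution over every possible edge is relaxed so that sampled adjacency matrices are differentiable, and the edge logits are trained jointly with a single GCN layer on one objective; the paper instead solves a bilevel program, which this single-level toy omits.

```python
# Toy single-level sketch: relaxed Bernoulli edge sampling + one GCN layer.
import torch
import torch.nn as nn

torch.manual_seed(0)
n_nodes, n_feat, n_class = 30, 16, 3
X = torch.randn(n_nodes, n_feat)
y = torch.randint(0, n_class, (n_nodes,))

edge_logits = nn.Parameter(torch.zeros(n_nodes, n_nodes))   # one Bernoulli per candidate edge
W = nn.Parameter(0.1 * torch.randn(n_feat, n_class))        # GCN layer weights
opt = torch.optim.Adam([edge_logits, W], lr=1e-2)

for _ in range(200):
    opt.zero_grad()
    A = torch.distributions.RelaxedBernoulli(torch.tensor(0.5), logits=edge_logits).rsample()
    A = (A + A.t()) / 2 + torch.eye(n_nodes)                 # symmetrize, add self-loops
    d_inv_sqrt = A.sum(dim=1).rsqrt()
    A_hat = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]    # normalized adjacency
    loss = nn.functional.cross_entropy(A_hat @ X @ W, y)     # one graph convolution
    loss.backward()
    opt.step()
```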

Few-shot learning aims to learn classifiers for new classes with only a few training examples per class. Existing meta-learning or metric-learning based few-shot learning approaches are limited in handling diverse domains with varying numbers of labels. Meta-learning approaches train a meta learner to predict the weights of homogeneous-structured task-specific networks, which requires a uniform number of classes across tasks. Metric-learning approaches learn a single task-invariant metric for all tasks, and they fail when the tasks diverge. We propose to address these limitations with meta metric learning. Our meta metric learning approach consists of task-specific learners, which exploit metric learning to handle flexible label sets, and a meta learner, which discovers good parameters and gradient descent updates to specify the metrics in the task-specific learners. The proposed model is thus able to handle unbalanced classes as well as to generate task-specific metrics. We test our approach in the `$k$-shot $N$-way' few-shot learning setting used in previous work and in a new, realistic few-shot setting with diverse multi-domain tasks and flexible label numbers. Experiments show that our approach attains superior performance in both settings.
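
A minimal sketch of one ingredient, a task-specific metric learner, is given below: a learned linear transform L defines the metric d(x, x') = ||Lx - Lx'|| and queries are assigned to the nearest class prototype under it, with the number of classes free to vary per task. The meta learner that ties such learners together across tasks is not shown; the code is a hypothetical illustration rather than the paper's implementation.

```python
# Toy task-specific metric learner: linear metric + nearest class prototype.
import torch
import torch.nn as nn

def fit_task_metric(support_x, support_y, n_classes, steps=200):
    L = nn.Parameter(torch.eye(support_x.shape[1]))          # metric transform, d = ||Lx - Lx'||
    opt = torch.optim.Adam([L], lr=1e-2)
    for _ in range(steps):
        opt.zero_grad()
        z = support_x @ L.t()
        protos = torch.stack([z[support_y == c].mean(dim=0) for c in range(n_classes)])
        logits = -torch.cdist(z, protos)                      # closer prototype => larger logit
        nn.functional.cross_entropy(logits, support_y).backward()
        opt.step()
    return L.detach()

def predict(L, support_x, support_y, query_x, n_classes):
    z_s, z_q = support_x @ L.t(), query_x @ L.t()
    protos = torch.stack([z_s[support_y == c].mean(dim=0) for c in range(n_classes)])
    return torch.cdist(z_q, protos).argmin(dim=1)             # nearest prototype under the metric
```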

Deep learning is the mainstream technique for many machine learning tasks, including image recognition, machine translation, and speech recognition. It has outperformed conventional methods in various fields and achieved great success. Unfortunately, the understanding of how it works remains unclear, and laying down a theoretical foundation for deep learning is of central importance. In this work, we give a geometric view of deep learning: we argue that the fundamental principle behind its success is the manifold structure in data, namely that natural high-dimensional data concentrate close to a low-dimensional manifold, and that deep learning learns the manifold and the probability distribution on it. We further introduce the rectified linear complexity of a deep neural network, which measures its learning capability, and the rectified linear complexity of an embedding manifold, which describes the difficulty of learning it. We then show that for any deep neural network with a fixed architecture, there exists a manifold that cannot be learned by the network. Finally, we propose to apply optimal mass transportation theory to control the probability distribution in the latent space.
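
As a toy illustration of controlling the latent distribution with optimal transport (not the paper's construction), the one-dimensional case is convenient because the optimal transport map is simply the monotone rearrangement obtained by sorting:

```python
# 1-D optimal transport of latent codes onto a target distribution via sorting.
import numpy as np

rng = np.random.default_rng(0)
latent = rng.normal(size=1000)                # latent codes from an encoder (toy stand-in)
target = rng.uniform(-1, 1, size=1000)        # desired latent distribution

order = np.argsort(latent)                    # monotone rearrangement = 1-D OT map
transported = np.empty_like(latent)
transported[order] = np.sort(target)
print(f"transported latent codes span [{transported.min():.2f}, {transported.max():.2f}]")
```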

Metric learning learns a metric function from training data to calculate the similarity or distance between samples. From the perspective of feature learning, metric learning essentially learns a new feature space via a feature transformation (e.g., a Mahalanobis distance metric). However, traditional metric learning algorithms are shallow: they learn just one metric space (feature transformation). Can we learn a better metric space from the learnt metric space? In other words, can we learn metrics progressively and nonlinearly, as in deep learning, using only existing metric learning algorithms? To this end, we present a hierarchical metric learning scheme and implement an online deep metric learning framework, namely ODML. Specifically, we take one online metric learning algorithm as a metric layer, follow it with a nonlinear layer (i.e., ReLU), and then stack these layers in the manner of deep learning. The proposed ODML enjoys some nice properties: it can indeed learn metrics progressively, and it performs well on several datasets. Various experiments with different settings have been conducted to verify these properties of the proposed ODML.
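
A minimal sketch of the stacking idea, under the assumption of a simple contrastive-style pair loss rather than the paper's online update rules, looks as follows: linear metric layers separated by ReLU nonlinearities, each refining the metric space produced by the previous one.

```python
# Toy ODML-style stack: metric layer -> ReLU -> metric layer -> ReLU -> metric layer.
import torch
import torch.nn as nn

class MetricLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.L = nn.Linear(dim, dim, bias=False)   # linear (Mahalanobis-like) metric transform
    def forward(self, x):
        return self.L(x)

dim = 16
model = nn.Sequential(MetricLayer(dim), nn.ReLU(), MetricLayer(dim), nn.ReLU(), MetricLayer(dim))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

def pair_loss(z1, z2, same):
    d = (z1 - z2).pow(2).sum(dim=1)
    return torch.where(same, d, (1.0 - d).clamp(min=0.0)).mean()   # pull similar, push dissimilar

# One online step on a toy batch of pairs.
x1, x2 = torch.randn(32, dim), torch.randn(32, dim)
same = torch.rand(32) > 0.5
opt.zero_grad()
pair_loss(model(x1), model(x2), same).backward()
opt.step()
```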

In this paper, we study the optimal convergence rates for distributed convex optimization problems over networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m} f_i(\mathbf{x})$ is (i) strongly convex and smooth, (ii) strongly convex, (iii) smooth, or (iv) just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and attains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvement of the condition numbers.
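
To make the dual approach concrete, the following sketch runs plain (unaccelerated) gradient ascent on the dual of a consensus problem with quadratic local objectives on a small path graph; each dual update only uses differences across an edge, so it can be executed in a distributed manner. The paper additionally applies Nesterov acceleration and establishes the optimal rates.

```python
# Dual gradient ascent for consensus: f_i(x) = 0.5 * (x - a_i)^2 on a path graph.
import numpy as np

a = np.array([1.0, 3.0, 5.0, 7.0])             # local data; the consensus optimum is mean(a) = 4
edges = [(0, 1), (1, 2), (2, 3)]               # communication graph
B = np.zeros((len(edges), len(a)))             # edge-node incidence matrix
for k, (i, j) in enumerate(edges):
    B[k, i], B[k, j] = 1.0, -1.0

lam = np.zeros(len(edges))                     # one dual variable per edge
eta = 0.25                                     # dual step size
for _ in range(500):
    x = a - B.T @ lam                          # local minimizers of the Lagrangian (closed form)
    lam = lam + eta * (B @ x)                  # ascent step uses only differences across edges
print(x)                                       # all entries approach the consensus value 4.0
```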
