Accelerated gradient methods are a cornerstone of the large-scale, data-driven optimization problems that arise naturally in machine learning and other data-analysis fields. We introduce a gradient-based optimization framework for achieving acceleration, based on the recently introduced notion of fixed-time stability of dynamical systems. The method is a generalization of simple gradient-based methods, suitably scaled to achieve convergence to the optimizer in fixed time, independent of the initialization. We achieve this by first leveraging a continuous-time framework for designing fixed-time stable dynamical systems, and then providing a consistent discretization strategy such that the equivalent discrete-time algorithm tracks the optimizer in a practically fixed number of iterations. We also provide a theoretical analysis of the convergence behavior of the proposed gradient flows, and of their robustness to additive disturbances, for a range of functions that are strongly convex, strictly convex, or possibly nonconvex but satisfy the Polyak-{\L}ojasiewicz inequality. We also show that the regret bound on the convergence rate is constant by virtue of the fixed-time convergence. The hyperparameters have intuitive interpretations and can be tuned to meet the requirements on the desired convergence rates. We validate the accelerated convergence properties of the proposed schemes on a range of numerical examples against state-of-the-art optimization algorithms. Our work provides insights into developing novel optimization algorithms via discretization of continuous-time flows.
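For illustration, here is a minimal sketch of such a scheme, assuming a fixed-time gradient flow of the form $\dot{x} = -c_1 \nabla f(x)/\|\nabla f(x)\|^{1-\alpha} - c_2 \nabla f(x)\,\|\nabla f(x)\|^{\beta-1}$ with $0<\alpha<1<\beta$, discretized by forward Euler; the gains, exponents, and test function are illustrative choices, not the paper's exact scheme.

```python
import numpy as np

def fxts_gradient_descent(grad_f, x0, c1=1.0, c2=1.0, alpha=0.5, beta=1.5,
                          step=1e-2, n_iter=500, tol=1e-12):
    """Forward-Euler discretization of an (assumed) fixed-time-stable flow
    x' = -c1*g/||g||^(1-alpha) - c2*g*||g||^(beta-1), 0 < alpha < 1 < beta."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        g = grad_f(x)
        gn = np.linalg.norm(g)
        if gn < tol:  # numerically stationary
            break
        # Two power-rescaled gradient terms: the exponent-beta term dominates
        # far from the optimizer, the exponent-alpha term near it, which is
        # what bounds the convergence time independently of initialization.
        x = x - step * (c1 * g / gn**(1 - alpha) + c2 * g * gn**(beta - 1))
    return x

# Example: strongly convex quadratic f(x) = 0.5*||x||^2, so grad f(x) = x.
x_star = fxts_gradient_descent(lambda x: x, x0=np.array([5.0, -3.0]))
print(x_star)  # close to the minimizer at the origin
```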
Computations of incompressible flows with velocity boundary conditions require solution of a Poisson equation for pressure with all-Neumann boundary conditions. Discretization of such a Poisson equation results in a rank-deficient matrix of coefficients. When a non-conservative discretization method such as finite difference, finite element, or spectral scheme is used, such a matrix also generates an inconsistency, which causes the residuals in the iterative solution to saturate at a threshold level that depends on the spatial resolution and order of the discretization scheme. In this paper, we examine this inconsistency for a high-order meshless discretization scheme suitable for solving the equations on a complex domain. The high-order meshless method uses polyharmonic spline radial basis functions (PHS-RBF) with appended polynomials to interpolate scattered data and constructs the discrete equations by collocation. The PHS-RBF provides the flexibility to vary the order of discretization by increasing the degree of the appended polynomial. In this study, we examine the convergence of the inconsistency for different spatial resolutions and for different degrees of the appended polynomials by solving the Poisson equation for a manufactured solution as well as the Navier-Stokes equations for several fluid flows. We observe that the inconsistency decreases faster than the error in the final solution, and eventually becomes vanishingly small at sufficient spatial resolution. The rate of convergence of the inconsistency is observed to be similar to, or better than, the rate of convergence of the discretization errors. This beneficial observation makes it unnecessary to regularize the Poisson equation by fixing either the mean pressure or the pressure at an arbitrary point. A simple point solver such as SOR is seen to converge well, although it can be further accelerated using multilevel methods.
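As a concrete sketch of the interpolation building block, the following constructs a PHS-RBF interpolant $\phi(r) = r^3$ with an appended polynomial in 2D; the node sets, kernel exponent, and polynomial degrees here are illustrative, and the paper's collocation setup for the PDE operators is not reproduced.

```python
import numpy as np

def phs_rbf_interpolant(centers, values, degree=1, m=3):
    """PHS-RBF interpolant phi(r) = r^m (m odd) with an appended polynomial
    of total degree `degree`, built from scattered 2D nodes (sketch only)."""
    n = len(centers)
    # Pairwise distances and the PHS kernel block.
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    A = d**m
    # Appended polynomial block (monomials up to `degree` in x, y).
    P = _poly_matrix(centers, degree)
    k = P.shape[1]
    # Saddle-point system: interpolation + polynomial-reproduction constraints.
    M = np.block([[A, P], [P.T, np.zeros((k, k))]])
    coeffs = np.linalg.solve(M, np.concatenate([values, np.zeros(k)]))
    lam, gam = coeffs[:n], coeffs[n:]

    def interp(x):
        r = np.linalg.norm(centers - x, axis=1)
        return r**m @ lam + _poly_matrix(x[None, :], degree)[0] @ gam
    return interp

def _poly_matrix(pts, degree):
    # Monomial basis {x^i * y^j : i + j <= degree} evaluated at pts.
    cols = [pts[:, 0]**i * pts[:, 1]**j
            for i in range(degree + 1) for j in range(degree + 1 - i)]
    return np.stack(cols, axis=1)

# Example: interpolate f(x, y) = sin(x) + y on scattered nodes.
rng = np.random.default_rng(0)
nodes = rng.uniform(0, 1, size=(50, 2))
f = np.sin(nodes[:, 0]) + nodes[:, 1]
u = phs_rbf_interpolant(nodes, f, degree=2)
print(u(np.array([0.5, 0.5])), np.sin(0.5) + 0.5)  # should be close
```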
We develop a one-Newton-step-per-horizon, online, lag-$L$, model predictive control (MPC) algorithm for solving discrete-time, equality-constrained, nonlinear dynamic programs. Based on recent sensitivity analysis results for the target problem class, we prove that the approach exhibits a behavior that we call superconvergence; that is, the tracking error with respect to the full-horizon solution is not only stable under successive horizon shifts, but also decreases with increasing shift order to a minimum value that decays exponentially in the length of the receding horizon. The key analytical step is the decomposition of the one-step error recursion of our algorithm into an algorithmic error and a perturbation error. We show that the perturbation error decays exponentially with the lag between two consecutive receding horizons, while the algorithmic error, determined by Newton's method, exhibits quadratic convergence. Overall, this decomposition yields our local exponential convergence result in terms of the receding-horizon length for suitable values of $L$. Numerical experiments validate our theoretical findings.
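A minimal, dense sketch of the per-horizon step follows: warm-start from the previous (shifted) primal-dual solution and take a single Newton step on the KKT conditions of the window's equality-constrained problem. The structured, horizon-stacked formulation and the lag-$L$ shift of the paper are abstracted away here; a plain copy stands in for the warm start.

```python
import numpy as np

def one_newton_step(z, lam, grad_F, hess_L, c, jac_c):
    """One Newton step on the KKT conditions of  min_z F(z)  s.t.  c(z) = 0.
    Generic dense sketch; a real MPC solver would exploit the stage-wise
    sparsity of the horizon-stacked dynamic program."""
    g, J, cz = grad_F(z), jac_c(z), c(z)
    H = hess_L(z, lam)  # Hessian of the Lagrangian
    K = np.block([[H, J.T], [J, np.zeros((J.shape[0], J.shape[0]))]])
    step = np.linalg.solve(K, -np.concatenate([g + J.T @ lam, cz]))
    return z + step[:z.size], lam + step[z.size:]

# Toy window: quadratic objective with linear dynamics-style constraints.
rng = np.random.default_rng(1)
n, m = 6, 3
Q, b = np.eye(n), rng.normal(size=n)
A, d = rng.normal(size=(m, n)), rng.normal(size=m)

z, lam = np.zeros(n), np.zeros(m)  # warm start (shifted previous solution)
z, lam = one_newton_step(
    z, lam,
    grad_F=lambda z: Q @ z - b,
    hess_L=lambda z, lam: Q,
    c=lambda z: A @ z - d,
    jac_c=lambda z: A,
)
print(np.linalg.norm(A @ z - d))  # constraints satisfied after one step
```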
In this article, we consider a nonlinear, nonlocal, time-dependent fourth-order equation modeling the deformation of a thin and narrow rectangular plate. We propose a $C^1$-conforming virtual element method (VEM) of arbitrary order $k\ge2$ to approximate the model problem numerically. We employ the VEM to discretize the space variable and a fully implicit scheme for the temporal variable. Well-posedness of the fully discrete scheme is proved under certain conditions on the physical parameters, and we derive optimal orders of convergence in both the space and time variables. Finally, numerical experiments are presented to illustrate the behaviour of the proposed numerical scheme.
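To fix ideas on the temporal part, the sketch below applies a fully implicit (backward Euler) step with an inner Newton solve to a generic semi-discrete system $M\dot{u} + N(u) = 0$; the VEM spatial operator and the plate model's nonlocal nonlinearity are abstracted into $N$, and the cubic test nonlinearity is purely illustrative.

```python
import numpy as np

def backward_euler(M, residual, jacobian, u0, dt, n_steps, newton_iters=10):
    """Fully implicit stepping for  M u'(t) + N(u(t)) = 0.
    Each step solves  M (u_new - u_old)/dt + N(u_new) = 0  by Newton's method."""
    u = u0.copy()
    for _ in range(n_steps):
        u_old = u.copy()
        for _ in range(newton_iters):
            F = M @ (u - u_old) / dt + residual(u)
            J = M / dt + jacobian(u)  # Jacobian of the implicit residual
            du = np.linalg.solve(J, -F)
            u += du
            if np.linalg.norm(du) < 1e-12:
                break
    return u

# Example: stiff cubic system, N(u) = u**3 (a stand-in for the discretized
# nonlinear fourth-order operator).
n = 4
u_final = backward_euler(np.eye(n),
                         residual=lambda u: u**3,
                         jacobian=lambda u: np.diag(3 * u**2),
                         u0=np.ones(n), dt=0.1, n_steps=50)
print(u_final)  # decays toward zero, matching u' = -u^3
```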
In this paper we first present a proof of the perceptron convergence algorithm for complex multivalued neural networks (CMVNNs). Our primary goal is to formulate and prove the perceptron convergence algorithm for bicomplex multivalued neural networks (BMVNNs), together with other important results in the theory of neural networks based on bicomplex algebra.
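For context, here is a sketch of the standard error-correction rule for a single complex multivalued neuron, whose outputs are the $k$-th roots of unity; the paper's bicomplex variant replaces the complex arithmetic below with bicomplex algebra, and the learning rate and stopping rule here are generic choices.

```python
import numpy as np

def mvn_activation(z, k):
    """Map a complex weighted sum to the nearest k-th root of unity
    (the discrete multivalued-neuron activation)."""
    sector = int(np.floor(k * np.angle(z) / (2 * np.pi))) % k
    return np.exp(2j * np.pi * sector / k)

def train_mvn(X, targets, k, lr=1.0, epochs=100):
    """Error-correction learning for one complex multivalued neuron.
    X: complex inputs (n_samples, n_features); targets: k-th roots of unity."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features + 1, dtype=complex)   # weights + bias
    Xb = np.hstack([np.ones((n_samples, 1)), X])  # prepend bias input
    for _ in range(epochs):
        errors = 0
        for x, t in zip(Xb, targets):
            y = mvn_activation(w @ x, k)
            if not np.isclose(y, t):
                # Rotate the weighted sum toward the desired root of unity.
                w += lr * (t - y) * np.conj(x) / (n_features + 1)
                errors += 1
        if errors == 0:  # converged: all samples classified correctly
            break
    return w
```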
We study the convergence of a random iterative sequence of a family of operators on infinite-dimensional Hilbert spaces, inspired by the stochastic gradient descent (SGD) algorithm in the case of noiseless regression, as studied in [1]. We demonstrate that its polynomial convergence rate depends on the initial state, while the randomness plays a role only in the choice of the best constant factor; we also close the gap between the upper and lower bounds.
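The setting can be mimicked in finite dimension: each noiseless SGD step applies the random rank-one contraction $(I - \gamma\, \varphi\varphi^{\top})$ to the error, so the iterates form a random product of operators. The sketch below uses a polynomially decaying coordinate scaling as a crude stand-in for the Hilbert-space spectrum; all constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_iter, gamma = 50, 2000, 0.3

# Noiseless linear regression: labels y = <x_star, phi> with random features
# whose coordinate scales decay polynomially (mimicking an operator spectrum).
scales = np.arange(1, d + 1, dtype=float) ** -1.0
x_star = rng.normal(size=d)
x = np.zeros(d)
errors = []
for _ in range(n_iter):
    phi = scales * rng.normal(size=d)   # random data functional
    y = x_star @ phi                    # noiseless label
    x -= gamma * (x @ phi - y) * phi    # SGD step: contracts the error
    errors.append(np.linalg.norm(x - x_star))
print(errors[::500])  # decay whose rate depends on the initial error profile
```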
We address the problem of finding the optimal policy of a constrained Markov decision process (CMDP) using a gradient-descent-based algorithm. Previous results have shown that a primal-dual approach can achieve an $\mathcal{O}(1/\sqrt{T})$ global convergence rate for both the optimality gap and the constraint violation. We propose a new algorithm, called the policy mirror descent-primal dual (PMD-PD) algorithm, that provably achieves a faster $\mathcal{O}(\log(T)/T)$ convergence rate for both the optimality gap and the constraint violation. For the primal (policy) update, the PMD-PD algorithm utilizes a modified value function and performs natural policy gradient steps, which is equivalent to a mirror descent step with appropriate regularization. For the dual update, the PMD-PD algorithm uses modified Lagrange multipliers to ensure a faster convergence rate. We also present two extensions of this approach, to settings with zero constraint violation and to sample-based estimation. Experimental results demonstrate the faster convergence rate and the better performance of the PMD-PD algorithm compared with existing policy-gradient-based algorithms.
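A schematic of the primal-dual loop is given below for a tabular CMDP: the primal step is an exponentiated (NPG-style) mirror descent update on a Lagrangian Q-function, and the dual step is projected ascent on the multiplier. The paper's PMD-PD modifications of the value function and multipliers, which sharpen the rate, are omitted; step sizes and the uniform initial-state averaging are illustrative choices.

```python
import numpy as np

def q_values(P, R, pi, gamma):
    """Exact Q^pi for a tabular MDP with transitions P[s,a,s'], rewards R[s,a]."""
    r_pi = (pi * R).sum(axis=1)
    P_pi = np.einsum('sa,sap->sp', pi, P)
    V = np.linalg.solve(np.eye(len(r_pi)) - gamma * P_pi, r_pi)
    return R + gamma * P @ V

def pmd_pd_sketch(P, R, C, b, gamma=0.9, eta_pi=1.0, eta_lam=1.0, iters=200):
    """Mirror-descent primal-dual loop for  max_pi V_R(pi)  s.t.  V_C(pi) <= b."""
    S, A = R.shape
    pi, lam = np.full((S, A), 1.0 / A), 0.0
    for _ in range(iters):
        Q_L = q_values(P, R - lam * C, pi, gamma)   # Lagrangian Q-function
        pi = pi * np.exp(eta_pi * Q_L)              # exponentiated (MD) update
        pi /= pi.sum(axis=1, keepdims=True)
        Q_C = q_values(P, C, pi, gamma)             # constraint values
        V_C = (pi * Q_C).sum(axis=1).mean()         # uniform initial states
        lam = max(0.0, lam + eta_lam * (V_C - b))   # projected dual ascent
    return pi, lam

# Toy CMDP: 3 states, 2 actions, random dynamics and reward/cost tables.
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(3), size=(3, 2))
pi, lam = pmd_pd_sketch(P, R=rng.random((3, 2)), C=rng.random((3, 2)), b=5.0)
```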
Interpretation of Deep Neural Network (DNN) training as an optimal control problem with nonlinear dynamical systems has received considerable attention recently, yet the algorithmic development remains relatively limited. In this work, we take a step in this direction by reformulating the training procedure from the trajectory optimization perspective. We first show that most widely used algorithms for training DNNs can be linked to Differential Dynamic Programming (DDP), a celebrated second-order trajectory optimization algorithm rooted in Approximate Dynamic Programming. In this vein, we propose a new variant of DDP that accepts batch optimization for training feedforward networks, while integrating naturally with recent progress in curvature approximation. The resulting algorithm features layer-wise feedback policies that improve the convergence rate and reduce sensitivity to hyperparameters relative to existing methods. We show that the algorithm is competitive against state-of-the-art first- and second-order methods. Our work opens up new avenues for principled algorithmic design built upon optimal control theory.
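The core of DDP is its backward pass, sketched below in its standard (iLQR-style) form: each stage builds a local quadratic Q-expansion and an affine feedback policy, then propagates the value function backward. In the DNN-training view, stages correspond to layers and controls to layer parameters; this is the textbook recursion, not the paper's exact batch variant.

```python
import numpy as np

def ddp_backward_stage(fx, fu, lx, lu, lxx, luu, lux, Vx, Vxx):
    """One stage of the DDP backward pass.
    fx, fu: dynamics Jacobians; l*: cost derivatives; Vx, Vxx: value
    expansion from the next stage. Returns the affine policy
    u = u_ref + k + K (x - x_ref) and the propagated value expansion."""
    Qx = lx + fx.T @ Vx
    Qu = lu + fu.T @ Vx
    Qxx = lxx + fx.T @ Vxx @ fx
    Quu = luu + fu.T @ Vxx @ fu
    Qux = lux + fu.T @ Vxx @ fx
    Quu_inv = np.linalg.inv(Quu)   # assumed regularized / positive definite
    k = -Quu_inv @ Qu              # feedforward correction
    K = -Quu_inv @ Qux             # layer-wise feedback gain
    # Value-function recursion for the preceding stage/layer.
    Vx_new = Qx + K.T @ Quu @ k + K.T @ Qu + Qux.T @ k
    Vxx_new = Qxx + K.T @ Quu @ K + K.T @ Qux + Qux.T @ K
    return k, K, Vx_new, Vxx_new
```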
When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of undesirable spectrum, and then discuss practical solutions including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods and distributed methods, and theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, lottery ticket hypothesis and infinite-width analysis.
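For reference, the two families of update rules contrasted in such reviews are, in textbook form (generic symbols, not specific to this survey): plain SGD with momentum, and an adaptive method such as Adam with per-coordinate step sizes.

```python
import numpy as np

def sgd_momentum_step(w, g, buf, lr=0.1, momentum=0.9):
    """Heavy-ball SGD: accumulate a velocity buffer, step along it."""
    buf = momentum * buf + g
    return w - lr * buf, buf

def adam_step(w, g, state, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """Adam: bias-corrected running moments give per-coordinate step sizes."""
    m, v = state
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g**2
    m_hat = m / (1 - b1**t)   # bias correction (t starts at 1)
    v_hat = v / (1 - b2**t)
    return w - lr * m_hat / (np.sqrt(v_hat) + eps), (m, v)
```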
To make deliberate progress towards more intelligent and more human-like artificial systems, we need to follow an appropriate feedback signal: we need to be able to define and evaluate intelligence in a way that enables comparisons between two systems, as well as comparisons with humans. Over the past hundred years, there has been an abundance of attempts to define and measure intelligence, across both the fields of psychology and AI. We summarize and critically assess these definitions and evaluation approaches, while making apparent the two historical conceptions of intelligence that have implicitly guided them. We note that in practice, the contemporary AI community still gravitates towards benchmarking intelligence by comparing the skill exhibited by AIs and humans at specific tasks, such as board games and video games. We argue that solely measuring skill at any given task falls short of measuring intelligence, because skill is heavily modulated by prior knowledge and experience: unlimited priors or unlimited training data allow experimenters to "buy" arbitrary levels of skill for a system, in a way that masks the system's own generalization power. We then articulate a new formal definition of intelligence based on Algorithmic Information Theory, describing intelligence as skill-acquisition efficiency and highlighting the concepts of scope, generalization difficulty, priors, and experience. Using this definition, we propose a set of guidelines for what a general AI benchmark should look like. Finally, we present a benchmark closely following these guidelines, the Abstraction and Reasoning Corpus (ARC), built upon an explicit set of priors designed to be as close as possible to innate human priors. We argue that ARC can be used to measure a human-like form of general fluid intelligence and that it enables fair general intelligence comparisons between AI systems and humans.
In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely: the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m} f_i(\mathbf{x})$ is (i) strongly convex and smooth, (ii) strongly convex, (iii) smooth, or (iv) just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvement of the condition numbers.
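A schematic of the distributed execution pattern is sketched below: each row of $X$ is one node's copy of the decision variable, each gradient evaluation is local, and the only communication per iteration is one multiplication by a gossip matrix $W$. This is a simplified stand-in for the paper's dual formulation with affine network constraints; the gossip weights, step size, and momentum are illustrative.

```python
import numpy as np

def distributed_nesterov(grad_local, W, X0, lr, momentum, n_iter):
    """Accelerated consensus-optimization sketch: Nesterov extrapolation,
    local gradient steps, and one gossip round (W @ Y) per iteration."""
    X, X_prev = X0.copy(), X0.copy()
    for _ in range(n_iter):
        Y = X + momentum * (X - X_prev)               # Nesterov extrapolation
        G = np.stack([grad_local(i, Y[i]) for i in range(len(Y))])
        X_prev = X
        X = W @ Y - lr * G                            # mix with neighbors, descend
    return X

# Example: 4 nodes on a ring, local quadratics f_i(x) = 0.5*(x - a_i)^2.
a = np.array([1.0, 2.0, 3.0, 4.0])
W = 0.5 * np.eye(4) + 0.25 * (np.roll(np.eye(4), 1, 0) + np.roll(np.eye(4), -1, 0))
X = distributed_nesterov(lambda i, x: x - a[i], W,
                         X0=np.zeros((4, 1)), lr=0.1, momentum=0.8, n_iter=300)
print(X.ravel())  # all nodes near the global minimizer, mean(a) = 2.5
```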