Flexible sparsity regularization means stably approximating sparse solutions of operator equations by using coefficient-dependent penalizations. We propose and analyse a general nonconvex approach in this respect, from both theoretical and numerical perspectives. Namely, we show convergence of the regularization method and establish convergence properties of a couple of majorization approaches for the associated nonconvex problems. We also test a monotone algorithm for an academic example where the operator is an $M$-matrix, and on a time-dependent optimal control problem, pointing out the advantages of employing variable penalties over a fixed penalty.
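To make the idea of coefficient-dependent penalties concrete, here is a minimal sketch under strong simplifying assumptions: the operator is the identity and the penalty is a convex weighted $\ell_1$ term, so each coefficient has a closed-form soft-thresholding solution. This is not the nonconvex method of the abstract; the function name `weighted_soft_threshold` and the sample data are hypothetical.

```python
def weighted_soft_threshold(y, weights):
    """Minimize 0.5*(x_i - y_i)**2 + w_i*|x_i| separately for each coefficient."""
    result = []
    for y_i, w_i in zip(y, weights):
        # Shrink the magnitude by the coefficient's own penalty weight.
        magnitude = max(abs(y_i) - w_i, 0.0)
        if magnitude == 0.0:
            result.append(0.0)
        else:
            result.append(magnitude if y_i > 0 else -magnitude)
    return result


# Hypothetical data: two large coefficients and two small (noise-like) ones.
y = [3.0, -0.5, 0.2, -2.0]

# Fixed penalty: every coefficient is shrunk by the same amount.
fixed = weighted_soft_threshold(y, [1.0, 1.0, 1.0, 1.0])

# Variable penalty: small weights on the large coefficients, large weights elsewhere.
variable = weighted_soft_threshold(y, [0.1, 1.0, 1.0, 0.1])
```

In this toy setting both choices zero out the two small coefficients, but the variable weights bias the two large coefficients far less than the fixed weight does.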

Related content

Nonconvex

Regularization term · Networking · ReLU · Specialization · Optimizer

January 14, 2022

Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks

Tolga Ergen,Mert Pilanci

Understanding the fundamental principles behind the success of deep neural networks is one of the most important open questions in the current literature. To this end, we study the training problem of deep neural networks and introduce an analytic approach to unveil hidden convexity in the optimization landscape. We consider a deep parallel ReLU network architecture, which also includes standard deep networks and ResNets as its special cases. We then show that pathwise regularized training problems can be represented as an exact convex optimization problem. We further prove that the equivalent convex problem is regularized via a group sparsity inducing norm. Thus, a path regularized parallel ReLU network can be viewed as a parsimonious convex model in high dimensions. More importantly, we show that the computational complexity required to globally optimize the equivalent convex problem is fully polynomial-time in feature dimension and number of samples. Therefore, we prove polynomial-time trainability of path regularized ReLU networks with global optimality guarantees. We also provide several numerical experiments corroborating our theory.

Linear · Model evaluation · Sparse · Identifiable · PDE

January 14, 2022

Evaluating Accuracy and Efficiency of HPC Solvers for Sparse Linear Systems with Applications to PDEs

Antonella Galizia, Simone Cammarasana, Andrea Clematis, Giuseppe Patanè

Partial Differential Equations (PDEs) describe several problems relevant to many fields of applied sciences, and their discrete counterparts typically involve the solution of sparse linear systems. In this context, we focus on the analysis of the computational aspects related to the solution of large and sparse linear systems with HPC solvers, by considering the performance of direct and iterative solvers in terms of computational efficiency, scalability, and numerical accuracy. Our aim is to identify the main criteria to support application-domain specialists in the selection of the most suitable solvers, according to the application requirements and available resources. To this end, we discuss how the numerical solver is affected by the regular/irregular discretisation of the input domain, the discretisation of the input PDE with piecewise linear or polynomial basis functions, which generally result in a higher/lower sparsity of the coefficient matrix, and the choice of different initial conditions, which are associated with linear systems with multiple right-hand side terms. Finally, our analysis is independent of the characteristics of the underlying computational architectures, and provides a methodological approach that can be applied to different classes of PDEs or to approximation problems.

Regularization term · Early stopping · Momentum · Tuning · Bayes risk

January 14, 2022

The Implicit Regularization of Momentum Gradient Descent with Early Stopping

Li Wang,Yingcong Zhou,Zhiguo Fu

from arxiv, 7 pages, 2 figures

The study of the implicit regularization induced by gradient-based optimization is a longstanding pursuit. In the present paper, we characterize the implicit regularization of momentum gradient descent (MGD) with early stopping by comparing it with explicit $\ell_2$-regularization (ridge). In detail, we study MGD in the continuous-time view, the so-called momentum gradient flow (MGF), and show that its tendency is closer to ridge than that of gradient descent (GD) [Ali et al., 2019] for least squares regression. Moreover, we prove that, under the calibration $t=\sqrt{2/\lambda}$, where $t$ is the time parameter in MGF and $\lambda$ is the tuning parameter in ridge regression, the risk of MGF is no more than 1.54 times that of ridge. In particular, the relative Bayes risk of MGF to ridge is between 1 and 1.035 under the optimal tuning. The numerical experiments strongly support our theoretical results.

Trace · Discretization · Singular · Model evaluation · Manifold

January 14, 2022

A Geometrically Consistent Trace Finite Element Method For The Laplace-Beltrami Eigenvalue Problem

Song Lu,Xianmin Xu

from arxiv, 23 pages, 6 figures

In this paper, we propose a new trace finite element method for the Laplace-Beltrami eigenvalue problem. The method is proposed directly on a smooth manifold which is implicitly given by a level-set function, and it requires high-order numerical quadrature on the surface. A comprehensive analysis of the method is provided. We show that the eigenvalues of the discrete Laplace-Beltrami operator coincide with only part of the eigenvalues of an embedded problem, which further correspond to the finite eigenvalues of a singular generalized algebraic eigenvalue problem. The finite eigenvalues can be efficiently computed by the rank-completing perturbation algorithm of Hochstenbach et al. (SIAM J. Matrix Anal. Appl., 2019). We prove that the method has an optimal convergence rate. Numerical experiments verify the theoretical analysis and show that the geometric consistency can improve the numerical accuracy significantly.

CC · Separated · Example

January 13, 2022

Scheme-theoretic Approach to Computational Complexity I. The Separation of P and NP

Ali Çivril

from arxiv, (Almost) definitive form before submission

We lay the foundations of a new theory for algorithms and computational complexity by parameterizing the instances of a computational problem as a moduli scheme. Considering the geometry of the scheme associated to 3-SAT, we separate P and NP.

Integration · Tikhonov regularization · Reducible · Linear · Regularization

January 13, 2022

On the numerical solution of a hyperbolic inverse boundary value problem in bounded domains

Roman Chapko,Leonidas Mindrinos

from arxiv, 13 pages, 3 figures. arXiv admin note: text overlap with arXiv:1903.07412

We consider the inverse problem of reconstructing the boundary curve of a cavity embedded in a bounded domain. The problem is formulated in two dimensions for the wave equation. We combine the Laguerre transform with the integral equation method and reduce the inverse problem to a system of boundary integral equations. We propose an iterative scheme that linearizes the equation using the Fréchet derivative of the forward operator. The application of special quadrature rules results in an ill-conditioned linear system, which we solve using Tikhonov regularization. The numerical results show that the proposed method produces accurate and stable reconstructions.
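The role of Tikhonov regularization for an ill-conditioned linear system can be illustrated with a toy sketch. It assumes a hypothetical nearly singular 2x2 matrix and solves the regularized normal equations $(A^\top A + \alpha I)x = A^\top b$ directly; the matrix, the noise level, the value of `alpha`, and the function name `tikhonov_2x2` are illustrative choices, not taken from the paper.

```python
def tikhonov_2x2(a, b, alpha):
    """Solve (A^T A + alpha*I) x = A^T b for a 2x2 matrix A given as a list of rows."""
    # Symmetric normal-equation matrix N = A^T A + alpha*I.
    n11 = a[0][0] ** 2 + a[1][0] ** 2 + alpha
    n12 = a[0][0] * a[0][1] + a[1][0] * a[1][1]
    n22 = a[0][1] ** 2 + a[1][1] ** 2 + alpha
    # Right-hand side r = A^T b.
    r1 = a[0][0] * b[0] + a[1][0] * b[1]
    r2 = a[0][1] * b[0] + a[1][1] * b[1]
    # Cramer's rule for the 2x2 symmetric system N x = r.
    det = n11 * n22 - n12 * n12
    return [(n22 * r1 - n12 * r2) / det, (n11 * r2 - n12 * r1) / det]


# Nearly singular matrix; the exact data for x = [1, 1] would be [2.0, 2.0001].
A = [[1.0, 1.0], [1.0, 1.0001]]
b_noisy = [2.0, 2.0002]  # perturbation of size 1e-4 in the second entry

x_naive = tikhonov_2x2(A, b_noisy, alpha=0.0)   # unregularized least squares
x_reg = tikhonov_2x2(A, b_noisy, alpha=1e-3)    # Tikhonov-regularized solution
```

In this example the tiny data perturbation moves the unregularized solution far from `[1, 1]`, while the regularized solution stays close to it. The price is a small bias that grows with `alpha`.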

Optimizer · Precision/Accuracy · Approximation · Discretization · Continuity

January 13, 2022

Approximate solutions of convex semi-infinite optimization problems in finitely many iterations

Jochen Schmid,Miltiadis Poursanidis

from arxiv, 24 pages

We develop two adaptive discretization algorithms for convex semi-infinite optimization, which terminate after finitely many iterations at approximate solutions of arbitrary precision. In particular, they terminate at a feasible point of the considered optimization problem. Compared to the existing finitely feasible algorithms for general semi-infinite optimization problems, our algorithms work with considerably smaller discretizations and are thus computationally favorable. Also, our algorithms terminate at approximate solutions of arbitrary precision, while for general semi-infinite optimization problems the best possible approximate-solution precision can be arbitrarily bad. All occurring finite optimization subproblems in our algorithms have to be solved only approximately, and continuity is the only regularity assumption on our objective and constraint functions. Applications to parametric and non-parametric regression problems under shape constraints are discussed.

Regularization term · Denoising · Image denoising · Sparse · Nonconvex

January 8, 2022

Hyperspectral Image Denoising Using Non-convex Local Low-rank and Sparse Separation with Spatial-Spectral Total Variation Regularization

Chong Peng,Yang Liu,Yongyong Chen,Xinxin Wu,Andrew Cheng,Zhao Kang,Chenglizhao Chen,Qiang Cheng

In this paper, we propose a novel nonconvex approach to robust principal component analysis (RPCA) for hyperspectral image (HSI) denoising, which focuses on simultaneously developing more accurate approximations to both rank and column-wise sparsity for the low-rank and sparse components, respectively. In particular, the new method adopts the log-determinant rank approximation and a novel $\ell_{2,\log}$ norm to enforce the local low-rank and column-wise sparse properties of the respective component matrices. For the $\ell_{2,\log}$-regularized shrinkage problem, we develop an efficient, closed-form solution, which is named the $\ell_{2,\log}$-shrinkage operator. The new regularization and the corresponding operator can be generally used in other problems that require column-wise sparsity. Moreover, we impose the spatial-spectral total variation regularization in the log-based nonconvex RPCA model, which enhances the global piece-wise smoothness and spectral consistency from the spatial and spectral views in the recovered HSI. Extensive experiments on both simulated and real HSIs demonstrate the effectiveness of the proposed method in denoising HSIs.
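For readers unfamiliar with column-wise shrinkage, the sketch below shows the standard convex analogue: the proximal operator of the $\ell_{2,1}$ norm, which scales each column toward zero and removes columns whose Euclidean norm is at most the threshold. It is a swapped-in, well-known operator used only for illustration and is not the $\ell_{2,\log}$-shrinkage operator of the abstract, whose closed form is not given here. The name `column_shrink` and the sample matrix are hypothetical.

```python
import math


def column_shrink(matrix, tau):
    """Proximal operator of tau * (sum of column Euclidean norms) for a list-of-rows matrix."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        norm = math.sqrt(sum(matrix[i][j] ** 2 for i in range(n_rows)))
        # Columns with norm <= tau are set to zero; the others are scaled down.
        if norm > tau:
            scale = 1.0 - tau / norm
            for i in range(n_rows):
                out[i][j] = scale * matrix[i][j]
    return out


# Hypothetical 2x2 example: one strong column (norm 5) and one weak column.
noisy = [[3.0, 0.1], [4.0, -0.2]]
shrunk = column_shrink(noisy, tau=1.0)
```

Here the first column keeps its direction and is scaled by 0.8, while the weak second column is removed entirely, which is the column-wise sparsity effect the abstract refers to.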

Nonconvex · Interpretability · Momentum · PCA · Flow

October 1, 2018

Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization

Tianyi Liu,Shiyang Li,Jianping Shi,Enlu Zhou,Tuo Zhao

from arxiv, arXiv admin note: text overlap with arXiv:1802.05155

Asynchronous momentum stochastic gradient descent (Async-MSGD) is one of the most popular algorithms in distributed machine learning. However, its convergence properties for complicated nonconvex problems are still largely unknown because of current technical limitations. Therefore, in this paper, we propose to analyze the algorithm through a simpler but nontrivial nonconvex problem - streaming PCA, which helps us to understand Async-MSGD better even for more general problems. Specifically, we establish the asymptotic rate of convergence of Async-MSGD for streaming PCA by diffusion approximation. Our results indicate a fundamental tradeoff between asynchrony and momentum: to ensure convergence and acceleration through asynchrony, we have to reduce the momentum (compared with Sync-MSGD). To the best of our knowledge, this is the first theoretical attempt at understanding Async-MSGD for distributed nonconvex stochastic optimization. Numerical experiments on both streaming PCA and training deep neural networks are provided to support our findings for Async-MSGD.

Optimizer · Extensibility · Dual problem · Smooth · INTERACT

December 1, 2017

Optimal Algorithms for Distributed Optimization

César A. Uribe, Soomin Lee, Alexander Gasnikov, Angelia Nedić

In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m} f_i(\mathbf{x})$ is strongly convex and smooth, either strongly convex or smooth, or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions to the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvement of the condition numbers.
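As background for the accelerated method mentioned above, the following is a minimal centralized sketch of Nesterov's accelerated gradient descent with constant momentum on a strongly convex, smooth quadratic. It does not implement the distributed dual scheme of the paper; the diagonal test problem, the step count, and the function name `nesterov_quadratic` are assumptions made for illustration.

```python
import math


def nesterov_quadratic(diag, b, steps):
    """Minimize 0.5*sum(d_i*x_i**2) - sum(b_i*x_i) with constant-momentum Nesterov steps."""
    smooth, strong = max(diag), min(diag)  # smoothness L and strong convexity mu
    beta = (math.sqrt(smooth) - math.sqrt(strong)) / (math.sqrt(smooth) + math.sqrt(strong))
    x = [0.0] * len(diag)
    y = list(x)  # extrapolated point at which gradients are evaluated
    for _ in range(steps):
        grad = [d * y_i - b_i for d, y_i, b_i in zip(diag, y, b)]
        x_next = [y_i - g / smooth for y_i, g in zip(y, grad)]  # gradient step of size 1/L
        y = [xn + beta * (xn - xo) for xn, xo in zip(x_next, x)]  # momentum extrapolation
        x = x_next
    return x


# Hypothetical problem with condition number 100; the minimizer is [1, 1, 1].
solution = nesterov_quadratic(diag=[1.0, 10.0, 100.0], b=[1.0, 10.0, 100.0], steps=300)
```

The momentum coefficient depends on the square root of the condition number, which is the source of the accelerated rate compared with plain gradient descent.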