In this paper we propose New Q-Newton's method. The update rule is conceptually very simple: $x_{n+1}=x_n-w_n$, where $w_n=pr_{A_n,+}(v_n)-pr_{A_n,-}(v_n)$, with $A_n=\nabla^2f(x_n)+\delta_n\|\nabla f(x_n)\|^2\,\mathrm{Id}$ and $v_n=A_n^{-1}\nabla f(x_n)$. Here $\delta_n$ is an appropriate real number chosen so that $A_n$ is invertible, and $pr_{A_n,\pm}$ are the projections onto the subspaces spanned by the eigenvectors of $A_n$ with positive (respectively, negative) eigenvalues. The main result of this paper roughly says that if $f$ is $C^3$ (possibly unbounded from below) and a sequence $\{x_n\}$, constructed by New Q-Newton's method from a random initial point $x_0$, {\bf converges}, then the limit point is a critical point which is not a saddle point, and the convergence rate is the same as that of Newton's method. The first author has recently succeeded in incorporating Backtracking line search into New Q-Newton's method, thus resolving the convergence guarantee issue observed for some (non-smooth) cost functions. An application to quickly finding zeros of a univariate meromorphic function will be discussed. Various experiments against well-known algorithms such as BFGS and Adaptive Cubic Regularization are presented.
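The update is easiest to see in the eigenbasis of $A_n$: the component of $v_n$ along each eigenvector is kept when the eigenvalue is positive and negated when it is negative. A minimal NumPy sketch of one step, assuming a fixed list of candidate values for $\delta_n$ and an invertibility tolerance (both hypothetical choices, not the paper's exact scheme):

```python
import numpy as np

def new_q_newton_step(grad, hess, deltas=(0.0, 1.0, 2.0), tol=1e-12):
    """One step of New Q-Newton's method (sketch).

    grad: gradient of f at x_n; hess: Hessian of f at x_n.
    deltas: candidate values for delta_n; the first one making
    A_n invertible is used.
    """
    g2 = grad @ grad                              # ||grad f(x_n)||^2
    for delta in deltas:
        A = hess + delta * g2 * np.eye(len(grad))
        lam, V = np.linalg.eigh(A)                # eigendecomposition of A_n
        if np.min(np.abs(lam)) > tol:             # A_n invertible
            break
    c = (V.T @ grad) / lam                        # v_n expressed in the eigenbasis
    w = V @ (np.sign(lam) * c)                    # pr_{A_n,+}(v_n) - pr_{A_n,-}(v_n)
    return w                                      # update: x_{n+1} = x_n - w
```

Flipping the sign of the negative-eigenvalue components is the mechanism behind the saddle-point avoidance result stated above.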
In this work, we consider the optimization formulation for symmetric tensor decomposition recently introduced in the Subspace Power Method (SPM) of Kileel and Pereira. Unlike popular alternative functionals for tensor decomposition, the SPM objective function has the desirable properties that its maximal value is known in advance, and its global optima are exactly the rank-1 components of the tensor when the input is sufficiently low-rank. We analyze the non-convex optimization landscape associated with the SPM objective. Our analysis accounts for working with noisy tensors. We derive quantitative bounds such that any second-order critical point with SPM objective value exceeding the bound must equal a tensor component in the noiseless case, and must approximate a tensor component in the noisy case. For decomposing tensors of size $D^{\times m}$, we obtain a near-global guarantee up to rank $\widetilde{o}(D^{\lfloor m/2 \rfloor})$ under a random tensor model, and a global guarantee up to rank $\mathcal{O}(D)$ assuming deterministic frame conditions. This implies that SPM with suitable initialization is a provable, efficient, robust algorithm for low-rank symmetric tensor decomposition. We conclude with numerical experiments showing that the SPM functional is practically preferable to a more established counterpart.
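For orientation, the sketch below shows the classical symmetric tensor power iteration, which locally maximizes the related rank-1 fitting objective $\langle T, x^{\otimes 3}\rangle$ over the unit sphere; the SPM functional itself additionally projects through a subspace extracted from a flattening of the tensor, which this simplified stand-in omits (in practice shifted variants of the power iteration are used to guarantee convergence):

```python
import numpy as np

def rank1_component(T, iters=200, seed=0):
    """Classical symmetric power iteration for a third-order symmetric
    tensor T of shape (D, D, D): locally maximizes <T, x@x@x> over the
    unit sphere.  Illustrative only; this is NOT the SPM objective."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(T.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(iters):
        y = np.einsum('ijk,j,k->i', T, x, x)   # contract T twice with x
        x = y / np.linalg.norm(y)
    value = np.einsum('ijk,i,j,k->', T, x, x, x)
    return x, value
```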
This paper is concerned with the inverse problem of constructing a symmetric nonnegative matrix from a realizable spectrum. We reformulate the inverse problem as an underdetermined nonlinear matrix equation over a Riemannian product manifold. To solve it, we develop a Riemannian underdetermined inexact Newton dogleg method for solving a general underdetermined nonlinear equation defined between Riemannian manifolds and Euclidean spaces. Global and quadratic convergence of the proposed method is established under some mild assumptions. We then solve the inverse problem by applying the proposed method to its equivalent nonlinear matrix equation, and we also construct a preconditioner for the perturbed normal Riemannian Newton equation. Numerical tests show the efficiency of the proposed method for solving the inverse problem.
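The dogleg strategy at the heart of such methods interpolates between a steepest-descent (Cauchy) step and the (inexact) Newton step inside a trust region. A Euclidean sketch of the step selection, assuming the two candidate steps are already computed (the paper's method operates on a Riemannian product manifold, which this omits):

```python
import numpy as np

def dogleg_step(s_newton, s_cauchy, radius):
    """Select a trial step of length at most `radius` along the
    dogleg path: Cauchy step first, then toward the Newton step."""
    if np.linalg.norm(s_newton) <= radius:
        return s_newton                          # full Newton step fits
    if np.linalg.norm(s_cauchy) >= radius:
        # even the Cauchy step leaves the region: truncate it
        return radius * s_cauchy / np.linalg.norm(s_cauchy)
    # walk from the Cauchy point toward the Newton point until the
    # path hits the trust-region boundary: solve ||sC + tau*d|| = radius
    d = s_newton - s_cauchy
    a = d @ d
    b = 2.0 * (s_cauchy @ d)
    c = s_cauchy @ s_cauchy - radius**2
    tau = (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)  # positive root
    return s_cauchy + tau * d
```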
In this paper, we design and analyze a Hybrid High-Order (HHO) approximation for a class of quasilinear elliptic problems of nonmonotone type. The proposed method has several advantages; for instance, it supports arbitrary orders of approximation and general polytopal meshes. The key ingredients are local reconstruction and high-order stabilization terms. Existence and uniqueness of the discrete solution are shown by Brouwer's fixed-point theorem and a contraction argument. An a priori error estimate in the discrete energy norm is derived, showing an optimal order of convergence. Numerical experiments are performed to substantiate the theoretical results.
Modeling univariate block maxima by the generalized extreme value distribution constitutes one of the most widely applied approaches in extreme value statistics. It has recently been found that, for an underlying stationary time series, the respective estimators may be improved by calculating block maxima in an overlapping way. A proof of concept is provided that the latter finding also holds in situations involving certain piecewise stationarities. A weak convergence result for an empirical process of central interest is provided, and, as a case in point, further details are worked out explicitly for the probability weighted moment estimator. Irrespective of the serial dependence, the estimation variance is shown to be smaller for the new estimator, while in extensive simulation experiments the bias was found to be the same or to vary comparably little. The results are illustrated by Monte Carlo simulation experiments and are applied to a common situation involving temperature extremes in a changing climate.
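The overlapping (sliding) block maxima referred to above are simple to form: instead of partitioning the series into disjoint blocks of size $r$, one takes the maximum over every window of length $r$. A minimal NumPy sketch (function name and interface are illustrative):

```python
import numpy as np

def block_maxima(x, r, overlapping=True):
    """Block maxima of a series x with block size r.

    overlapping=True  -> sliding blocks: maxima over all n - r + 1
                         windows of length r;
    overlapping=False -> classical disjoint blocks.
    """
    x = np.asarray(x)
    if overlapping:
        return np.array([x[i:i + r].max() for i in range(len(x) - r + 1)])
    n_blocks = len(x) // r
    return x[:n_blocks * r].reshape(n_blocks, r).max(axis=1)
```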
We study the effect of stochasticity in on-policy policy optimization, and make the following four contributions. First, we show that the preferability of optimization methods depends critically on whether stochastic or exact gradients are used. In particular, unlike the true gradient setting, geometric information cannot be easily exploited in the stochastic case for accelerating policy optimization without detrimental consequences or impractical assumptions. Second, to explain these findings we introduce the concept of committal rate for stochastic policy optimization, and show that this can serve as a criterion for determining almost sure convergence to global optimality. Third, we show that in the absence of external oracle information, which allows an algorithm to determine the difference between optimal and sub-optimal actions given only on-policy samples, there is an inherent trade-off between exploiting geometry to accelerate convergence versus achieving optimality almost surely. That is, an uninformed algorithm either converges to a globally optimal policy with probability $1$ but at a rate no better than $O(1/t)$, or it achieves faster than $O(1/t)$ convergence but then must fail to converge to the globally optimal policy with some positive probability. Finally, we use the committal rate theory to explain why practical policy optimization methods are sensitive to random initialization, then develop an ensemble method that can be guaranteed to achieve near-optimal solutions with high probability.
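The trade-off can be seen in the smallest possible example: a two-armed bandit with softmax parameterization, optimized with on-policy REINFORCE. With an aggressive step size, early samples of the suboptimal arm can push its probability toward one, after which the optimal arm is almost never sampled again. A toy illustration (not the paper's exact experimental setup; rewards and step size are arbitrary):

```python
import numpy as np

def softmax(theta):
    p = np.exp(theta - theta.max())
    return p / p.sum()

def on_policy_pg(r=(1.0, 0.5), eta=2.0, steps=5000, seed=0):
    """On-policy REINFORCE on a two-armed bandit with deterministic
    rewards r; returns the final action probabilities."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    for _ in range(steps):
        pi = softmax(theta)
        a = rng.choice(2, p=pi)             # on-policy sample
        grad_log_pi = -pi
        grad_log_pi[a] += 1.0               # gradient of log pi(a)
        theta += eta * r[a] * grad_log_pi   # unbiased estimate of grad J
    return softmax(theta)
```

Running this over many seeds shows that some runs commit to the suboptimal second arm, since sampling an arm always increases its logit when rewards are positive.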
A weakly infeasible semidefinite program (SDP) has no feasible solution, but it has approximate solutions whose constraint violation is arbitrarily small. These SDPs are ill-posed and numerically often unsolvable. They are also closely related to "bad" linear projections that map the cone of positive semidefinite matrices to a nonclosed set. We describe a simple echelon form of weakly infeasible SDPs with the following properties: (i) it is obtained by elementary row operations and congruence transformations, (ii) it makes weak infeasibility evident, and (iii) it permits us to construct any weakly infeasible SDP or bad linear projection by an elementary combinatorial algorithm. Based on our echelon form we generate a challenging library of weakly infeasible SDPs. Finally, we show that some SDPs in the literature are in our echelon form, for example, the SDP from the sum-of-squares relaxation of minimizing the famous Motzkin polynomial.
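A standard minimal example makes weak infeasibility concrete. Consider the feasibility system over $2\times 2$ symmetric matrices
\[
X=\begin{pmatrix} x_{11} & x_{12}\\ x_{12} & x_{22}\end{pmatrix}\succeq 0,\qquad x_{11}=0,\qquad x_{12}=1.
\]
It has no solution, since $X\succeq 0$ together with $x_{11}=0$ forces $x_{12}=0$. Yet for every $\varepsilon>0$ the matrix $X_\varepsilon=\begin{pmatrix} \varepsilon & 1\\ 1 & 1/\varepsilon\end{pmatrix}\succeq 0$ satisfies $x_{12}=1$ exactly and violates $x_{11}=0$ only by $\varepsilon$, so the SDP is weakly infeasible.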
The main two algorithms for computing the numerical radius are the level-set method of Mengi and Overton and the cutting-plane method of Uhlig. Via new analyses, we explain why the cutting-plane approach is sometimes much faster or much slower than the level-set one and then propose a new hybrid algorithm that remains efficient in all cases. For matrices whose field of values is a circular disk centered at the origin, we show that the cost of Uhlig's method blows up with respect to the desired relative accuracy. More generally, we also analyze the local behavior of Uhlig's cutting procedure at outermost points in the field of values, showing that it often has a fast Q-linear rate of convergence and is Q-superlinear at corners. Finally, we identify and address inefficiencies in both the level-set and cutting-plane approaches and propose refined versions of these techniques.
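Both approaches build on the classical characterization of the numerical radius, $r(A)=\max_{\theta}\lambda_{\max}\big(\tfrac{1}{2}(e^{i\theta}A+e^{-i\theta}A^{*})\big)$. A crude NumPy sketch that evaluates this on a uniform grid of angles, giving a lower bound on $r(A)$ (the level-set and cutting-plane methods replace the grid by much more efficient refinements):

```python
import numpy as np

def numerical_radius_grid(A, n_angles=720):
    """Lower bound on the numerical radius r(A) by sampling
    lambda_max((e^{it} A + e^{-it} A^*) / 2) over a grid of angles t."""
    r = -np.inf
    for t in np.linspace(0.0, 2 * np.pi, n_angles, endpoint=False):
        H = (np.exp(1j * t) * A + np.exp(-1j * t) * A.conj().T) / 2
        r = max(r, np.linalg.eigvalsh(H)[-1])  # largest eigenvalue of H(t)
    return r
```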
Stokes flows are a type of fluid flow in which convective forces are small in comparison with viscous forces, and momentum transport is entirely due to viscous diffusion. Besides being routinely used as benchmark test cases in numerical fluid dynamics, Stokes flows are relevant in several applications in science and engineering, including porous media flow, biological flows, microfluidics, microrobotics, and hydrodynamic lubrication. The present study concerns the discretization of the equations of motion of Stokes flows in three dimensions using the MINI mixed finite element, focusing on the superconvergence of the method, which is investigated with numerical experiments on five purpose-made benchmark test cases with analytical solutions. Although the MINI element is only linearly convergent according to standard mixed finite element theory, a recent theoretical development proves that, for structured meshes in two dimensions, the pressure superconverges with order 1.5, as does the linear part of the computed velocity with respect to the piecewise-linear nodal interpolation of the exact velocity. The numerical experiments documented herein suggest that the superconvergence in pressure holds more generally, possibly extending to unstructured tetrahedral meshes and even up to quadratic convergence, which was observed for one test problem, thereby indicating that there is scope to further extend the available theoretical results on convergence.
In this paper we are concerned with energy-conserving methods for Poisson problems, which are effectively solved by defining a suitable generalization of HBVMs (Hamiltonian Boundary Value Methods), a class of energy-conserving methods for Hamiltonian problems. The actual implementation of the methods is fully discussed, with a particular emphasis on the conservation of Casimirs. Some numerical tests are reported, in order to assess the theoretical findings.
Backtracking is an inexact line search procedure that selects the first value in the sequence $x_0, x_0\beta, x_0\beta^2, \ldots$ that satisfies $g(x)\leq 0$, where $g$ is defined on $\mathbb{R}_+$ and $g(x)\leq 0$ iff $x\leq x^*$. This procedure is widely used in descent-direction optimization algorithms with Armijo-type conditions. It both returns an estimate in $(\beta x^*,x^*]$ and enjoys an upper bound of $\lceil \log_{\beta} (\epsilon/x_0) \rceil$ on the number of function evaluations needed to terminate, where $\epsilon$ is a lower bound on $x^*$. The basic bracketing mechanism employed in several root-searching methods is adapted here for the purpose of performing inexact line searches, leading to a new class of inexact line search procedures. The traditional bisection algorithm for root-searching is transposed into a very simple method that completes the same inexact line search in at most $\lceil \log_2 \log_{\beta} (\epsilon/x_0) \rceil$ function evaluations. A recent bracketing algorithm for root-searching, which offers both a minmax function evaluation cost (like the bisection algorithm) and superlinear convergence, is also transposed, asymptotically requiring $\sim \log \log \log (\epsilon/x_0)$ function evaluations for sufficiently smooth functions. Other bracketing algorithms for root-searching can be adapted in the same way. Numerical experiments suggest time savings of 50\% to 80\% in each call to the inexact search procedure.
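The bisection transposition is easiest to see on the exponent grid: backtracking scans the exponents $k=0,1,2,\ldots$ linearly, whereas bisection brackets the smallest acceptable exponent $k$, i.e., the smallest $k$ with $x_0\beta^k \leq x^*$. A minimal sketch, assuming $0<\beta<1$, $g(x)\leq 0$ iff $x\leq x^*$, and $\epsilon \leq x^* < x_0$ (function name and interface are illustrative):

```python
import math

def bisection_line_search(g, x0, beta, eps):
    """Inexact line search via bisection over the exponent k of
    x0 * beta**k.  Returns a point in (beta*x*, x*] using about
    log2(log_beta(eps/x0)) evaluations of g, versus log_beta(eps/x0)
    for plain backtracking."""
    if g(x0) <= 0:                       # initial step already acceptable
        return x0
    K = math.ceil(math.log(eps / x0) / math.log(beta))
    lo, hi = 0, K                        # g > 0 at lo; x0*beta^K <= eps <= x*,
                                         # so exponent K is acceptable by assumption
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if g(x0 * beta**mid) <= 0:
            hi = mid                     # acceptable: shrink from above
        else:
            lo = mid                     # unacceptable: shrink from below
    return x0 * beta**hi                 # smallest acceptable exponent on the grid
```

Each loop iteration costs one evaluation of $g$ and halves the bracket $[lo, hi]$, so roughly $\log_2 K$ evaluations suffice, matching the stated bound up to a constant.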