
We study the query complexity of geodesically convex (g-convex) optimization on a manifold. To isolate the effect of that manifold's curvature, we primarily focus on hyperbolic spaces. In a variety of settings (smooth or not; strongly g-convex or not; high- or low-dimensional), known upper bounds worsen with curvature. It is natural to ask whether this is warranted, or an artifact. For many such settings, we propose a first set of lower bounds which indeed confirm that (negative) curvature is detrimental to complexity. To do so, we build on recent lower bounds (Hamilton and Moitra, 2021; Criscitiello and Boumal, 2022) for the particular case of smooth, strongly g-convex optimization. Using a number of techniques, we also secure lower bounds which capture dependence on condition number and optimality gap, which was not previously the case. We suspect these bounds are not optimal. We conjecture optimal ones, and support them with a matching lower bound for a class of algorithms which includes subgradient descent, and a lower bound for a related game. Lastly, to pinpoint the difficulty of proving lower bounds, we study how negative curvature influences (and sometimes obstructs) interpolation with g-convex functions.

Related Content

To date, the only way to argue polynomial lower bounds for dynamic algorithms is via fine-grained complexity arguments. These arguments rely on strong assumptions about specific problems such as the Strong Exponential Time Hypothesis (SETH) and the Online Matrix-Vector Multiplication Conjecture (OMv). While they have led to many exciting discoveries, dynamic algorithms still miss out on some of the benefits and lessons from the traditional ``coarse-grained'' approach that relates classes of problems such as P and NP to one another. In this paper we initiate the study of coarse-grained complexity theory for dynamic algorithms. Below are some of the questions this theory can answer. What if dynamic Orthogonal Vector (OV) is easy in the cell-probe model? A research program for proving polynomial unconditional lower bounds for dynamic OV in the cell-probe model is motivated by the fact that many conditional lower bounds can be shown via reductions from the dynamic OV problem. Since the cell-probe model is more powerful than word RAM and has historically allowed smaller upper bounds, it might turn out that dynamic OV is easy in the cell-probe model, making this research direction infeasible. Our theory implies that if this is the case, there will be very interesting algorithmic consequences: if dynamic OV can be maintained in polylogarithmic worst-case update time in the cell-probe model, then so can several important dynamic problems, such as $k$-edge connectivity, $(1+\epsilon)$-approximate mincut, $(1+\epsilon)$-approximate matching, planar nearest neighbors, Chan's subset union and 3-vs-4 diameter. The same conclusion holds when we replace dynamic OV by, e.g., subgraph connectivity, single source reachability, Chan's subset union, and 3-vs-4 diameter. Lower bounds for $k$-edge connectivity via dynamic OV? (See the full abstract in the pdf file.)

Stochastic gradient descent (SGD) is the simplest deep learning optimizer with which to train deep neural networks. While SGD can use various learning rates, such as constant or diminishing rates, previous numerical results showed that SGD performs better than other deep learning optimizers when it uses learning rates given by line search methods. In this paper, we perform a convergence analysis of SGD with a learning rate given by an Armijo line search for nonconvex optimization. The analysis indicates that the upper bound on the expectation of the squared norm of the full gradient becomes small when the number of steps and the batch size are large. Next, we show that, for SGD with the Armijo-line-search learning rate, the number of steps needed for nonconvex optimization is a monotone decreasing convex function of the batch size; that is, the number of steps needed for nonconvex optimization decreases as the batch size increases. Furthermore, we show that the stochastic first-order oracle (SFO) complexity, which is the stochastic gradient computation cost, is a convex function of the batch size; that is, there exists a critical batch size that minimizes the SFO complexity. Finally, we provide numerical results that support our theoretical results. The numerical results indicate that the number of steps needed for training deep neural networks decreases as the batch size increases and that there exist critical batch sizes that can be estimated from the theoretical results.
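
For concreteness, the following is a minimal Python sketch of SGD with a mini-batch Armijo backtracking line search. The function name `armijo_sgd`, the toy least-squares objective, and the constants `eta0`, `beta`, `c` are illustrative choices, not the exact scheme analyzed in the paper.

```python
import numpy as np

def armijo_sgd(loss_fn, grad_fn, x0, data, batch_size=32, steps=200,
               eta0=1.0, beta=0.5, c=1e-4, rng=None):
    """Sketch of SGD where the learning rate is backtracked on each mini-batch
    until the Armijo sufficient-decrease condition holds."""
    rng = rng or np.random.default_rng(0)
    x = x0.copy()
    for _ in range(steps):
        batch = data[rng.choice(len(data), size=batch_size, replace=False)]
        g = grad_fn(x, batch)
        f0, eta = loss_fn(x, batch), eta0
        # Backtrack until f(x - eta*g) <= f(x) - c*eta*||g||^2 (Armijo condition).
        while eta > 1e-10 and loss_fn(x - eta * g, batch) > f0 - c * eta * (g @ g):
            eta *= beta
        x = x - eta * g
    return x

# Toy usage: mini-batch least squares, just to exercise the line search.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(500, 10)), rng.normal(size=500)
data = np.hstack([A, b[:, None]])
loss = lambda x, d: 0.5 * np.mean((d[:, :-1] @ x - d[:, -1]) ** 2)
grad = lambda x, d: d[:, :-1].T @ (d[:, :-1] @ x - d[:, -1]) / len(d)
x_hat = armijo_sgd(loss, grad, np.zeros(10), data)
```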

The Gromov--Hausdorff distance measures the difference in shape between compact metric spaces and poses a notoriously difficult problem in combinatorial optimization. We introduce a quadratic relaxation of it over a convex polytope whose solutions provably deliver the Gromov--Hausdorff distance. The optimality guarantee is enabled by the fact that the search space of our approach is not constrained to a generalization of bijections, unlike in other relaxations such as the Gromov--Wasserstein distance. We suggest the Frank--Wolfe algorithm with $O(n^3)$-time iterations for solving the relaxation and numerically demonstrate its performance on metric spaces of hundreds of points. In particular, we obtain a new upper bound on the Gromov--Hausdorff distance between the unit circle and the unit hemisphere equipped with the Euclidean metric. Our approach is implemented as a Python package, dGH.
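
The sketch below illustrates only the generic Frank--Wolfe template (linear minimization oracle plus convex-combination step) on a toy quadratic over the probability simplex; the polytope and objective of the proposed relaxation differ, and this is not the dGH implementation.

```python
import numpy as np

def frank_wolfe_simplex(Q, c, iters=500):
    """Generic Frank-Wolfe sketch for min 0.5 x'Qx + c'x over the simplex."""
    n = len(c)
    x = np.full(n, 1.0 / n)                 # feasible start: barycenter
    for k in range(iters):
        grad = Q @ x + c
        s = np.zeros(n)
        s[np.argmin(grad)] = 1.0            # LMO over the simplex: best vertex
        gamma = 2.0 / (k + 2.0)             # standard open-loop step size
        x = (1 - gamma) * x + gamma * s
    return x
```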

Let $f \colon \mathcal{M} \to \mathbb{R}$ be a Lipschitz and geodesically convex function defined on a $d$-dimensional Riemannian manifold $\mathcal{M}$. Does there exist a first-order deterministic algorithm which (a) uses at most $O(\mathrm{poly}(d) \log(\epsilon^{-1}))$ subgradient queries to find a point with target accuracy $\epsilon$, and (b) requires only $O(\mathrm{poly}(d))$ arithmetic operations per query? In convex optimization, the classical ellipsoid method achieves this. After detailing related work, we provide an ellipsoid-like algorithm with query complexity $O(d^2 \log^2(\epsilon^{-1}))$ and per-query complexity $O(d^2)$ for the limited case where $\mathcal{M}$ has constant curvature (hemisphere or hyperbolic space). We then detail possible approaches and corresponding obstacles for designing an ellipsoid-like method for general Riemannian manifolds.
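
For reference, here is a minimal sketch of the classical Euclidean central-cut ellipsoid method that the question takes as its benchmark; the function `ellipsoid_minimize` and the toy objective are illustrative, and the Riemannian ellipsoid-like algorithm of the paper is not reproduced here.

```python
import numpy as np

def ellipsoid_minimize(f, subgrad, c0, R, iters=500):
    """Classical (flat-space) central-cut ellipsoid method: minimize a convex f
    over the ball of radius R around c0 using only subgradients (d >= 2)."""
    d = len(c0)
    c, P = c0.astype(float).copy(), (R ** 2) * np.eye(d)   # initial ellipsoid = ball
    best_x, best_f = c.copy(), f(c)
    for _ in range(iters):
        g = subgrad(c)
        if not np.any(g):                                   # zero subgradient: at a minimizer
            break
        gn = g / np.sqrt(g @ P @ g)                         # normalize in the P-metric
        c = c - (P @ gn) / (d + 1)                          # move center into the kept half-space
        P = (d ** 2 / (d ** 2 - 1.0)) * (P - (2.0 / (d + 1)) * np.outer(P @ gn, P @ gn))
        if f(c) < best_f:
            best_x, best_f = c.copy(), f(c)
    return best_x, best_f

# Toy usage: minimize ||x - x*||_1 over a ball of radius 10 in R^5.
x_star = np.arange(5.0)
x_best, f_best = ellipsoid_minimize(lambda x: np.sum(np.abs(x - x_star)),
                                    lambda x: np.sign(x - x_star),
                                    np.zeros(5), R=10.0)
```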

Explicit model-predictive control (MPC) is a widely used control design method that employs optimization tools to find control policies offline; commonly it is posed as a semi-definite program (SDP) or, in the case of hybrid systems, as a mixed-integer SDP. However, mixed-integer SDPs are computationally expensive, motivating alternative formulations such as zonotope-based MPC (zonotopes are a special type of symmetric polytope). In this paper, we propose a robust explicit MPC method applicable to hybrid systems. More precisely, we extend existing zonotope-based MPC methods to account for multiplicative parametric uncertainty. Additionally, we propose a convex zonotope order reduction method that takes advantage of the iterative structure of the zonotope propagation problem to promote diagonal blocks in the zonotope generators and lower the number of decision variables. Furthermore, we develop a quasi-time-free policy choice algorithm, allowing the system to start from any point on the trajectory and avoiding the chattering associated with discrete switching of linear control policies based on the current state's membership in state-space regions. Finally, we verify the validity of the proposed methods on two experimental setups, varying physical parameters between experiments.
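
As background, the sketch below shows a plain zonotope propagation step and a standard Girard-style order reduction; the convex, diagonal-block-promoting reduction proposed in the paper is a different construction, and all function names here are illustrative.

```python
import numpy as np

def propagate_zonotope(c, G, A, B, u, W_gen):
    """One step of zonotope reachability for x+ = A x + B u + w, with the
    disturbance w contained in a zonotope with generators W_gen: center and
    generators map linearly and the disturbance generators are appended."""
    return A @ c + B @ u, np.hstack([A @ G, W_gen])

def reduce_order_girard(G, max_gens):
    """Standard Girard-style order reduction (for reference only): over-approximate
    the generators closest to axis-aligned by a box, keeping at most max_gens."""
    d, m = G.shape
    if m <= max_gens:
        return G
    k = max_gens - d                               # originals kept; the box adds d generators
    score = np.linalg.norm(G, 1, axis=0) - np.linalg.norm(G, np.inf, axis=0)
    order = np.argsort(score)
    keep = G[:, order[m - k:]]                     # k generators with the largest score
    box = G[:, order[:m - k]]                      # remainder, over-approximated by a box
    return np.hstack([keep, np.diag(np.abs(box).sum(axis=1))])
```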

Randomized iterative algorithms for solving a factorized linear system, $\mathbf A\mathbf B\mathbf x=\mathbf b$ with $\mathbf A\in{\mathbb{R}}^{m\times \ell}$, $\mathbf B\in{\mathbb{R}}^{\ell\times n}$, and $\mathbf b\in{\mathbb{R}}^m$, have recently been proposed. They take advantage of the factorized form and avoid forming the matrix $\mathbf C=\mathbf A\mathbf B$ explicitly. However, they can only find the minimum norm (least squares) solution. In contrast, the regularized randomized Kaczmarz (RRK) algorithm can find solutions with certain structures from consistent linear systems. In this work, by combining the randomized Kaczmarz algorithm or the randomized Gauss--Seidel algorithm with the RRK algorithm, we propose two novel regularized randomized iterative algorithms to find (least squares) solutions with certain structures of $\mathbf A\mathbf B\mathbf x=\mathbf b$. We prove linear convergence of the new algorithms. Computed examples are given to illustrate that the new algorithms can find sparse (least squares) solutions of $\mathbf A\mathbf B\mathbf x=\mathbf b$ and can outperform the existing randomized iterative algorithms applied to the corresponding full linear system $\mathbf C\mathbf x=\mathbf b$ with $\mathbf C=\mathbf A\mathbf B$.
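
To convey the flavor, here is a hedged sketch of a sparse (regularized) Kaczmarz-type iteration applied to $\mathbf A\mathbf B\mathbf x=\mathbf b$ that computes rows of $\mathbf C=\mathbf A\mathbf B$ on the fly instead of forming $\mathbf C$; the row-sampling probabilities and the update rule are simplified stand-ins, not the paper's exact algorithms.

```python
import numpy as np

def soft_threshold(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sparse_kaczmarz_factorized(A, B, b, lam=0.1, iters=20000, rng=None):
    """Sparse-Kaczmarz-style sketch for A B x = b: each step uses one row of
    C = A B, computed on the fly as A[i, :] @ B (O(l*n) work per step)."""
    rng = rng or np.random.default_rng(0)
    m, n = A.shape[0], B.shape[1]
    probs = np.linalg.norm(A, axis=1) ** 2          # sampling proxy (not the exact scheme)
    probs /= probs.sum()
    z, x = np.zeros(n), np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        c_i = A[i, :] @ B                           # i-th row of C, never stored globally
        denom = c_i @ c_i
        if denom == 0.0:
            continue
        z -= (c_i @ x - b[i]) / denom * c_i         # Kaczmarz step on the dual variable
        x = soft_threshold(z, lam)                  # proximal step promotes sparsity
    return x
```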

We introduce a Fourier-Bessel-based spectral solver for Cauchy problems featuring Laplacians in polar coordinates under homogeneous Dirichlet boundary conditions. We use FFTs in the azimuthal direction to isolate angular modes, then perform a discrete Hankel transform (DHT) on each mode along the radial direction to obtain spectral coefficients. The two transforms are connected via numerical and cardinal interpolations. We analyze the boundary-dependent error bound of the DHT: in the worst case it scales as $\sim N^{-3/2}$ and governs the accuracy of the method, while in the best case it scales as $\sim e^{-N}$ and the numerical interpolation error governs instead. The complexity is $O(N^3)$. Taking advantage of the fact that Bessel functions are eigenfunctions of the Laplacian operator, we solve linear equations for all times. For non-linear equations, we use a time-splitting method to integrate the solutions. We show examples and validate the method on the two-dimensional wave equation, which is linear, and on two non-linear problems: a time-dependent Poiseuille flow and the flow of a Bose-Einstein condensate on a disk.
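
The snippet below is a minimal illustration of the underlying Fourier-Bessel expansion on the unit disk: the coefficients of one azimuthal mode are computed by quadrature of the orthogonality integral (the paper replaces this quadrature by the DHT), and a wave-equation mode is then evolved analytically in time, since each basis function is a Laplacian eigenfunction. The test function and mode numbers are arbitrary illustrative choices.

```python
import numpy as np
from scipy.special import jv, jn_zeros
from scipy.integrate import quad

def fourier_bessel_coeffs(f, m, N):
    """Coefficients of a radial profile f(r) in the Dirichlet Fourier-Bessel
    basis J_m(alpha_k r) on the unit disk, via the orthogonality integral
    c_k = 2 / J_{m+1}(alpha_k)^2 * int_0^1 f(r) J_m(alpha_k r) r dr."""
    alpha = jn_zeros(m, N)
    coeffs = np.empty(N)
    for k, a in enumerate(alpha):
        integral, _ = quad(lambda r: f(r) * jv(m, a * r) * r, 0.0, 1.0)
        coeffs[k] = 2.0 * integral / jv(m + 1, a) ** 2
    return alpha, coeffs

# Wave equation u_tt = Laplacian(u) on the unit disk, azimuthal mode m, zero
# initial velocity: each Bessel mode evolves as c_k(t) = c_k(0) * cos(alpha_k * t).
m, N = 2, 32
alpha, c0 = fourier_bessel_coeffs(lambda r: r**2 * (1 - r), m, N)
t = 0.5
r = np.linspace(0.0, 1.0, 200)
u_t = sum(c0[k] * np.cos(alpha[k] * t) * jv(m, alpha[k] * r) for k in range(N))
```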

This paper introduces a simulation algorithm for evaluating the log-likelihood function of a large supermodular binary-action game. Covered examples include (certain types of) peer effect, technology adoption, strategic network formation, and multi-market entry games. More generally, the algorithm facilitates simulated maximum likelihood (SML) estimation of games with large numbers of players, $T$, and/or many binary actions per player, $M$ (e.g., games with tens of thousands of strategic actions, $TM=O(10^4)$). In such cases the likelihood of the observed pure strategy combination is typically (i) very small and (ii) a $TM$-fold integral whose region of integration has a complicated geometry. Both direct numerical integration and accept-reject Monte Carlo integration are computationally impractical in such settings. In contrast, we introduce a novel importance sampling algorithm which allows for accurate likelihood simulation with a modest number of simulation draws.
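
The toy example below illustrates, in one dimension, why importance sampling is used: a crude accept-reject-style Monte Carlo estimate of a very small probability fails with a modest number of draws, while reweighting draws from a proposal centered on the rare region succeeds. The paper's algorithm targets high-dimensional integration regions arising from equilibrium conditions; this stand-in only demonstrates the basic mechanism.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
R = 10_000                        # modest number of simulation draws
threshold = 6.0                   # P(Z > 6) ~ 1e-9: far too small for crude Monte Carlo

# Crude Monte Carlo: with 10,000 draws the event is essentially never hit.
crude = np.mean(rng.standard_normal(R) > threshold)

# Importance sampling: draw from a proposal centered in the rare region and
# reweight each draw by the density ratio p(z) / q(z).
z = rng.normal(loc=threshold, scale=1.0, size=R)
weights = norm.pdf(z) / norm.pdf(z, loc=threshold, scale=1.0)
is_estimate = np.mean((z > threshold) * weights)

print(crude, is_estimate, norm.sf(threshold))   # 0.0, ~1e-9, exact value ~9.87e-10
```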

Convergence rate analyses of random walk Metropolis-Hastings Markov chains on general state spaces have largely focused on establishing sufficient conditions for geometric ergodicity or on analysis of mixing times. Geometric ergodicity is a key sufficient condition for the Markov chain Central Limit Theorem and allows rigorous approaches to assessing Monte Carlo error. The sufficient conditions for geometric ergodicity of the random walk Metropolis-Hastings Markov chain are refined and extended, which allows the analysis of previously inaccessible settings such as Bayesian Poisson regression. The key technical innovation is the development of explicit drift and minorization conditions for random walk Metropolis-Hastings, which allows explicit upper and lower bounds on the geometric rate of convergence. Lower bounds on the geometric rate of convergence are also developed using spectral theory. The existing sufficient conditions for geometric ergodicity have, to date, not provided explicit bounds on the geometric rate of convergence because the methods used only imply the existence of drift and minorization conditions. The theoretical results are applied to random walk Metropolis-Hastings algorithms for a class of exponential families and generalized linear models that address Bayesian regression problems.
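
For context, the following is a plain random walk Metropolis-Hastings sampler applied to a toy Bayesian Poisson regression posterior; the sampler itself is standard, and the paper's contribution (explicit drift/minorization conditions and rate bounds) is not reflected in the code. The prior scale, step size, and simulated data are illustrative.

```python
import numpy as np

def log_posterior(beta, X, y, prior_sd=10.0):
    """Poisson regression log-posterior (up to a constant) with a N(0, prior_sd^2 I) prior."""
    eta = X @ beta
    return np.sum(y * eta - np.exp(eta)) - 0.5 * np.sum(beta ** 2) / prior_sd ** 2

def rwmh(logpost, beta0, steps=5000, step_sd=0.05, rng=None):
    """Random walk Metropolis-Hastings with a Gaussian proposal."""
    rng = rng or np.random.default_rng(0)
    beta, lp = beta0.copy(), logpost(beta0)
    chain = np.empty((steps, len(beta0)))
    for t in range(steps):
        prop = beta + step_sd * rng.standard_normal(len(beta))
        lp_prop = logpost(prop)
        if np.log(rng.uniform()) < lp_prop - lp:    # accept with probability min(1, ratio)
            beta, lp = prop, lp_prop
        chain[t] = beta
    return chain

# Toy usage with simulated Poisson regression data.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = rng.poisson(np.exp(X @ np.array([0.5, -0.3])))
chain = rwmh(lambda b: log_posterior(b, X, y), np.zeros(2))
```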

This work proposes a hyper-reduction method for nonlinear parametric dynamical systems characterized by gradient fields, such as Hamiltonian systems and gradient flows. The gradient structure is associated with conservation of invariants or with dissipation and hence plays a crucial role in the description of the physical properties of the system. Traditional hyper-reduction of nonlinear gradient fields yields efficient approximations that, however, lack the gradient structure. We focus on Hamiltonian gradients and propose to first decompose the nonlinear part of the Hamiltonian, mapped into a suitable reduced space, into the sum of d terms, each characterized by a sparse dependence on the system state. Then, the hyper-reduced approximation is obtained via discrete empirical interpolation (DEIM) of the Jacobian of the derived d-valued nonlinear function. The resulting hyper-reduced model retains the gradient structure, and its computational complexity is independent of the size of the full model. Moreover, a priori error estimates show that the hyper-reduced model converges to the reduced model and that the Hamiltonian is asymptotically preserved. Whenever the nonlinear Hamiltonian gradient is not globally reducible, i.e. its evolution requires high-dimensional DEIM approximation spaces, an adaptive strategy is performed. This consists of updating the hyper-reduced Hamiltonian via a low-rank correction of the DEIM basis. Numerical tests demonstrate the applicability of the proposed approach to general nonlinear operators and show runtime speedups compared to the full and the reduced models.
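
As a point of reference, the snippet below sketches standard DEIM index selection and the resulting interpolatory approximation; the structure-preserving, adaptively corrected variant proposed in the paper builds on, but differs from, this basic construction.

```python
import numpy as np

def deim_indices(U):
    """Greedy DEIM index selection: given an n x m basis U of snapshots of the
    nonlinear term, pick m interpolation rows one at a time from the residual."""
    n, m = U.shape
    p = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, m):
        c = np.linalg.solve(U[np.ix_(p, range(j))], U[p, j])
        r = U[:, j] - U[:, :j] @ c               # residual of interpolating the next basis vector
        p.append(int(np.argmax(np.abs(r))))
    return np.array(p)

def deim_operator(U, p):
    """Return f_at_p -> U (P^T U)^{-1} f_at_p, so the full nonlinearity only
    ever needs to be evaluated at the m selected indices."""
    M = U @ np.linalg.inv(U[p, :])
    return lambda f_at_p: M @ f_at_p
```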
