
We study the query complexity of geodesically convex (g-convex) optimization on a manifold. To isolate the effect of that manifold's curvature, we primarily focus on hyperbolic spaces. In a variety of settings (smooth or not; strongly g-convex or not; high- or low-dimensional), known upper bounds worsen with curvature. It is natural to ask whether this is warranted, or an artifact. For many such settings, we propose a first set of lower bounds which indeed confirm that (negative) curvature is detrimental to complexity. To do so, we build on recent lower bounds (Hamilton and Moitra, 2021; Criscitiello and Boumal, 2022) for the particular case of smooth, strongly g-convex optimization. Using a number of techniques, we also secure lower bounds which capture dependence on condition number and optimality gap, which was not previously the case. We suspect these bounds are not optimal. We conjecture optimal ones, and support them with a matching lower bound for a class of algorithms which includes subgradient descent, and a lower bound for a related game. Lastly, to pinpoint the difficulty of proving lower bounds, we study how negative curvature influences (and sometimes obstructs) interpolation with g-convex functions.

Related Content

To date, the only way to argue polynomial lower bounds for dynamic algorithms is via fine-grained complexity arguments. These arguments rely on strong assumptions about specific problems such as the Strong Exponential Time Hypothesis (SETH) and the Online Matrix-Vector Multiplication Conjecture (OMv). While they have led to many exciting discoveries, dynamic algorithms still miss out on some of the benefits and lessons from the traditional ``coarse-grained'' approach that relates classes of problems such as P and NP to one another. In this paper we initiate the study of coarse-grained complexity theory for dynamic algorithms. Below are some of the questions this theory can answer. What if dynamic Orthogonal Vector (OV) is easy in the cell-probe model? A research program for proving polynomial unconditional lower bounds for dynamic OV in the cell-probe model is motivated by the fact that many conditional lower bounds can be shown via reductions from the dynamic OV problem. Since the cell-probe model is more powerful than word RAM and has historically allowed smaller upper bounds, it might turn out that dynamic OV is easy in the cell-probe model, making this research direction infeasible. Our theory implies that if this is the case, there will be very interesting algorithmic consequences: if dynamic OV can be maintained in polylogarithmic worst-case update time in the cell-probe model, then so can several important dynamic problems, such as $k$-edge connectivity, $(1+\epsilon)$-approximate mincut, $(1+\epsilon)$-approximate matching, planar nearest neighbors, Chan's subset union and 3-vs-4 diameter. The same conclusion holds when we replace dynamic OV by, e.g., subgraph connectivity, single source reachability, Chan's subset union, and 3-vs-4 diameter. Lower bounds for $k$-edge connectivity via dynamic OV? (See the full abstract in the pdf file.)

Stochastic gradient descent (SGD) is the simplest deep learning optimizer with which to train deep neural networks. While SGD can use various learning rates, such as constant or diminishing rates, previous numerical results showed that SGD performs better than other deep learning optimizers when it uses learning rates given by line search methods. In this paper, we perform a convergence analysis of SGD with a learning rate given by an Armijo line search for nonconvex optimization. The analysis indicates that the upper bound on the expectation of the squared norm of the full gradient becomes small when the number of steps and the batch size are large. Next, we show that, for SGD with the Armijo-line-search learning rate, the number of steps needed for nonconvex optimization is a monotone decreasing convex function of the batch size; that is, the number of steps needed for nonconvex optimization decreases as the batch size increases. Furthermore, we show that the stochastic first-order oracle (SFO) complexity, which is the stochastic gradient computation cost, is a convex function of the batch size; that is, there exists a critical batch size that minimizes the SFO complexity. Finally, we provide numerical results that support our theoretical results. The numerical results indicate that the number of steps needed for training deep neural networks decreases as the batch size increases and that there exist critical batch sizes that can be estimated from the theoretical results.
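
For concreteness, the following is a minimal Python sketch of SGD with a mini-batch Armijo backtracking line search. The function name `armijo_sgd`, the toy least-squares objective, and the constants `eta0`, `beta`, `c` are illustrative choices, not the exact scheme analyzed in the paper.

```python
import numpy as np

def armijo_sgd(loss_fn, grad_fn, x0, data, batch_size=32, steps=200,
               eta0=1.0, beta=0.5, c=1e-4, rng=None):
    """Sketch of SGD where the learning rate is backtracked on each mini-batch
    until the Armijo sufficient-decrease condition holds."""
    rng = rng or np.random.default_rng(0)
    x = x0.copy()
    for _ in range(steps):
        batch = data[rng.choice(len(data), size=batch_size, replace=False)]
        g = grad_fn(x, batch)
        f0, eta = loss_fn(x, batch), eta0
        # Backtrack until f(x - eta*g) <= f(x) - c*eta*||g||^2 (Armijo condition).
        while eta > 1e-10 and loss_fn(x - eta * g, batch) > f0 - c * eta * (g @ g):
            eta *= beta
        x = x - eta * g
    return x

# Toy usage: mini-batch least squares, just to exercise the line search.
rng = np.random.default_rng(0)
A, b = rng.normal(size=(500, 10)), rng.normal(size=500)
data = np.hstack([A, b[:, None]])
loss = lambda x, d: 0.5 * np.mean((d[:, :-1] @ x - d[:, -1]) ** 2)
grad = lambda x, d: d[:, :-1].T @ (d[:, :-1] @ x - d[:, -1]) / len(d)
x_hat = armijo_sgd(loss, grad, np.zeros(10), data)
```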

The Gromov--Hausdorff distance measures the difference in shape between compact metric spaces and poses a notoriously difficult problem in combinatorial optimization. We introduce a quadratic relaxation of it over a convex polytope whose solutions provably deliver the Gromov--Hausdorff distance. The optimality guarantee is enabled by the fact that the search space of our approach is not constrained to a generalization of bijections, unlike in other relaxations such as the Gromov--Wasserstein distance. We suggest the Frank--Wolfe algorithm with $O(n^3)$-time iterations for solving the relaxation and numerically demonstrate its performance on metric spaces of hundreds of points. In particular, we obtain a new upper bound on the Gromov--Hausdorff distance between the unit circle and the unit hemisphere equipped with the Euclidean metric. Our approach is implemented as a Python package, dGH.
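
The sketch below illustrates only the generic Frank--Wolfe template (linear minimization oracle plus convex-combination step) on a toy quadratic over the probability simplex; the polytope and objective of the proposed relaxation differ, and this is not the dGH implementation.

```python
import numpy as np

def frank_wolfe_simplex(Q, c, iters=500):
    """Generic Frank-Wolfe sketch for min 0.5 x'Qx + c'x over the simplex."""
    n = len(c)
    x = np.full(n, 1.0 / n)                 # feasible start: barycenter
    for k in range(iters):
        grad = Q @ x + c
        s = np.zeros(n)
        s[np.argmin(grad)] = 1.0            # LMO over the simplex: best vertex
        gamma = 2.0 / (k + 2.0)             # standard open-loop step size
        x = (1 - gamma) * x + gamma * s
    return x
```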

Let $f \colon \mathcal{M} \to \mathbb{R}$ be a Lipschitz and geodesically convex function defined on a $d$-dimensional Riemannian manifold $\mathcal{M}$. Does there exist a first-order deterministic algorithm which (a) uses at most $O(\mathrm{poly}(d) \log(\epsilon^{-1}))$ subgradient queries to find a point with target accuracy $\epsilon$, and (b) requires only $O(\mathrm{poly}(d))$ arithmetic operations per query? In convex optimization, the classical ellipsoid method achieves this. After detailing related work, we provide an ellipsoid-like algorithm with query complexity $O(d^2 \log^2(\epsilon^{-1}))$ and per-query complexity $O(d^2)$ for the limited case where $\mathcal{M}$ has constant curvature (hemisphere or hyperbolic space). We then detail possible approaches and corresponding obstacles for designing an ellipsoid-like method for general Riemannian manifolds.
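
For reference, here is a minimal sketch of the classical Euclidean central-cut ellipsoid method that the question takes as its benchmark; the function `ellipsoid_minimize` and the toy objective are illustrative, and the Riemannian ellipsoid-like algorithm of the paper is not reproduced here.

```python
import numpy as np

def ellipsoid_minimize(f, subgrad, c0, R, iters=500):
    """Classical (flat-space) central-cut ellipsoid method: minimize a convex f
    over the ball of radius R around c0 using only subgradients (d >= 2)."""
    d = len(c0)
    c, P = c0.astype(float).copy(), (R ** 2) * np.eye(d)   # initial ellipsoid = ball
    best_x, best_f = c.copy(), f(c)
    for _ in range(iters):
        g = subgrad(c)
        if not np.any(g):                                   # zero subgradient: at a minimizer
            break
        gn = g / np.sqrt(g @ P @ g)                         # normalize in the P-metric
        c = c - (P @ gn) / (d + 1)                          # move center into the kept half-space
        P = (d ** 2 / (d ** 2 - 1.0)) * (P - (2.0 / (d + 1)) * np.outer(P @ gn, P @ gn))
        if f(c) < best_f:
            best_x, best_f = c.copy(), f(c)
    return best_x, best_f

# Toy usage: minimize ||x - x*||_1 over a ball of radius 10 in R^5.
x_star = np.arange(5.0)
x_best, f_best = ellipsoid_minimize(lambda x: np.sum(np.abs(x - x_star)),
                                    lambda x: np.sign(x - x_star),
                                    np.zeros(5), R=10.0)
```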

Explicit model-predictive control (MPC) is a widely used control design method that employs optimization tools to find control policies offline; commonly it is posed as a semi-definite program (SDP) or, in the case of hybrid systems, as a mixed-integer SDP. However, mixed-integer SDPs are computationally expensive, motivating alternative formulations such as zonotope-based MPC (zonotopes are a special type of symmetric polytope). In this paper, we propose a robust explicit MPC method applicable to hybrid systems. More precisely, we extend existing zonotope-based MPC methods to account for multiplicative parametric uncertainty. Additionally, we propose a convex zonotope order reduction method that takes advantage of the iterative structure of the zonotope propagation problem to promote diagonal blocks in the zonotope generators and lower the number of decision variables. Furthermore, we develop a quasi-time-free policy choice algorithm, allowing the system to start from any point on the trajectory and avoiding the chattering associated with discrete switching of linear control policies based on the current state's membership in state-space regions. Finally, we verify the validity of the proposed methods on two experimental setups, varying physical parameters between experiments.
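
As background, the sketch below shows a plain zonotope propagation step and a standard Girard-style order reduction; the convex, diagonal-block-promoting reduction proposed in the paper is a different construction, and all function names here are illustrative.

```python
import numpy as np

def propagate_zonotope(c, G, A, B, u, W_gen):
    """One step of zonotope reachability for x+ = A x + B u + w, with the
    disturbance w contained in a zonotope with generators W_gen: center and
    generators map linearly and the disturbance generators are appended."""
    return A @ c + B @ u, np.hstack([A @ G, W_gen])

def reduce_order_girard(G, max_gens):
    """Standard Girard-style order reduction (for reference only): over-approximate
    the generators closest to axis-aligned by a box, keeping at most max_gens."""
    d, m = G.shape
    if m <= max_gens:
        return G
    k = max_gens - d                               # originals kept; the box adds d generators
    score = np.linalg.norm(G, 1, axis=0) - np.linalg.norm(G, np.inf, axis=0)
    order = np.argsort(score)
    keep = G[:, order[m - k:]]                     # k generators with the largest score
    box = G[:, order[:m - k]]                      # remainder, over-approximated by a box
    return np.hstack([keep, np.diag(np.abs(box).sum(axis=1))])
```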

Randomized iterative algorithms for solving a factorized linear system, $\mathbf A\mathbf B\mathbf x=\mathbf b$ with $\mathbf A\in{\mathbb{R}}^{m\times \ell}$, $\mathbf B\in{\mathbb{R}}^{\ell\times n}$, and $\mathbf b\in{\mathbb{R}}^m$, have recently been proposed. They take advantage of the factorized form and avoid forming the matrix $\mathbf C=\mathbf A\mathbf B$ explicitly. However, they can only find the minimum norm (least squares) solution. In contrast, the regularized randomized Kaczmarz (RRK) algorithm can find solutions with certain structures from consistent linear systems. In this work, by combining the randomized Kaczmarz algorithm or the randomized Gauss--Seidel algorithm with the RRK algorithm, we propose two novel regularized randomized iterative algorithms to find (least squares) solutions with certain structures of $\mathbf A\mathbf B\mathbf x=\mathbf b$. We prove linear convergence of the new algorithms. Computed examples are given to illustrate that the new algorithms can find sparse (least squares) solutions of $\mathbf A\mathbf B\mathbf x=\mathbf b$ and can outperform the existing randomized iterative algorithms applied to the corresponding full linear system $\mathbf C\mathbf x=\mathbf b$ with $\mathbf C=\mathbf A\mathbf B$.
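
To convey the flavor, here is a hedged sketch of a sparse (regularized) Kaczmarz-type iteration applied to $\mathbf A\mathbf B\mathbf x=\mathbf b$ that computes rows of $\mathbf C=\mathbf A\mathbf B$ on the fly instead of forming $\mathbf C$; the row-sampling probabilities and the update rule are simplified stand-ins, not the paper's exact algorithms.

```python
import numpy as np

def soft_threshold(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sparse_kaczmarz_factorized(A, B, b, lam=0.1, iters=20000, rng=None):
    """Sparse-Kaczmarz-style sketch for A B x = b: each step uses one row of
    C = A B, computed on the fly as A[i, :] @ B (O(l*n) work per step)."""
    rng = rng or np.random.default_rng(0)
    m, n = A.shape[0], B.shape[1]
    probs = np.linalg.norm(A, axis=1) ** 2          # sampling proxy (not the exact scheme)
    probs /= probs.sum()
    z, x = np.zeros(n), np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        c_i = A[i, :] @ B                           # i-th row of C, never stored globally
        denom = c_i @ c_i
        if denom == 0.0:
            continue
        z -= (c_i @ x - b[i]) / denom * c_i         # Kaczmarz step on the dual variable
        x = soft_threshold(z, lam)                  # proximal step promotes sparsity
    return x
```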

We introduce a Fourier-Bessel-based spectral solver for Cauchy problems featuring Laplacians in polar coordinates under homogeneous Dirichlet boundary conditions. We use FFTs in the azimuthal direction to isolate angular modes, then perform a discrete Hankel transform (DHT) on each mode along the radial direction to obtain spectral coefficients. The two transforms are connected via numerical and cardinal interpolations. We analyze the boundary-dependent error bound of the DHT: in the worst case it scales as $\sim N^{-3/2}$ and governs the accuracy of the method, while in the best case it scales as $\sim e^{-N}$ and the numerical interpolation error governs instead. The complexity is $O(N^3)$. Taking advantage of the fact that Bessel functions are eigenfunctions of the Laplacian operator, we solve linear equations for all times. For non-linear equations, we use a time-splitting method to integrate the solutions. We show examples and validate the method on the two-dimensional wave equation, which is linear, and on two non-linear problems: a time-dependent Poiseuille flow and the flow of a Bose-Einstein condensate on a disk.
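
The snippet below is a minimal illustration of the underlying Fourier-Bessel expansion on the unit disk: the coefficients of one azimuthal mode are computed by quadrature of the orthogonality integral (the paper replaces this quadrature by the DHT), and a wave-equation mode is then evolved analytically in time, since each basis function is a Laplacian eigenfunction. The test function and mode numbers are arbitrary illustrative choices.

```python
import numpy as np
from scipy.special import jv, jn_zeros
from scipy.integrate import quad

def fourier_bessel_coeffs(f, m, N):
    """Coefficients of a radial profile f(r) in the Dirichlet Fourier-Bessel
    basis J_m(alpha_k r) on the unit disk, via the orthogonality integral
    c_k = 2 / J_{m+1}(alpha_k)^2 * int_0^1 f(r) J_m(alpha_k r) r dr."""
    alpha = jn_zeros(m, N)
    coeffs = np.empty(N)
    for k, a in enumerate(alpha):
        integral, _ = quad(lambda r: f(r) * jv(m, a * r) * r, 0.0, 1.0)
        coeffs[k] = 2.0 * integral / jv(m + 1, a) ** 2
    return alpha, coeffs

# Wave equation u_tt = Laplacian(u) on the unit disk, azimuthal mode m, zero
# initial velocity: each Bessel mode evolves as c_k(t) = c_k(0) * cos(alpha_k * t).
m, N = 2, 32
alpha, c0 = fourier_bessel_coeffs(lambda r: r**2 * (1 - r), m, N)
t = 0.5
r = np.linspace(0.0, 1.0, 200)
u_t = sum(c0[k] * np.cos(alpha[k] * t) * jv(m, alpha[k] * r) for k in range(N))
```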

This paper introduces a simulation algorithm for evaluating the log-likelihood function of a large supermodular binary-action game. Covered examples include (certain types of) peer effect, technology adoption, strategic network formation, and multi-market entry games. More generally, the algorithm facilitates simulated maximum likelihood (SML) estimation of games with large numbers of players, $T$, and/or many binary actions per player, $M$ (e.g., games with tens of thousands of strategic actions, $TM=O(10^4)$). In such cases the likelihood of the observed pure strategy combination is typically (i) very small and (ii) a $TM$-fold integral whose region of integration has a complicated geometry. Both direct numerical integration and accept-reject Monte Carlo integration are computationally impractical in such settings. In contrast, we introduce a novel importance sampling algorithm which allows for accurate likelihood simulation with a modest number of simulation draws.
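
The toy example below illustrates, in one dimension, why importance sampling is used: a crude accept-reject-style Monte Carlo estimate of a very small probability fails with a modest number of draws, while reweighting draws from a proposal centered on the rare region succeeds. The paper's algorithm targets high-dimensional integration regions arising from equilibrium conditions; this stand-in only demonstrates the basic mechanism.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
R = 10_000                        # modest number of simulation draws
threshold = 6.0                   # P(Z > 6) ~ 1e-9: far too small for crude Monte Carlo

# Crude Monte Carlo: with 10,000 draws the event is essentially never hit.
crude = np.mean(rng.standard_normal(R) > threshold)

# Importance sampling: draw from a proposal centered in the rare region and
# reweight each draw by the density ratio p(z) / q(z).
z = rng.normal(loc=threshold, scale=1.0, size=R)
weights = norm.pdf(z) / norm.pdf(z, loc=threshold, scale=1.0)
is_estimate = np.mean((z > threshold) * weights)

print(crude, is_estimate, norm.sf(threshold))   # 0.0, ~1e-9, exact value ~9.87e-10
```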

Convergence rate analyses of random walk Metropolis-Hastings Markov chains on general state spaces have largely focused on establishing sufficient conditions for geometric ergodicity or on analysis of mixing times. Geometric ergodicity is a key sufficient condition for the Markov chain Central Limit Theorem and allows rigorous approaches to assessing Monte Carlo error. The sufficient conditions for geometric ergodicity of the random walk Metropolis-Hastings Markov chain are refined and extended, which allows the analysis of previously inaccessible settings such as Bayesian Poisson regression. The key technical innovation is the development of explicit drift and minorization conditions for random walk Metropolis-Hastings, which allows explicit upper and lower bounds on the geometric rate of convergence. Lower bounds on the geometric rate of convergence are also developed using spectral theory. The existing sufficient conditions for geometric ergodicity have, to date, not provided explicit bounds on the geometric rate of convergence because the methods used only imply the existence of drift and minorization conditions. The theoretical results are applied to random walk Metropolis-Hastings algorithms for a class of exponential families and generalized linear models that address Bayesian regression problems.
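
For context, the following is a plain random walk Metropolis-Hastings sampler applied to a toy Bayesian Poisson regression posterior; the sampler itself is standard, and the paper's contribution (explicit drift/minorization conditions and rate bounds) is not reflected in the code. The prior scale, step size, and simulated data are illustrative.

```python
import numpy as np

def log_posterior(beta, X, y, prior_sd=10.0):
    """Poisson regression log-posterior (up to a constant) with a N(0, prior_sd^2 I) prior."""
    eta = X @ beta
    return np.sum(y * eta - np.exp(eta)) - 0.5 * np.sum(beta ** 2) / prior_sd ** 2

def rwmh(logpost, beta0, steps=5000, step_sd=0.05, rng=None):
    """Random walk Metropolis-Hastings with a Gaussian proposal."""
    rng = rng or np.random.default_rng(0)
    beta, lp = beta0.copy(), logpost(beta0)
    chain = np.empty((steps, len(beta0)))
    for t in range(steps):
        prop = beta + step_sd * rng.standard_normal(len(beta))
        lp_prop = logpost(prop)
        if np.log(rng.uniform()) < lp_prop - lp:    # accept with probability min(1, ratio)
            beta, lp = prop, lp_prop
        chain[t] = beta
    return chain

# Toy usage with simulated Poisson regression data.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = rng.poisson(np.exp(X @ np.array([0.5, -0.3])))
chain = rwmh(lambda b: log_posterior(b, X, y), np.zeros(2))
```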

This work proposes a hyper-reduction method for nonlinear parametric dynamical systems characterized by gradient fields, such as Hamiltonian systems and gradient flows. The gradient structure is associated with conservation of invariants or with dissipation and hence plays a crucial role in the description of the physical properties of the system. Traditional hyper-reduction of nonlinear gradient fields yields efficient approximations that, however, lack the gradient structure. We focus on Hamiltonian gradients and propose to first decompose the nonlinear part of the Hamiltonian, mapped into a suitable reduced space, into the sum of d terms, each characterized by a sparse dependence on the system state. Then, the hyper-reduced approximation is obtained via discrete empirical interpolation (DEIM) of the Jacobian of the derived d-valued nonlinear function. The resulting hyper-reduced model retains the gradient structure, and its computational complexity is independent of the size of the full model. Moreover, a priori error estimates show that the hyper-reduced model converges to the reduced model and that the Hamiltonian is asymptotically preserved. Whenever the nonlinear Hamiltonian gradient is not globally reducible, i.e. its evolution requires high-dimensional DEIM approximation spaces, an adaptive strategy is performed. This consists of updating the hyper-reduced Hamiltonian via a low-rank correction of the DEIM basis. Numerical tests demonstrate the applicability of the proposed approach to general nonlinear operators and show runtime speedups compared to the full and the reduced models.
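
As a point of reference, the snippet below sketches standard DEIM index selection and the resulting interpolatory approximation; the structure-preserving, adaptively corrected variant proposed in the paper builds on, but differs from, this basic construction.

```python
import numpy as np

def deim_indices(U):
    """Greedy DEIM index selection: given an n x m basis U of snapshots of the
    nonlinear term, pick m interpolation rows one at a time from the residual."""
    n, m = U.shape
    p = [int(np.argmax(np.abs(U[:, 0])))]
    for j in range(1, m):
        c = np.linalg.solve(U[np.ix_(p, range(j))], U[p, j])
        r = U[:, j] - U[:, :j] @ c               # residual of interpolating the next basis vector
        p.append(int(np.argmax(np.abs(r))))
    return np.array(p)

def deim_operator(U, p):
    """Return f_at_p -> U (P^T U)^{-1} f_at_p, so the full nonlinearity only
    ever needs to be evaluated at the m selected indices."""
    M = U @ np.linalg.inv(U[p, :])
    return lambda f_at_p: M @ f_at_p
```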
