In the chasing convex bodies problem, an online player receives a request sequence of $N$ convex sets $K_1,\dots, K_N$ contained in a normed space $\mathbb R^d$. The player starts at $x_0\in \mathbb R^d$, and after observing each $K_n$ picks a new point $x_n\in K_n$. At each step the player pays a movement cost of $||x_n-x_{n-1}||$. The player aims to maintain a constant competitive ratio against the minimum cost possible in hindsight, i.e. knowing all requests in advance. The existence of a finite competitive ratio for convex body chasing was first conjectured in 1991 by Friedman and Linial. This conjecture was recently resolved with an exponential $2^{O(d)}$ upper bound on the competitive ratio. We give an improved algorithm achieving competitive ratio $d$ in any normed space, which is exactly tight for $\ell^{\infty}$. In Euclidean space, our algorithm also achieves competitive ratio $O(\sqrt{d\log N})$, nearly matching a $\sqrt{d}$ lower bound when $N$ is subexponential in $d$. The approach extends our prior work for nested convex bodies, which is based on the classical Steiner point of a convex body. We define the functional Steiner point of a convex function and apply it to the associated work function.
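The classical Steiner point mentioned above can be approximated numerically. The sketch below is illustrative only and is not the authors' algorithm: it assumes the body is a polytope given by its vertices, and it uses the standard characterization of the Steiner point as the average, over uniformly random directions, of the point of the body that is extreme in that direction. The function names `steiner_point` and `movement_cost` are hypothetical helpers introduced here.

```python
import numpy as np

def steiner_point(vertices, num_samples=20000, seed=0):
    """Monte Carlo estimate of the Steiner point of conv(vertices).

    Assumption: the convex body is a polytope described by its vertices.
    """
    rng = np.random.default_rng(seed)
    V = np.asarray(vertices, dtype=float)                    # shape (m, d)
    # Gaussian vectors have uniformly distributed directions.
    theta = rng.standard_normal((num_samples, V.shape[1]))
    # For each direction, the maximizer of <theta, x> over the polytope is a vertex.
    best = np.argmax(theta @ V.T, axis=1)
    return V[best].mean(axis=0)

def movement_cost(points, norm_ord=2):
    """Total movement cost: sum of ||x_n - x_{n-1}|| along a path of points."""
    P = np.asarray(points, dtype=float)
    steps = np.diff(P, axis=0)
    return float(np.sum(np.linalg.norm(steps, ord=norm_ord, axis=1)))
```

For a centrally symmetric body such as a square, the estimate should land near the center of symmetry; `movement_cost` simply evaluates the cost the online player pays along a chosen path.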

Related content

Optimizer · INFORMS · Extensibility · Online · Convex function

January 28, 2022

Universal Online Convex Optimization with Minimax Optimal Second-Order Dynamic Regret

Hakan Gokcesu,Suleyman S. Kozat

from arxiv, 22 pages, 4 figures, preprint

We introduce an online convex optimization algorithm using projected subgradient descent with optimal adaptive learning rates, with sequential and efficient first-order updates. Our method provides a subgradient adaptive minimax optimal dynamic regret guarantee for a sequence of general convex functions with no known additional properties such as strong-convexity, smoothness, exp-concavity or even Lipschitz-continuity. The guarantee is against any comparator decision sequence with bounded "complexity", defined by the cumulative distance traveled via changes between successive decisions. We show optimality by generating a lower bound of the worst-case second-order dynamic regret, which incorporates actual subgradient norms and matches our guarantees within a constant factor. We also derive the extension for independent learning in each decision coordinate separately. Additionally, we demonstrate how to best preserve our guarantees when the bound on total successive changes in the dynamic comparator sequence grows in time or the feedback regarding such bound arrives partially with time, both in a truly online manner. Then, as a major contribution, we examine the scenario when we receive no information regarding the successive changes, but instead, by a unique re-purposing of the expert mixture framework with novel additions, we eliminate the need for such information in, again, a truly online manner. Moreover, we show the ability to compete against all dynamic comparator sequences simultaneously (universally) with minimax optimality, where the guarantees depend on the "complexity" of each comparator separately. We also discuss potential modifications to our approach that address further complexity reductions in time, computation, and memory, and we also further the universal competitiveness via guarantees taking into account concentrations of a comparator sequence in the decision set.
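As a rough illustration of projected subgradient descent with an adaptive learning rate, here is a minimal sketch. It is not the authors' method: it assumes the decision set is a Euclidean ball of known radius and uses a generic step size proportional to the inverse square root of the accumulated squared subgradient norms. The names `project_ball` and `online_subgradient_descent` are hypothetical.

```python
import numpy as np

def project_ball(x, radius):
    """Euclidean projection onto the ball of the given radius centered at 0."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def online_subgradient_descent(subgradient, x0, radius, num_rounds):
    """Run projected subgradient descent with a norm-adaptive step size.

    subgradient(t, x) returns a subgradient of the round-t loss at x.
    Assumption: step_t = radius / sqrt(sum of squared subgradient norms so far).
    """
    x = project_ball(np.asarray(x0, dtype=float), radius)
    iterates = [x.copy()]
    sq_sum = 0.0
    for t in range(num_rounds):
        g = np.asarray(subgradient(t, x), dtype=float)
        sq_sum += float(g @ g)
        if sq_sum > 0.0:                       # skip the update while all subgradients are zero
            step = radius / np.sqrt(sq_sum)
            x = project_ball(x - step * g, radius)
        iterates.append(x.copy())
    return np.array(iterates)
```

For example, with the fixed loss |x - 0.5| on the interval [-1, 1], calling `online_subgradient_descent(lambda t, x: np.sign(x - 0.5), [0.0], 1.0, 10000)` produces iterates that oscillate around 0.5 with shrinking amplitude. Note that no Lipschitz constant is supplied; the step size adapts to the observed subgradients.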

Mutually independent · Nash equilibrium · Learning · Stationary · Almost surely

January 28, 2022

Learning Stationary Nash Equilibrium Policies in $n$-Player Stochastic Games with Independent Chains via Dual Mirror Descent

S. Rasoul Etesami

We consider a subclass of $n$-player stochastic games, in which players have their own internal state/action spaces while they are coupled through their payoff functions. It is assumed that players' internal chains are driven by independent transition probabilities. Moreover, players can only receive realizations of their payoffs but not the actual functions, nor can they observe each other's states/actions. Under some assumptions on the structure of the payoff functions, we develop efficient learning algorithms based on Dual Averaging and Dual Mirror Descent, which provably converge almost surely or in expectation to the set of $\epsilon$-Nash equilibrium policies. In particular, we derive upper bounds on the number of iterates that scale polynomially in terms of the game parameters to achieve an $\epsilon$-Nash equilibrium policy. Besides Markov potential games and linear-quadratic stochastic games, this work provides another interesting subclass of $n$-player stochastic games that, under some assumptions, provably admit polynomial-time learning algorithms for finding their $\epsilon$-Nash equilibrium policies.

Approximate Bayesian computation · Statistic · Approximation · Loop · Reducible

January 28, 2022

Approximate Bayesian Computation with Domain Expert in the Loop

Ayush Bharti,Louis Filstroff,Samuel Kaski

Approximate Bayesian computation (ABC) is a popular likelihood-free inference method for models with intractable likelihood functions. As ABC methods usually rely on comparing summary statistics of observed and simulated data, the choice of the statistics is crucial. This choice involves a trade-off between loss of information and dimensionality reduction, and is often determined based on domain knowledge. However, handcrafting and selecting suitable statistics is a laborious task involving multiple trial-and-error steps. In this work, we introduce an active learning method for ABC statistics selection which reduces the domain expert's work considerably. By involving the experts, we are able to handle misspecified models, unlike the existing dimension reduction methods. Moreover, empirical results show better posterior estimates than with existing methods, when the simulation budget is limited.

Optimizer · Image classification · Machine Learning · Networks · Scaling

January 28, 2022

Optimal Complexity in Decentralized Training

Yucheng Lu,Christopher De Sa

Decentralization is a promising method of scaling up parallel machine learning systems. In this paper, we provide a tight lower bound on the iteration complexity for such methods in a stochastic non-convex setting. Our lower bound reveals a theoretical gap in known convergence rates of many existing decentralized training algorithms, such as D-PSGD. We prove by construction that this lower bound is tight and achievable. Motivated by our insights, we further propose DeTAG, a practical gossip-style decentralized algorithm that achieves the lower bound with only a logarithmic gap. Empirically, we compare DeTAG with other decentralized algorithms on image classification tasks, and we show DeTAG enjoys faster convergence compared to baselines, especially on unshuffled data and in sparse networks.
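To make "gossip-style" concrete, the sketch below shows plain gossip averaging on a ring, the communication primitive that decentralized methods of this kind build on. It is not DeTAG and contains no gradient steps; the ring topology, the uniform 1/3 mixing weights, and the function names are assumptions chosen for illustration.

```python
import numpy as np

def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring of n >= 3 workers.

    Each worker averages itself with its two neighbors (weights 1/3 each).
    """
    W = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i, j % n] = 1.0 / 3.0
    return W

def gossip_average(values, W, num_rounds):
    """Repeatedly replace every worker's value by a weighted neighbor average."""
    x = np.asarray(values, dtype=float).copy()
    for _ in range(num_rounds):
        x = W @ x
    return x
```

Because the mixing matrix is doubly stochastic, the global mean is preserved at every round while the workers' values contract toward it. The contraction rate depends on the spectral gap of the matrix, which is why sparse networks such as rings are the harder case.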

Controller · Optimizer · Robustness · Periodic · MoDELS

January 27, 2022

Optimal control of Hopf bifurcations

Nicolas Boullé,Patrick E. Farrell,Marie E. Rognes

from arxiv, 22 pages, 8 figures

We introduce a numerical technique for controlling the location and stability properties of Hopf bifurcations in dynamical systems. The algorithm consists of solving an optimization problem constrained by an extended system of nonlinear partial differential equations that characterizes Hopf bifurcation points. The flexibility and robustness of the method allow us to advance or delay a Hopf bifurcation to a target value of the bifurcation parameter, as well as to control the oscillation frequency with respect to a parameter of the system or the shape of the domain on which solutions are defined. Numerical applications are presented in systems arising from biology and fluid dynamics, such as the FitzHugh-Nagumo model, Ginzburg-Landau equation, Rayleigh-Bénard convection problem, and Navier-Stokes equations, where the control of the location and oscillation frequency of periodic solutions is of high interest.

Nonconvex · Constrained optimization · Functional · Optimizer · Approximation

January 27, 2022

Stochastic First-order Methods for Convex and Nonconvex Functional Constrained Optimization

Digvijay Boob,Qi Deng,Guanghui Lan

from arxiv, 36 pages, final version, accepted at Math Programming

Functional constrained optimization is becoming more and more important in machine learning and operations research. Such problems have potential applications in risk-averse machine learning, semisupervised learning, and robust optimization among others. In this paper, we first present a novel Constraint Extrapolation (ConEx) method for solving convex functional constrained problems, which utilizes linear approximations of the constraint functions to define the extrapolation (or acceleration) step. We show that this method is a unified algorithm that achieves the best-known rate of convergence for solving different functional constrained convex composite problems, including convex or strongly convex, and smooth or nonsmooth problems with a stochastic objective and/or stochastic constraints. Many of these rates of convergence were in fact obtained for the first time in the literature. In addition, ConEx is a single-loop algorithm that does not involve any penalty subproblems. Contrary to existing primal-dual methods, it does not require the projection of Lagrangian multipliers into a (possibly unknown) bounded set. Second, for nonconvex functional constrained problems, we introduce a new proximal point method that transforms the initial nonconvex problem into a sequence of convex problems by adding quadratic terms to both the objective and constraints. Under a certain MFCQ-type assumption, we establish the convergence and rate of convergence of this method to KKT points when the convex subproblems are solved exactly or inexactly. For large-scale and stochastic problems, we present a more practical proximal point method in which the approximate solutions of the subproblems are computed by the aforementioned ConEx method. To the best of our knowledge, most of these convergence and complexity results of the proximal point method for nonconvex problems also seem to be new in the literature.

Approximation · CC · Performer · Nash equilibrium · Weight

January 26, 2022

An Efficient Approximation Algorithm for the Colonel Blotto Game

Daniel Beaglehole

In the storied Colonel Blotto game, two colonels allocate $a$ and $b$ troops, respectively, to $k$ distinct battlefields. A colonel wins a battle if they assign more troops to that particular battle, and each colonel seeks to maximize their total number of victories. Despite the problem's formulation in 1921, the first polynomial-time algorithm to compute Nash equilibrium (NE) strategies for this game was discovered only quite recently. In 2016, Ahmadinejad et al. (2019) formulated a breakthrough algorithm to compute NE strategies for the Colonel Blotto game in computational complexity $O(k^{14}\max\{a,b\}^{13})$, receiving substantial media coverage (e.g. Insider, NSF, ScienceDaily). As of this work, this is the only known algorithm (to our knowledge) for the Colonel Blotto game with general parameters. In this work, we present the first known algorithm to compute $\epsilon$-approximate NE strategies in the two-player Colonel Blotto game in runtime $\widetilde{O}(\epsilon^{-4} k^8 \max\{a,b\})$ for arbitrary settings of these parameters. Moreover, this algorithm is the first known efficient algorithm to compute approximate coarse correlated equilibrium strategies in the multiplayer Colonel Blotto game (when there are more than two colonels) with runtime $\widetilde{O}(\ell \epsilon^{-4} k^8 \max\{a,b\} + \ell^2 \epsilon^{-2} k^3 \max\{a,b\})$. Prior to this work, no polynomial-time algorithm was known to compute exact or approximate equilibrium (in any sense) strategies for multiplayer Colonel Blotto with arbitrary parameters. Our algorithm computes these approximate equilibria by implicitly performing multiplicative weights update over the exponentially many strategies available to each player.
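For intuition about the multiplicative weights update, the sketch below runs it explicitly on a tiny two-player Blotto instance. This is deliberately the naive version: it enumerates every allocation, which is exactly the exponential blow-up the paper's implicit method is designed to avoid. The payoff convention (battles won minus battles lost), the learning rate, and all function names are assumptions made for this illustration.

```python
import itertools
import numpy as np

def allocations(troops, k):
    """All ways to split `troops` troops over k battlefields (stars and bars)."""
    result = []
    for cuts in itertools.combinations(range(troops + k - 1), k - 1):
        bounds = (-1,) + cuts + (troops + k - 1,)
        result.append(tuple(bounds[i + 1] - bounds[i] - 1 for i in range(k)))
    return result

def payoff_matrix(a, b, k):
    """Row colonel's payoff: battles won minus battles lost."""
    rows, cols = allocations(a, k), allocations(b, k)
    A = np.zeros((len(rows), len(cols)))
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            A[i, j] = sum((x > y) - (x < y) for x, y in zip(r, c))
    return A

def softmax(z):
    w = np.exp(z - z.max())          # subtract the max for numerical stability
    return w / w.sum()

def mwu_equilibrium(A, num_rounds):
    """Both players run multiplicative weights; return their average strategies."""
    n, m = A.shape
    scale = max(np.abs(A).max(), 1.0)                  # normalize payoffs to [-1, 1]
    eta = np.sqrt(2.0 * np.log(max(n, m)) / num_rounds)
    gain_row, gain_col = np.zeros(n), np.zeros(m)
    avg_row, avg_col = np.zeros(n), np.zeros(m)
    for _ in range(num_rounds):
        p, q = softmax(eta * gain_row), softmax(eta * gain_col)
        avg_row += p
        avg_col += q
        gain_row += (A @ q) / scale                    # expected gain of each row strategy
        gain_col += -(A.T @ p) / scale                 # column player gains the negation
    return avg_row / num_rounds, avg_col / num_rounds

def duality_gap(A, p, q):
    """Best-response value against q minus worst-case value of p (0 at equilibrium)."""
    return float((A @ q).max() - (p @ A).min())
```

With `A = payoff_matrix(5, 5, 3)` there are 21 pure strategies per player, and the average strategies returned by `mwu_equilibrium(A, 5000)` form an approximate equilibrium whose `duality_gap` shrinks as the number of rounds grows.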

Estimation/Estimator · JACM · Vectorization · Query vector · Minimizer

January 26, 2022

Stochastic diagonal estimation: probabilistic bounds and an improved algorithm

Robert A. Baston,Yuji Nakatsukasa

from arxiv, 40 pages

We study the problem of estimating the diagonal of an implicitly given matrix $A$. For such a matrix we have access to an oracle that allows us to evaluate the matrix vector product $Av$. For a random vector $v$ drawn from an appropriate distribution, this may be used to return an estimate of the diagonal of the matrix $A$. Whilst results exist for probabilistic guarantees relating to the error of estimates of the trace of $A$, no such results have yet been derived for the diagonal. We analyse the number of queries $s$ required to guarantee that with probability at least $1-\delta$ the relative error of the estimates of the diagonal entries is at most $\varepsilon$. We extend this analysis to the 2-norm of the difference between the estimate and the diagonal of $A$. We prove, discuss and experiment with bounds on the number of queries $s$ required to guarantee a probabilistic bound on the estimates of the diagonal by employing Rademacher and Gaussian random variables. Two sufficient upper bounds on the minimum number of query vectors are proved, extending the work of Avron and Toledo [JACM 58(2)8, 2011], and later work of Roosta-Khorasani and Ascher [FoCM 15, 1187-1212, 2015]. We find that, generally, there is little difference between the two, with convergence going as $O(\log(1/\delta)/\varepsilon^2)$ for individual diagonal elements. However, for small $s$, we find that the Rademacher estimator is superior. These results allow us to then extend the ideas of Meyer, Musco, Musco and Woodruff [SOSA, 142-155, 2021], suggesting algorithm Diag++, to speed up the convergence of diagonal estimation from $O(1/\varepsilon^2)$ to $O(1/\varepsilon)$ and make it robust to the spectrum of any positive semi-definite matrix $A$.
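A minimal sketch of the basic Rademacher diagonal estimator follows, assuming only a matrix-vector product oracle. It averages the entrywise products of each query vector with its image under the matrix. This is the plain estimator, not the Diag++ algorithm, and the function name `estimate_diagonal` is hypothetical.

```python
import numpy as np

def estimate_diagonal(matvec, n, num_queries, seed=0):
    """Estimate diag(A) from num_queries products A @ v with Rademacher v.

    matvec(v) is the oracle returning A @ v for an n-dimensional vector v.
    """
    rng = np.random.default_rng(seed)
    total = np.zeros(n)
    for _ in range(num_queries):
        v = rng.choice([-1.0, 1.0], size=n)   # Rademacher query vector
        total += v * matvec(v)                # entrywise product of v and A v
    return total / num_queries
```

Entry $i$ of each sample equals $A_{ii}$ plus a zero-mean term built from the off-diagonal entries in row $i$, so the estimate is exact for diagonal matrices and its error otherwise shrinks as the number of queries $s$ grows.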

Graph · Learning · Partitioning · Optimizer · state-of-the-art

October 9, 2019

Scalable Gromov-Wasserstein Learning for Graph Partitioning and Matching

Hongteng Xu,Dixin Luo,Lawrence Carin

from arxiv, 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada

We propose a scalable Gromov-Wasserstein learning (S-GWL) method and establish a novel and theoretically-supported paradigm for large-scale graph analysis. The proposed method is based on the fact that Gromov-Wasserstein discrepancy is a pseudometric on graphs. Given two graphs, the optimal transport associated with their Gromov-Wasserstein discrepancy provides the correspondence between their nodes and achieves graph matching. When one of the graphs has isolated but self-connected nodes (i.e., a disconnected graph), the optimal transport indicates the clustering structure of the other graph and achieves graph partitioning. Using this concept, we extend our method to multi-graph partitioning and matching by learning a Gromov-Wasserstein barycenter graph for multiple observed graphs; the barycenter graph plays the role of the disconnected graph, and since it is learned, so is the clustering. Our method combines a recursive $K$-partition mechanism with a regularized proximal gradient algorithm, whose time complexity is $\mathcal{O}(K(E+V)\log_K V)$ for graphs with $V$ nodes and $E$ edges. To our knowledge, our method is the first attempt to make Gromov-Wasserstein discrepancy applicable to large-scale graph analysis and unify graph partitioning and matching into the same framework. It outperforms state-of-the-art graph partitioning and matching methods, achieving a trade-off between accuracy and efficiency.

Rank · MoDELS · Optimizer · Singular value decomposition · Column

October 18, 2018

Testing Matrix Rank, Optimally

Maria-Florina Balcan,Yi Li,David P. Woodruff,Hongyang Zhang

from arxiv, 51 pages. To appear in SODA 2019

We show that for the problem of testing if a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model for which query access comes in the form of $\langle X_i, A\rangle:=tr(X_i^\top A)$; perhaps surprisingly these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.