Computing Nash equilibrium in multi-agent games is a longstanding challenge at the interface of game theory and computer science. It is well known that a general normal form game in N players and k strategies requires exponential space simply to write down. This Curse of Multi-Agents prompts the study of succinct games which can be written down efficiently. A canonical example of a succinct game is the graphical game which models players as nodes in a graph interacting with only their neighbors in direct analogy with markov random fields. Graphical games have found applications in wireless, financial, and social networks. However, computing the nash equilbrium of graphical games has proven challenging. Even for polymatrix games, a model where payoffs to an agent can be written as the sum of payoffs of interactions with the agent's neighbors, it has been shown that computing an epsilon approximate nash equilibrium is PPAD hard for epsilon smaller than a constant. The focus of this work is to circumvent this computational hardness by considering average case graph models i.e random graphs. We provide a quasipolynomial time approximation scheme (QPTAS) for computing an epsilon approximate nash equilibrium of polymatrix games on random graphs with edge density greater than poly(k, 1/epsilon, ln(N))$ with high probability. Furthermore, with the same runtime we can compute an epsilon-approximate Nash equilibrium that epsilon-approximates the maximum social welfare of any nash equilibrium of the game. Our primary technical innovation is an "accelerated rounding" of a novel hierarchical convex program for the nash equilibrium problem. Our accelerated rounding also yields faster algorithms for Max-2CSP on the same family of random graphs, which may be of independent interest.
We consider the fixed-budget best arm identification problem with Normal rewards. In this problem, the forecaster is given $K$ arms (treatments) and $T$ time steps. The forecaster attempts to find the best arm in terms of the largest mean via an adaptive experiment conducted with an algorithm. The performance of the algorithm is measured by the simple regret, or the quality of the estimated best arm. It is known that the frequentist simple regret can be exponentially small to $T$ for any fixed parameters, whereas the Bayesian simple regret is $\Theta(T^{-1})$ over a continuous prior distribution. This paper shows that Bayes optimal algorithm, which minimizes the Bayesian simple regret, does not have an exponential simple regret for some parameters. This finding contrasts with the many results indicating the asymptotic equivalence of Bayesian and frequentist algorithms in fixed sampling regimes. While the Bayes optimal algorithm is described in terms of a recursive equation that is virtually impossible to compute exactly, we pave the way to an analysis by introducing a key quantity that we call the expected Bellman improvement.
In the storied Colonel Blotto game, two colonels allocate $a$ and $b$ troops, respectively, to $k$ distinct battlefields. A colonel wins a battle if they assign more troops to that particular battle, and each colonel seeks to maximize their total number of victories. Despite the problem's formulation in 1921, the first polynomial-time algorithm to compute Nash equilibrium (NE) strategies for this game was discovered only quite recently. In 2016, \citep{ahmadinejad_dehghani_hajiaghayi_lucier_mahini_seddighin_2019} formulated a breakthrough algorithm to compute NE strategies for the Colonel Blotto game, receiving substantial media coverage (e.g. \citep{Insider}, \citep{NSF}, \citep{ScienceDaily}). In this work, we present the first known $\epsilon$-approximation algorithm to compute NE strategies in the two-player Colonel Blotto game in runtime $\widetilde{O}(\epsilon^{-4} k^8 \max\{a,b\})$ for arbitrary settings of these parameters. Moreover, this algorithm computes approximate coarse correlated equilibrium strategies in the multiplayer Colonel Blotto game (when there are $\ell > 2$ colonels) with runtime $\widetilde{O}(\ell \epsilon^{-4} k^8 n + \ell^2 \epsilon^{-2} k^4 n)$, where $n$ is the maximum troop count. Before this work, no polynomial-time algorithm was known to compute exact or approximate equilibrium (in any sense) strategies for multiplayer Colonel Blotto with arbitrary parameters. Further, this is the first algorithm to compute optimal strategies (exact or approximate) in the continuous Colonel Blotto game with arbitrary parameters. Our algorithm computes these approximate equilibria by a novel (to the author's knowledge) sampling technique with which we implicitly perform multiplicative weights update over the exponentially many strategies available to each player.
Let $G=(V,E)$ be an undirected unweighted planar graph. Consider a vector storing the distances from an arbitrary vertex $v$ to all vertices $S = \{ s_1 , s_2 , \ldots , s_k \}$ of a single face in their cyclic order. The pattern of $v$ is obtained by taking the difference between every pair of consecutive values of this vector. In STOC'19, Li and Parter used a VC-dimension argument to show that in planar graphs, the number of distinct patterns, denoted $x$, is only $O(k^3)$. This resulted in a simple compression scheme requiring $\tilde O(\min \{ k^4+|T|, k\cdot |T|\})$ space to encode the distances between $S$ and a subset of terminal vertices $T \subseteq V$. This is known as the Okamura-Seymour metric compression problem. We give an alternative proof of the $x=O(k^3)$ bound that exploits planarity beyond the VC-dimension argument. Namely, our proof relies on cut-cycle duality, as well as on the fact that distances among vertices of $S$ are bounded by $k$. Our method implies the following: (1) An $\tilde{O}(x+k+|T|)$ space compression of the Okamura-Seymour metric, thus improving the compression of Li and Parter to $\tilde O(\min \{k^3+|T|,k \cdot |T| \})$. (2) An optimal $\tilde{O}(k+|T|)$ space compression of the Okamura-Seymour metric, in the case where the vertices of $T$ induce a connected component in $G$. (3) A tight bound of $x = \Theta(k^2)$ for the family of Halin graphs, whereas the VC-dimension argument is limited to showing $x=O(k^3)$.
The approximate degree of a Boolean function is the least degree of a real multilinear polynomial approximating it in the $\ell_\infty$-norm over the Boolean hypercube. We show that the approximate degree of the Bipartite Perfect Matching function, which is the indicator over all bipartite graphs having a perfect matching, is $\widetilde{\Theta}(n^{3/2})$. The upper bound is obtained by fully characterizing the unique multilinear polynomial representing the Boolean dual of the perfect matching function, over the reals. Crucially, we show that this polynomial has very small $\ell_1$-norm -- only exponential in $\Theta(n \log n)$. The lower bound follows by bounding the spectral sensitivity of the perfect matching function, which is the spectral radius of its cut-graph on the hypercube \cite{aaronson2020degree, huang2019induced}. We show that the spectral sensitivity of perfect matching is exactly $\Theta(n^{3/2})$.
We provide a formal definition of depth-limited games together with an accessible and rigorous explanation of the underlying concepts, both of which were previously missing in imperfect-information games. The definition works for an arbitrary extensive-form game and is not tied to any specific game-solving algorithm. Moreover, this framework unifies and significantly extends three approaches to depth-limited solving that previously existed in extensive-form games and multiagent reinforcement learning but were not known to be compatible. A key ingredient of these depth-limited games are value functions. Focusing on two-player zero-sum imperfect-information games, we show how to obtain optimal value functions and prove that public information provides both necessary and sufficient context for computing them. We provide a domain-independent encoding of the domains that allows for approximating value functions even by simple feed-forward neural networks, which are then able to generalize to unseen parts of the game. We use the resulting value network to implement a depth-limited version of counterfactual regret minimization. In three distinct domains, we show that the algorithm's exploitability is roughly linearly dependent on the value network's quality and that it is not difficult to train a value network with which depth-limited CFR's performance is as good as that of CFR with access to the full game.
We introduce Boolean Observation Games, a subclass of multi-player finite strategic games with incomplete information and qualitative objectives. In Boolean observation games, each player is associated with a finite set of propositional variables of which only it can observe the value, and it controls whether and to whom it can reveal that value. It does not control the given, fixed, value of variables. Boolean observation games are a generalization of Boolean games, a well-studied subclass of strategic games but with complete information, and wherein each player controls the value of its variables. In Boolean observation games player goals describe multi-agent knowledge of variables. As in classical strategic games, players choose their strategies simultaneously and therefore observation games capture aspects of both imperfect and incomplete information. They require reasoning about sets of outcomes given sets of indistinguishable valuations of variables. What a Nash equilibrium is, depends on an outcome relation between such sets. We present various outcome relations, including a qualitative variant of ex-post equilibrium. We identify conditions under which, given an outcome relation, Nash equilibria are guaranteed to exist. We also study the complexity of checking for the existence of Nash equilibria and of verifying if a strategy profile is a Nash equilibrium. We further study the subclass of Boolean observation games with `knowing whether' goal formulas, for which the satisfaction does not depend on the value of variables. We show that each such Boolean observation game corresponds to a Boolean game and vice versa, by a different correspondence, and that both correspondences are precise in terms of existence of Nash equilibria.
This article discusses the problem of determining whether a given point, or set of points, lies within the convex hull of another set of points in $d$ dimensions. This problem arises naturally in a statistical context when using a particular approximation to the loglikelihood function for an exponential family model; in particular, we discuss the application to network models here. While the convex hull question may be solved via a simple linear program, this approach is not well known in the statistical literature. Furthermore, this article details several substantial improvements to the convex hull-testing algorithm currently implemented in the widely used 'ergm' package for network modeling.
Finding approximate Nash equilibria in zero-sum imperfect-information games is challenging when the number of information states is large. Policy Space Response Oracles (PSRO) is a deep reinforcement learning algorithm grounded in game theory that is guaranteed to converge to an approximate Nash equilibrium. However, PSRO requires training a reinforcement learning policy at each iteration, making it too slow for large games. We show through counterexamples and experiments that DCH and Rectified PSRO, two existing approaches to scaling up PSRO, fail to converge even in small games. We introduce Pipeline PSRO (P2SRO), the first scalable general method for finding approximate Nash equilibria in large zero-sum imperfect-information games. P2SRO is able to parallelize PSRO with convergence guarantees by maintaining a hierarchical pipeline of reinforcement learning workers, each training against the policies generated by lower levels in the hierarchy. We show that unlike existing methods, P2SRO converges to an approximate Nash equilibrium, and does so faster as the number of parallel workers increases, across a variety of imperfect information games. We also introduce an open-source environment for Barrage Stratego, a variant of Stratego with an approximate game tree complexity of $10^{50}$. P2SRO is able to achieve state-of-the-art performance on Barrage Stratego and beats all existing bots.
In order to avoid the curse of dimensionality, frequently encountered in Big Data analysis, there was a vast development in the field of linear and nonlinear dimension reduction techniques in recent years. These techniques (sometimes referred to as manifold learning) assume that the scattered input data is lying on a lower dimensional manifold, thus the high dimensionality problem can be overcome by learning the lower dimensionality behavior. However, in real life applications, data is often very noisy. In this work, we propose a method to approximate $\mathcal{M}$ a $d$-dimensional $C^{m+1}$ smooth submanifold of $\mathbb{R}^n$ ($d \ll n$) based upon noisy scattered data points (i.e., a data cloud). We assume that the data points are located "near" the lower dimensional manifold and suggest a non-linear moving least-squares projection on an approximating $d$-dimensional manifold. Under some mild assumptions, the resulting approximant is shown to be infinitely smooth and of high approximation order (i.e., $O(h^{m+1})$, where $h$ is the fill distance and $m$ is the degree of the local polynomial approximation). The method presented here assumes no analytic knowledge of the approximated manifold and the approximation algorithm is linear in the large dimension $n$. Furthermore, the approximating manifold can serve as a framework to perform operations directly on the high dimensional data in a computationally efficient manner. This way, the preparatory step of dimension reduction, which induces distortions to the data, can be avoided altogether.
We consider the task of learning the parameters of a {\em single} component of a mixture model, for the case when we are given {\em side information} about that component, we call this the "search problem" in mixture models. We would like to solve this with computational and sample complexity lower than solving the overall original problem, where one learns parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each one of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy, and also improved computation complexity than existing moment based mixture model algorithms (e.g. tensor methods). We also illustrate several natural ways one can obtain such side information, for specific problem instances. Our experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms showing significant improvement in runtime and accuracy.