The concept of Nash equilibrium illuminates the structure of rational behavior in multi-agent settings. However, the concept is only as helpful as it is efficiently computable. We introduce Cut-and-Play, an algorithm to compute Nash equilibria for non-cooperative simultaneous games where each player's objective is linear in their variables and bilinear in the other players' variables. Using the rich theory of integer programming, we alternate between constructing (i) increasingly tight outer approximations of the convex hull of each player's feasible set, by using branching and cutting plane methods, and (ii) increasingly better inner approximations of these hulls, by finding extreme points and rays of the convex hulls. In particular, we prove the correctness of our algorithm when these convex hulls are polyhedra. Our algorithm allows us to leverage mixed-integer programming technology to compute equilibria for a large class of games. Further, we integrate existing cutting plane families inside the algorithm, significantly speeding up equilibria computation. We showcase a set of extensive computational results for Integer Programming Games and simultaneous games among bilevel leaders. In both cases, our framework outperforms the state-of-the-art in computing time and solution quality.
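To make the game class concrete, here is a minimal sketch of a two-player game whose objectives are linear in a player's own integer variable and bilinear in the opponent's. It finds pure Nash equilibria by brute-force enumeration, which is not the Cut-and-Play algorithm; the payoff form `c * own + q * own * other`, the coefficients, and the strategy set `{0, 1, 2}` are all assumptions chosen for illustration.

```python
from itertools import product


def payoff(c, q, own, other):
    """Objective that is linear in `own` and bilinear in (`own`, `other`)."""
    return c * own + q * own * other


def pure_nash_equilibria(strategies, c, q):
    """Enumerate all pure profiles and keep those with no profitable deviation.

    `c` and `q` are pairs of hypothetical coefficients, one per player.
    """
    equilibria = []
    for x1, x2 in product(strategies, repeat=2):
        u1 = payoff(c[0], q[0], x1, x2)
        u2 = payoff(c[1], q[1], x2, x1)
        # Best payoff each player could reach by deviating unilaterally.
        best1 = max(payoff(c[0], q[0], d, x2) for d in strategies)
        best2 = max(payoff(c[1], q[1], d, x1) for d in strategies)
        if u1 >= best1 and u2 >= best2:
            equilibria.append((x1, x2))
    return equilibria


# Each player maximizes own * (1 - other) over the integers 0, 1, 2.
EQUILIBRIA = pure_nash_equilibria(range(3), c=(1, 1), q=(-1, -1))
print(EQUILIBRIA)
```

Enumeration is only viable for tiny strategy sets, which is the kind of limitation that motivates approaches built on convex-hull approximations.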

Related content

Approximation · Networking · MoDELS · Scenario · SimPLe

January 14, 2022

Strong Approximations and Irrationality in Financial Networks with Financial Derivatives

Stavros D. Ioannidis,Bart de Keijzer,Carmine Ventre

Financial networks model a set of financial institutions (firms) interconnected by obligations. Recent work has introduced to this model a class of obligations called credit default swaps, a certain kind of financial derivative. The main computational challenge for such systems is known as the clearing problem, which is to determine which firms are in default and to compute their exposure to systemic risk, technically known as their recovery rates. It is known that the recovery rates form the set of fixed points of a simple function, and that these fixed points can be irrational. Furthermore, Schuldenzucker et al. (2016) have shown that finding a weakly (or "almost") approximate (rational) fixed point is PPAD-complete. We further study the clearing problem from the point of view of irrationality and approximation strength. Firstly, we observe that weakly approximate solutions may misrepresent the actual financial state of an institution. On this basis, we study the complexity of finding a strongly (or "near") approximate solution, and show FIXP-completeness. We then study the structural properties required for irrationality, and we give necessary conditions for irrational solutions to emerge: the presence of certain types of cycles in a financial network forces the recovery rates to take the form of roots of non-linear polynomials. In the absence of a large subclass of such cycles, we study the complexity of finding an exact fixed point, which we show to be a problem close to, albeit outside of, PPAD.

Bayesian network · Statistic · Rank · Microsoft Surface · Linear

January 14, 2022

Geometry of Dependency Equilibria

Irem Portakal,Bernd Sturmfels

from arxiv, 20 pages

An $n$-person game is specified by $n$ tensors of the same format. We view its equilibria as points in that tensor space. Dependency equilibria are defined by linear constraints on conditional probabilities, and thus by determinantal quadrics in the tensor entries. These equations cut out the Spohn variety, named after the philosopher who introduced dependency equilibria. The Nash equilibria among these are the tensors of rank one. We study the real algebraic geometry of the Spohn variety. This variety is rational, except for $2 \times 2$ games, when it is an elliptic curve. For $3 \times 2$ games, it is a del Pezzo surface of degree two. We characterize the payoff regions and their boundaries using oriented matroids, and we develop the connection to Bayesian networks in statistics.
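As a small illustration of the rank-one statement, consider the two-player case, where the tensor is a matrix of joint probabilities. The symbols $P$, $x$, and $y$ are notation assumed here, not taken from the paper:

$$P = (p_{ij}) = x\,y^{\top}, \qquad p_{ij} = x_i\,y_j, \qquad p_{ij}\,p_{kl} - p_{il}\,p_{kj} = 0 \quad \text{for all } i, k, j, l.$$

Here $x$ and $y$ are the mixed strategies of the two players, so a rank-one $P$ is a joint distribution in which the players randomize independently, and the vanishing $2 \times 2$ minors are the algebraic form of that independence.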

Square matrix · Scenario

January 14, 2022

Windmills of the minds: an algorithm for Fermat's Two Squares Theorem

Hing Lun Chan

from arxiv, 14 pages, 6 tables, 10 figures. In Proceedings of the 11th ACM SIGPLAN International Conference on Certified Programs and Proofs (CPP 2022), January 17-18, 2022, Philadelphia, PA, USA

The two squares theorem of Fermat is a gem in number theory, with a spectacular one-sentence "proof from the Book". Here is a formalisation of this proof, with an interpretation using windmill patterns. The underlying theory involves involutions on a finite set, especially the parity of the number of fixed points of the involutions. Starting as an existence proof that is non-constructive, there is an ingenious way to turn it into a constructive one. This gives an algorithm to compute the two squares by iterating the two involutions alternately from a known fixed point.
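The iteration described in the abstract can be sketched in a few lines. The sketch below assumes the standard setting of Zagier's one-sentence proof: triples $(x, y, z)$ of positive integers with $x^2 + 4yz = p$, the Zagier involution with its three cases, and the flip that swaps $y$ and $z$. The function names and the primality helper are illustrative choices, not taken from the paper's formalisation.

```python
def is_prime(n):
    """Trial-division primality check, adequate for small inputs."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def zagier(x, y, z):
    """Zagier's involution on triples with x*x + 4*y*z == p."""
    if x < y - z:
        return x + 2 * z, z, y - x - z
    if x < 2 * y:
        return 2 * y - x, y, x - y + z
    return x - 2 * y, x - y + z, y


def flip(x, y, z):
    """The second involution: swap y and z."""
    return x, z, y


def two_squares(p):
    """Return (a, b) with a*a + b*b == p for a prime p with p % 4 == 1."""
    if p % 4 != 1 or not is_prime(p):
        raise ValueError("p must be a prime congruent to 1 modulo 4")
    # (1, 1, k) with p = 4k + 1 is the known fixed point of the Zagier map.
    triple = (1, 1, (p - 1) // 4)
    # Alternate flip and zagier until a fixed point of flip (y == z) appears.
    while triple[1] != triple[2]:
        triple = zagier(*flip(*triple))
    x, y, _ = triple
    return x, 2 * y


print(two_squares(13))  # (3, 2), since 13 = 3*3 + 2*2
```

Both maps preserve $x^2 + 4yz = p$, so once $y = z$ the triple gives $p = x^2 + (2y)^2$ directly.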

GROUP · Kalman filter · State estimation · State space · Invariant

January 13, 2022

The Geometry of Navigation Problems

Axel Barrau,Silvere Bonnabel

from arxiv, Accepted in IEEE Transactions on Automatic Control, in press

While many works exploiting an existing Lie group structure have been proposed for state estimation, in particular the Invariant Extended Kalman Filter (IEKF), few papers address the construction of a group structure that allows casting a given system into the IEKF framework, namely making the dynamics group affine and the observations invariant. In this paper we introduce a large class of systems encompassing most problems involving a navigating vehicle encountered in practice. For those systems we introduce a novel methodology that systematically provides a group structure for the state space, including vectors of the body frame such as biases. We use it to derive observers having properties akin to those of linear observers or filters. The proposed unifying and versatile framework encompasses all systems where IEKF has proved successful, improves state-of-the-art "imperfect" IEKF for inertial navigation with sensor biases, and allows addressing novel examples, like GNSS antenna lever arm estimation.

Optimizer · Precision/Accuracy · Approximation · Discretization · Continuity

January 13, 2022

Approximate solutions of convex semi-infinite optimization problems in finitely many iterations

Jochen Schmid,Miltiadis Poursanidis

from arxiv, 24 pages

We develop two adaptive discretization algorithms for convex semi-infinite optimization, which terminate after finitely many iterations at approximate solutions of arbitrary precision. In particular, they terminate at a feasible point of the considered optimization problem. Compared to the existing finitely feasible algorithms for general semi-infinite optimization problems, our algorithms work with considerably smaller discretizations and are thus computationally favorable. Also, our algorithms terminate at approximate solutions of arbitrary precision, while for general semi-infinite optimization problems the best possible approximate-solution precision can be arbitrarily bad. All occurring finite optimization subproblems in our algorithms have to be solved only approximately, and continuity is the only regularity assumption on our objective and constraint functions. Applications to parametric and non-parametric regression problems under shape constraints are discussed.

Correlation coefficient · Empirical frequency · INFORMS · Recall · Learned

June 20, 2020

No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium

Andrea Celli,Alberto Marchesi,Gabriele Farina,Nicola Gatti

The existence of simple, uncoupled no-regret dynamics that converge to correlated equilibria in normal-form games is a celebrated result in the theory of multi-agent systems. Specifically, it has been known for more than 20 years that when all players seek to minimize their internal regret in a repeated normal-form game, the empirical frequency of play converges to a normal-form correlated equilibrium. Extensive-form (that is, tree-form) games generalize normal-form games by modeling both sequential and simultaneous moves, as well as private information. Because of the sequential nature and presence of partial information in the game, extensive-form correlation has significantly different properties than the normal-form counterpart, many of which are still open research directions. Extensive-form correlated equilibrium (EFCE) has been proposed as the natural extensive-form counterpart to normal-form correlated equilibrium. However, it was previously unknown whether EFCE emerges as the result of uncoupled agent dynamics. In this paper, we give the first uncoupled no-regret dynamics that converge to the set of EFCEs in $n$-player general-sum extensive-form games with perfect recall. First, we introduce a notion of trigger regret in extensive-form games, which extends that of internal regret in normal-form games. When each player has low trigger regret, the empirical frequency of play is close to an EFCE. Then, we give an efficient no-trigger-regret algorithm. Our algorithm decomposes trigger regret into local subproblems at each decision point for the player, and constructs a global strategy of the player from the local solutions at each decision point.
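For readers new to uncoupled no-regret dynamics, the sketch below shows a much simpler relative of the paper's method: regret matching on external regret in a two-player zero-sum normal-form game (rock-paper-scissors). It is not the trigger-regret algorithm and does not handle extensive-form games. The expected-payoff update, the initial regrets, and all function names are assumptions made for illustration.

```python
# Row player's payoffs for rock-paper-scissors; the column player gets the negative.
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    n = len(regrets)
    return [p / total for p in positive] if total > 0 else [1.0 / n] * n


def regret_matching(payoff, rounds, initial_regrets):
    """Run regret matching for both players of a square zero-sum game.

    Each player only uses its own payoffs (uncoupled dynamics).
    Returns the time-averaged mixed strategy of each player.
    """
    n = len(payoff)
    regrets = [list(initial_regrets[0]), list(initial_regrets[1])]
    strategy_sums = [[0.0] * n, [0.0] * n]
    for _ in range(rounds):
        x = strategy_from_regrets(regrets[0])
        y = strategy_from_regrets(regrets[1])
        # Expected payoff of every pure action against the opponent's current mix.
        row_values = [sum(payoff[a][b] * y[b] for b in range(n)) for a in range(n)]
        col_values = [sum(-payoff[a][b] * x[a] for a in range(n)) for b in range(n)]
        row_expected = sum(x[a] * row_values[a] for a in range(n))
        col_expected = sum(y[b] * col_values[b] for b in range(n))
        for a in range(n):
            # Regret: how much better action a would have done than the mix played.
            regrets[0][a] += row_values[a] - row_expected
            regrets[1][a] += col_values[a] - col_expected
            strategy_sums[0][a] += x[a]
            strategy_sums[1][a] += y[a]
    return [[s / rounds for s in sums] for sums in strategy_sums]


# Start away from the uniform strategy so the dynamics are not trivially stationary.
AVERAGE = regret_matching(RPS, 10000, ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))
print(AVERAGE)
```

In a zero-sum game the averaged strategies approach a Nash equilibrium, here the uniform mix. The paper's contribution concerns the harder extensive-form setting, where regret has to be decomposed across decision points.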

Approximation · INFORMS · Nash equilibrium · Reinforcement learning · Learned

June 15, 2020

Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games

Stephen McAleer,John Lanier,Roy Fox,Pierre Baldi

from arxiv, SM and JL contributed equally

Finding approximate Nash equilibria in zero-sum imperfect-information games is challenging when the number of information states is large. Policy Space Response Oracles (PSRO) is a deep reinforcement learning algorithm grounded in game theory that is guaranteed to converge to an approximate Nash equilibrium. However, PSRO requires training a reinforcement learning policy at each iteration, making it too slow for large games. We show through counterexamples and experiments that DCH and Rectified PSRO, two existing approaches to scaling up PSRO, fail to converge even in small games. We introduce Pipeline PSRO (P2SRO), the first scalable general method for finding approximate Nash equilibria in large zero-sum imperfect-information games. P2SRO is able to parallelize PSRO with convergence guarantees by maintaining a hierarchical pipeline of reinforcement learning workers, each training against the policies generated by lower levels in the hierarchy. We show that unlike existing methods, P2SRO converges to an approximate Nash equilibrium, and does so faster as the number of parallel workers increases, across a variety of imperfect information games. We also introduce an open-source environment for Barrage Stratego, a variant of Stratego with an approximate game tree complexity of $10^{50}$. P2SRO is able to achieve state-of-the-art performance on Barrage Stratego and beats all existing bots.

Hidden state · Learned · Reinforcement learning · INFORMS · Imperfect information

March 22, 2018

Modeling Others using Oneself in Multi-Agent Reinforcement Learning

Roberta Raileanu,Emily Denton,Arthur Szlam,Rob Fergus

from arxiv, 10 pages, 16 figures, submitted to ICML 2018

We consider the multi-agent reinforcement learning setting with imperfect information in which each agent is trying to maximize its own utility. The reward function depends on the hidden state (or goal) of both agents, so the agents must infer the other players' hidden goals from their observed behavior in order to solve the tasks. We propose a new approach for learning in these domains: Self Other-Modeling (SOM), in which an agent uses its own policy to predict the other agent's actions and update its belief of their hidden state in an online manner. We evaluate this approach on three different tasks and show that the agents are able to learn better policies using their estimate of the other players' hidden states, in both cooperative and adversarial settings.

Rank · Pairwise · Maximum likelihood · Ranking · state-of-the-art

February 28, 2018

SQL-Rank: A Listwise Approach to Collaborative Ranking

Liwei Wu,Cho-Jui Hsieh,James Sharpnack

In this paper, we propose a listwise approach for constructing user-specific rankings in recommendation systems in a collaborative fashion. We contrast the listwise approach to previous pointwise and pairwise approaches, which are based on treating either each rating or each pairwise comparison as an independent instance respectively. By extending the work of Cao et al. (2007), we cast listwise collaborative ranking as maximum likelihood under a permutation model which applies probability mass to permutations based on a low rank latent score matrix. We present a novel algorithm called SQL-Rank, which can accommodate ties and missing data and can run in linear time. We develop a theoretical framework for analyzing listwise ranking methods based on a novel representation theory for the permutation model. Applying this framework to collaborative ranking, we derive asymptotic statistical rates as the numbers of users and items grow together. We conclude by demonstrating that our SQL-Rank method often outperforms current state-of-the-art algorithms for implicit feedback such as Weighted-MF and BPR and achieves favorable results when compared to explicit feedback algorithms such as matrix factorization and collaborative ranking.
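A permutation model of this kind can be illustrated with the Plackett-Luce form used in listwise ranking: items are drawn one at a time, each with probability proportional to a positive transform of its score among the items still remaining. The sketch below assumes an exponential transform and a single user's score vector; it does not reproduce SQL-Rank's handling of ties, missing data, or the low-rank factorization.

```python
from itertools import permutations
from math import exp


def permutation_probability(scores, ordering, phi=exp):
    """Probability of `ordering` (a sequence of item indices) under Plackett-Luce.

    `phi` is an assumed positive transform of the latent scores.
    """
    weights = [phi(scores[i]) for i in ordering]
    probability = 1.0
    for position, weight in enumerate(weights):
        # Chance of picking this item next among those not yet placed.
        probability *= weight / sum(weights[position:])
    return probability


# Hypothetical latent scores of three items for one user.
SCORES = [2.0, 0.5, 1.0]
for ordering in permutations(range(len(SCORES))):
    print(ordering, round(permutation_probability(SCORES, ordering), 4))
```

Maximum-likelihood training would then adjust the latent scores so that the observed orderings receive high probability. Under this model, the ordering that sorts items by decreasing score is the most likely one.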

GANs · Optimizer · GAN · MoDELS · Learned

January 30, 2018

Coulomb GANs: Provably Optimal Nash Equilibria via Potential Fields

Thomas Unterthiner,Bernhard Nessler,Calvin Seward,Günter Klambauer,Martin Heusel,Hubert Ramsauer,Sepp Hochreiter

from arxiv, Published as a conference paper at ICLR (International Conference on Learning Representations) 2018. Implementation available at //github.com/bioinf-jku/coulomb_gan

Generative adversarial networks (GANs) evolved into one of the most successful unsupervised techniques for generating realistic images. Even though it has recently been shown that GAN training converges, GAN models often end up in local Nash equilibria that are associated with mode collapse or otherwise fail to model the target distribution. We introduce Coulomb GANs, which pose the GAN learning problem as a potential field of charged particles, where generated samples are attracted to training set samples but repel each other. The discriminator learns a potential field while the generator decreases the energy by moving its samples along the vector (force) field determined by the gradient of the potential field. Through decreasing the energy, the GAN model learns to generate samples according to the whole target distribution and does not only cover some of its modes. We prove that Coulomb GANs possess only one Nash equilibrium which is optimal in the sense that the model distribution equals the target distribution. We show the efficacy of Coulomb GANs on a variety of image datasets. On LSUN and celebA, Coulomb GANs set a new state of the art and produce a previously unseen variety of different samples.