We study random-turn resource-allocation games. In the Trail of Lost Pennies, a counter moves on $\mathbb{Z}$. At each turn, Maxine stakes $a \in [0,\infty)$ and Mina stakes $b \in [0,\infty)$. The counter $X$ then moves adjacently, to the right with probability $\tfrac{a}{a+b}$. If $X_i \to -\infty$ in this infinite-turn game, Mina receives one unit and Maxine receives zero; if $X_i \to \infty$, then these receipts are zero and $x$, where $x > 0$ is a parameter of the game. Thus the net receipt of a given player is $-A+B$, where $A$ is the sum of her stakes and $B$ is her terminal receipt. The game was inspired by unbiased tug-of-war in~[PSSW] from 2009, but in fact it closely resembles the original version of tug-of-war, introduced in~[HarrisVickers87] in the economics literature in 1987. We show that the game has surprising features. For a natural class of strategies, Nash equilibria exist precisely when $x$ lies in $[\lambda,\lambda^{-1}]$, for a certain $\lambda \in (0,1)$. We indicate that $\lambda$ is remarkably close to one, proving that $\lambda \leq 0.999904$ and presenting clear numerical evidence that $\lambda \geq 1 - 10^{-4}$. For each $x \in [\lambda,\lambda^{-1}]$, we find countably many Nash equilibria. Each is roughly characterized by an integral {\em battlefield} index: when the counter is nearby, both players stake intensely, with rapid but asymmetric decay in stakes as it moves away. Our results advance the premises of~[HarrisVickers87,Konrad12] concerning fund management and the incentive-outcome relation, which plausibly hold for many player-funded stake-governed games. Alongside a companion treatment [HP22] of games with allocated budgets, we thus offer a detailed mathematical treatment of an illustrative class of tug-of-war games. We also review the separate developments of tug-of-war in economics and mathematics, in the hope that mathematicians will direct further attention to tug-of-war in its original resource-allocation guise.
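As an illustration of the game mechanics only, here is a minimal simulation sketch. The geometric stake rules and the truncation of the infinite-turn game at $\pm N$ are assumptions made for the example, not the equilibrium strategies of the paper.

```python
import random

# Minimal sketch of one play of the Trail of Lost Pennies, under
# hypothetical stake rules: each player's stake decays geometrically
# with distance from a chosen "battlefield" site (here, the origin).
# The infinite-turn game is truncated at +/-N purely for simulation.

def play(x=1.0, N=20, decay_maxine=0.5, decay_mina=0.6, max_turns=10**4):
    pos = 0
    spent_maxine = spent_mina = 0.0
    for _ in range(max_turns):
        a = decay_maxine ** abs(pos)   # Maxine's stake at this site
        b = decay_mina ** abs(pos)     # Mina's stake at this site
        spent_maxine += a
        spent_mina += b
        pos += 1 if random.random() < a / (a + b) else -1
        if pos >= N:    # proxy for X_i -> +infinity: Maxine receives x
            return x - spent_maxine, 0.0 - spent_mina
        if pos <= -N:   # proxy for X_i -> -infinity: Mina receives 1
            return 0.0 - spent_maxine, 1.0 - spent_mina
    return -spent_maxine, -spent_mina  # undecided within the horizon

# Each player's return is her net receipt -A+B: terminal receipt minus total stakes.
print(play())
```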
In the prophet inequality problem, a gambler faces a sequence of items arriving online with values drawn independently from known distributions. On seeing an item, the gambler must choose whether to accept its value as her reward and quit the game, or reject it and continue. The gambler's aim is to maximize her expected reward relative to the expected maximum of the values of all items. Since the seventies, a tight bound of 1/2 has been known for this competitive ratio in the setting where the items arrive in an adversarial order (Krengel and Sucheston, 1977, 1978). However, the optimal ratio remains unknown in the order selection setting, where the gambler selects the arrival order, as well as in prophet secretary, where the items arrive in a random order. Moreover, it is not even known whether a separation exists between the two settings. In this paper, we show that the power of order selection allows the gambler to guarantee a strictly better competitive ratio than if the items arrive randomly. For the order selection setting, we identify an instance for which Peng and Tang's (FOCS'22) state-of-the-art algorithm performs no better than their claimed competitive ratio of (approximately) 0.7251, thus illustrating the need for an improved approach. We therefore extend their design into a more general algorithm design framework, which we use to show that their ratio can be beaten, by designing a 0.7258-competitive algorithm. For the random order setting, we improve upon Correa, Saona and Ziliotto's (SODA'19) 0.732-hardness result, showing a hardness of 0.7254 for general algorithms, even in the setting where the gambler knows the arrival order beforehand, thus establishing a separation between the order selection and random order settings.
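For concreteness, here is a minimal sketch of the classical single-threshold rule for the adversarial-order setting: accept the first value of at least half the expected maximum, a rule known to guarantee the tight ratio of 1/2. The three distributions are arbitrary illustrative choices, and the expected maximum is estimated by Monte Carlo.

```python
import random
import statistics

# Sketch of the classical single-threshold prophet-inequality rule:
# accept the first item worth at least E[max]/2, else end with nothing.

def expected_max(dists, trials=20000):
    return statistics.fmean(max(d() for d in dists) for _ in range(trials))

def gambler(dists, tau):
    vals = [d() for d in dists]                 # values revealed online
    return next((v for v in vals if v >= tau), 0.0)

dists = [lambda: random.uniform(0, 1),
         lambda: random.expovariate(1.0),
         lambda: random.betavariate(2, 5)]
tau = expected_max(dists) / 2
reward = statistics.fmean(gambler(dists, tau) for _ in range(20000))
# The empirical ratio should land comfortably above 1/2 on this instance.
print(f"gambler: {reward:.3f}  vs  prophet: {expected_max(dists):.3f}")
```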
We present FIT: a transformer-based architecture with efficient self-attention and adaptive computation. Unlike the original Transformer, which operates on a single sequence of data tokens, we divide the data tokens into groups, with each group being a shorter sequence of tokens. We employ two types of transformer layers: local layers operate on data tokens within each group, while global layers operate on a smaller set of introduced latent tokens. These layers, comprising the same set of self-attention and feed-forward layers as standard transformers, are interleaved, and cross-attention is used to facilitate information exchange between data and latent tokens within the same group. The attention complexity is $O(n^2)$ locally within each group of size $n$, but can reach $O(L^{4/3})$ globally for a sequence of length $L$. The efficiency can be further enhanced by relying more on global layers, which perform adaptive computation using a smaller set of latent tokens. FIT is a versatile architecture that can function as an encoder, diffusion decoder, or autoregressive decoder. We provide initial evidence demonstrating its effectiveness in high-resolution image understanding and generation tasks. Notably, FIT exhibits potential in performing end-to-end training on gigabit-scale data, such as 6400$\times$6400 images, even without specific optimizations or model parallelism.
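A minimal PyTorch-style sketch of one such interleaved block follows. The class name FITBlock, the layer sizes, and the omission of feed-forward sublayers and normalization are simplifications for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

# Sketch of one FIT-style block: data tokens are split into groups, a local
# layer attends within each group, latent tokens read from their group via
# cross-attention, a global layer attends over all latents, and the latents
# write back to the data tokens via cross-attention.

class FITBlock(nn.Module):
    def __init__(self, dim=64, heads=4, group=16, latents_per_group=4):
        super().__init__()
        self.group, self.m = group, latents_per_group
        self.latent = nn.Parameter(torch.randn(latents_per_group, dim))
        self.local = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.read = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.glob = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.write = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                      # x: (B, L, dim), L % group == 0
        B, L, D = x.shape
        g = x.view(B * L // self.group, self.group, D)
        g = g + self.local(g, g, g)[0]         # local self-attention per group
        z = self.latent.expand(g.size(0), -1, -1)
        z = z + self.read(z, g, g)[0]          # latents read their group
        z = z.reshape(B, -1, D)
        z = z + self.glob(z, z, z)[0]          # global self-attention over latents
        z = z.reshape(B * L // self.group, self.m, D)
        g = g + self.write(g, z, z)[0]         # latents write back to data tokens
        return g.reshape(B, L, D)

x = torch.randn(2, 64, 64)
print(FITBlock()(x).shape)                     # torch.Size([2, 64, 64])
```

The cost structure is visible in the shapes: self-attention is quadratic only within groups of size 16, while the global layer is quadratic in the much smaller latent sequence.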
We consider a facility location game in which $n$ agents reside at known locations on a path, and $k$ heterogeneous facilities are to be constructed on the path. Each agent is adversely affected by some subset of the facilities, and is unaffected by the others. We design two classes of mechanisms for choosing the facility locations given the reported agent preferences: utilitarian mechanisms that strive to maximize social welfare (i.e., to be efficient), and egalitarian mechanisms that strive to maximize the minimum welfare. For the utilitarian objective, we present a weakly group-strategyproof efficient mechanism for up to three facilities, we give strongly group-strategyproof mechanisms that achieve approximation ratios of $5/3$ and $2$ for $k=1$ and $k > 1$, respectively, and we prove that no strongly group-strategyproof mechanism achieves an approximation ratio less than $5/3$ for the case of a single facility. For the egalitarian objective, we present a strategyproof egalitarian mechanism for arbitrary $k$, and we prove that no weakly group-strategyproof mechanism achieves an approximation ratio of $o(\sqrt{n})$ for two facilities. We extend our egalitarian results to the case where the agents are located on a cycle, and we extend our first egalitarian result to the case where the agents are located in the unit square.
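As a concrete illustration of a group-strategyproof rule in this obnoxious-facility setting, consider the classical endpoint-voting rule for a single facility on the path $[0,1]$. This is a standard rule from the obnoxious facility location literature, commonly cited as group-strategyproof with approximation ratio 3; it is not the paper's $5/3$ mechanism.

```python
# Endpoint-voting sketch for one obnoxious facility on [0,1]: every agent
# wants the facility far away, so she votes for the endpoint farther from
# her reported location, and the majority endpoint is chosen. Misreporting
# can only move the facility closer to (or no farther from) the agent.

def endpoint_majority(locations):
    votes_for_one = sum(1 for x in locations if x <= 0.5)  # left half prefers endpoint 1
    return 1.0 if votes_for_one >= len(locations) - votes_for_one else 0.0

print(endpoint_majority([0.1, 0.2, 0.9]))  # -> 1.0 (two of three agents sit in the left half)
```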
Artificial Intelligence (AI) is one of the most transformative technologies of the 21st century. The extent and scope of future AI capabilities remain a key uncertainty, with widespread disagreement on timelines and potential impacts. As nations and technology companies race toward greater complexity and autonomy in AI systems, there are concerns over the extent of integration and oversight of opaque AI decision processes. This is especially true in the subfield of machine learning (ML), where systems learn to optimize objectives without human assistance. Objectives can be imperfectly specified or executed in an unexpected or potentially harmful way. This becomes more concerning as systems increase in power and autonomy, where an abrupt capability jump could result in unexpected shifts in power dynamics or even catastrophic failures. This study presents a hierarchical complex systems framework to model AI risk and provide a template for alternative futures analysis. Survey data were collected from domain experts in the public and private sectors to classify AI impact and likelihood. The results show increased uncertainty over the powerful AI agent scenario, confidence in multiagent environments, and increased concern over AI alignment failures and influence-seeking behavior.
Suppose that we have $n$ agents and $n$ items which lie in a shared metric space. We would like to match the agents to items such that the total distance from agents to their matched items is as small as possible. However, instead of having direct access to distances in the metric, we only have each agent's ranking of the items in order of distance. Given this limited information, what is the minimum possible worst-case approximation ratio (known as the distortion) that a matching mechanism can guarantee? Previous work by Caragiannis et al. proved that the (deterministic) Serial Dictatorship mechanism has distortion at most $2^n - 1$. We improve this by providing a simple deterministic mechanism that has distortion $O(n^2)$. We also provide the first nontrivial lower bound on this problem, showing that any matching mechanism (deterministic or randomized) must have worst-case distortion $\Omega(\log n)$. In addition to these new bounds, we show that a large class of truthful mechanisms derived from Deferred Acceptance all have worst-case distortion at least $2^n - 1$, and we find an intriguing connection between thin matchings (analogous to the well-known thin trees conjecture) and the distortion gap between deterministic and randomized mechanisms.
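For reference, here is a minimal sketch of the Serial Dictatorship mechanism discussed above: agents take turns in a fixed order, each claiming her top remaining item; distances are never consulted, only the ordinal rankings.

```python
# Serial Dictatorship: agent 0 takes her favorite item, agent 1 takes her
# favorite among the rest, and so on. Preferences are ranked lists of item
# indices, each ordered by (unobserved) distance in the metric.

def serial_dictatorship(prefs):
    available = set(range(len(prefs)))
    matching = {}
    for agent, ranking in enumerate(prefs):
        item = next(i for i in ranking if i in available)
        matching[agent] = item
        available.remove(item)
    return matching

# Three agents ranking three items; agent 0 picks first, and so on.
print(serial_dictatorship([[0, 1, 2], [0, 2, 1], [1, 0, 2]]))
# -> {0: 0, 1: 2, 2: 1}
```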
We study the hidden-action principal-agent problem in an online setting. In each round, the principal posts a contract that specifies the payment to the agent based on each outcome. The agent then makes a strategic choice of action that maximizes her own utility, but the action is not directly observable by the principal. The principal observes the outcome and receives utility from the agent's choice of action. Based on past observations, the principal dynamically adjusts the contracts with the goal of maximizing her utility. We introduce an online learning algorithm and provide an upper bound on its Stackelberg regret. We show that when the contract space is $[0,1]^m$, the Stackelberg regret is upper bounded by $\widetilde O(\sqrt{m} \cdot T^{1-1/(2m+1)})$, and lower bounded by $\Omega(T^{1-1/(m+2)})$, where $\widetilde O$ omits logarithmic factors. This result shows that exponential-in-$m$ samples are sufficient and necessary to learn a near-optimal contract, resolving an open problem on the hardness of online contract design. Moreover, when contracts are restricted to some subset $\mathcal{F} \subset [0,1]^m$, we define an intrinsic dimension of $\mathcal{F}$ that depends on the covering number of the spherical code in the space and bound the regret in terms of this intrinsic dimension. When $\mathcal{F}$ is the family of linear contracts, we show that the Stackelberg regret grows exactly as $\Theta(T^{2/3})$. The contract design problem is challenging because the utility function is discontinuous. Bounding the discretization error in this setting has been an open problem. In this paper, we identify a limited set of directions in which the utility function is continuous, allowing us to design a new discretization method and bound its error. This approach enables the first upper bound with no restrictions on the contract and action space.
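To make the interaction concrete, here is a sketch of a single round under made-up numbers (the outcome distributions, action costs, and rewards below are invented for the example): the principal posts $t \in [0,1]^m$, the agent best-responds out of the principal's view, and only the resulting outcome distribution shapes the principal's utility.

```python
import numpy as np

# One round of the hidden-action model, with illustrative data: m outcomes,
# three actions with outcome distributions F and private costs c.

m = 3
F = np.array([[0.6, 0.3, 0.1],      # outcome distribution of action 0
              [0.2, 0.5, 0.3],      # ... of action 1
              [0.1, 0.2, 0.7]])     # ... of action 2
c = np.array([0.0, 0.1, 0.3])       # action costs (the agent's private technology)
reward = np.array([0.0, 0.5, 1.0])  # principal's reward per outcome

def round_utilities(t):
    agent_util = F @ t - c                # expected payment minus cost, per action
    a = int(np.argmax(agent_util))        # agent's (unobserved) best response
    principal_util = F[a] @ (reward - t)  # expected reward minus expected payment
    return a, principal_util

print(round_utilities(np.array([0.0, 0.2, 0.6])))  # -> (1, 0.27)
```

The discontinuity mentioned in the abstract is visible here: an arbitrarily small change in $t$ can flip the argmax to a different action, abruptly changing the principal's utility.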
Recent work has considered whether large language models (LLMs) can function as planners: given a task, generate a plan. We investigate whether LLMs can serve as generalized planners: given a domain and training tasks, generate a program that efficiently produces plans for other tasks in the domain. In particular, we consider PDDL domains and use GPT-4 to synthesize Python programs. We also consider (1) Chain-of-Thought (CoT) summarization, where the LLM is prompted to summarize the domain and propose a strategy in words before synthesizing the program; and (2) automated debugging, where the program is validated with respect to the training tasks, and in case of errors, the LLM is re-prompted with four types of feedback. We evaluate this approach in seven PDDL domains and compare it to four ablations and four baselines. Overall, we find that GPT-4 is a surprisingly powerful generalized planner. We also conclude that automated debugging is very important, that CoT summarization has non-uniform impact, that GPT-4 is far superior to GPT-3.5, and that just two training tasks are often sufficient for strong generalization.
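The pipeline can be summarized with the following hedged sketch; llm, synthesize_prompt, and validate are hypothetical stand-ins, not the paper's actual prompts or interfaces.

```python
# Sketch of the generate-and-debug loop described above: CoT summarization,
# program synthesis, then repeated validation on the training tasks with
# feedback-driven re-prompting (feedback may be, e.g., a Python exception,
# a timeout, an invalid plan step, or an unmet goal).

MAX_DEBUG_ROUNDS = 4

def generalized_planner(domain_pddl, train_tasks):
    # (1) Chain-of-Thought summarization before any code is written.
    strategy = llm(f"Summarize this domain and propose a strategy:\n{domain_pddl}")
    program = llm(synthesize_prompt(domain_pddl, train_tasks, strategy))
    for _ in range(MAX_DEBUG_ROUNDS):
        # (2) Automated debugging: validate on training tasks, re-prompt on errors.
        feedback = validate(program, train_tasks)
        if feedback is None:
            return program          # all training tasks solved
        program = llm(f"Fix this program.\nProgram:\n{program}\nFeedback:\n{feedback}")
    return program
```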
In this paper, we analyze the supercloseness property of a two-dimensional singularly perturbed convection-diffusion problem with exponential boundary layers. The local discontinuous Galerkin (LDG) method with piecewise tensor-product polynomials of degree $k$ is applied on a Bakhvalov-type mesh. By developing special two-dimensional local Gauss-Radau projections and establishing a novel interpolation, we obtain supercloseness of optimal order $k+1$ on the Bakhvalov-type mesh. It is crucial to highlight that this supercloseness result is independent of the singular perturbation parameter.
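For concreteness, a standard two-dimensional model problem of this kind (the data here are generic, not taken from the paper) is
\[
-\varepsilon \Delta u + \mathbf{b}\cdot\nabla u + c\,u = f \ \ \text{in } \Omega=(0,1)^2, \qquad u = 0 \ \ \text{on } \partial\Omega,
\]
with $0 < \varepsilon \ll 1$ and $\mathbf{b}$ bounded below componentwise by positive constants. Its solution typically exhibits exponential boundary layers, of the form $\exp(-\beta(1-x)/\varepsilon)$, near the outflow edges $x=1$ and $y=1$, and a Bakhvalov-type mesh is graded precisely to resolve these layers.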
In this paper, we propose the Ordered Median Tree Location Problem (OMT). The OMT is a single-allocation facility location problem in which $p$ facilities must be placed on a network and connected to one another by a non-directed tree. The objective is to minimize the sum of the ordered weighted averaged allocation costs plus the sum of the costs of connecting the facilities in the tree. We present different MILP formulations for the OMT based on properties of the minimum spanning tree problem and of ordered median optimization. Since ordered median hub location problems are rather difficult to solve, we improve the OMT solution performance by introducing covering variables in a valid reformulation and by developing two pre-processing phases that reduce the size of these formulations. In addition, we propose a Benders decomposition algorithm for the OMT. We establish an empirical comparison between these new formulations, and we provide enhancements that, together with a proper formulation, allow us to solve medium-sized instances on general random graphs.
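In schematic form (the notation here is ours, for illustration only), the OMT objective combines an ordered-median allocation term with a tree-connection term:
\[
\min \ \sum_{i=1}^{n} \lambda_i\, d_{(i)} \;+\; \sum_{e \in T} c_e,
\]
where $d_{(1)} \leq \cdots \leq d_{(n)}$ are the agents' allocation costs sorted in non-decreasing order, $\lambda$ is the ordered-median weight vector, $T$ is the non-directed tree interconnecting the $p$ open facilities, and $c_e$ are its edge costs.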
We introduce DeepNash, an autonomous agent capable of learning to play the imperfect-information game Stratego from scratch, up to a human expert level. Stratego is one of the few iconic board games that Artificial Intelligence (AI) has not yet mastered. This popular game has an enormous game tree, on the order of $10^{535}$ nodes, i.e., $10^{175}$ times larger than that of Go. It has the additional complexity of requiring decision-making under imperfect information, similar to Texas hold'em poker, which has a significantly smaller game tree (on the order of $10^{164}$ nodes). Decisions in Stratego are made over a large number of discrete actions with no obvious link between action and outcome. Episodes are long, often with hundreds of moves before a player wins, and situations in Stratego cannot easily be broken down into manageably sized sub-problems, as they can in poker. For these reasons, Stratego has been a grand challenge for the field of AI for decades, and existing AI methods barely reach an amateur level of play. DeepNash uses a game-theoretic, model-free deep reinforcement learning method, without search, that learns to master Stratego via self-play. The Regularised Nash Dynamics (R-NaD) algorithm, a key component of DeepNash, converges to an approximate Nash equilibrium, instead of 'cycling' around it, by directly modifying the underlying multi-agent learning dynamics. DeepNash beats existing state-of-the-art AI methods in Stratego and achieved a yearly (2022) and all-time top-3 rank on the Gravon games platform, competing with human expert players.
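In heavily simplified form, the "regularise, solve, iterate" pattern behind R-NaD can be sketched on a zero-sum matrix game. This illustrates the idea only: DeepNash's actual implementation is a deep reinforcement-learning system with neural policies, not the explicit tabular updates below.

```python
import numpy as np

# Toy sketch of the R-NaD pattern on rock-paper-scissors: each player's
# payoff is penalised toward a regularisation policy, the inner dynamics
# are run to (near) their fixed point, and that fixed point becomes the
# next regularisation policy. The regularisation damps the cycling that
# plain learning dynamics exhibit around the Nash equilibrium.

A = np.array([[0.0, 1.0, -1.0],
              [-1.0, 0.0, 1.0],
              [1.0, -1.0, 0.0]])     # row player's payoffs

def rnad_step(p, q, p_reg, q_reg, eta=0.2, lr=0.1, inner=2000):
    for _ in range(inner):
        # transformed payoffs: own policy penalised toward the reg. policy
        up = A @ q - eta * (np.log(p) - np.log(p_reg))
        uq = -A.T @ p - eta * (np.log(q) - np.log(q_reg))
        p = p * np.exp(lr * (up - p @ up)); p /= p.sum()   # multiplicative-
        q = q * np.exp(lr * (uq - q @ uq)); q /= q.sum()   # weights updates
    return p, q

p_reg = q_reg = np.ones(3) / 3
p = np.array([0.8, 0.1, 0.1]); q = np.array([0.1, 0.8, 0.1])
for _ in range(5):                    # outer loop: refresh the reg. policies
    p, q = rnad_step(p, q, p_reg, q_reg)
    p_reg, q_reg = p.copy(), q.copy()
print(p.round(3), q.round(3))         # expected to drift toward (1/3, 1/3, 1/3)
```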