Game-theoretic interactions with AI agents could differ from traditional human-human interactions in various ways. One such difference is that it may be possible to simulate an AI agent (for example, because its source code is known), which allows others to accurately predict the agent's actions. This could lower the bar for trust and cooperation. In this paper, we formalize games in which one player can simulate another at a cost. We first derive some basic properties of such games and then prove a number of results for them, including: (1) introducing simulation into generic-payoff normal-form games makes them easier to solve; (2) if the only obstacle to cooperation is a lack of trust in the possibly-simulated agent, simulation enables equilibria that improve the outcome for both agents; and (3) there are nevertheless settings where introducing simulation results in strictly worse outcomes for both players.
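As a rough illustration of result (2), the sketch below checks a candidate equilibrium in a small trust game with costly simulation. Everything in it is an assumption made for illustration and is not taken from the paper: the payoff numbers, the action names, the helper functions `p1_values` and `p2_values`, and the modelling choice that simulating reveals player 2's realized action, after which player 1 best-responds and pays the cost.

```python
from fractions import Fraction

# Hypothetical trust-game payoffs (u1, u2). These numbers are illustrative
# assumptions, not values from the paper.
PAYOFF = {
    ("walk", "coop"): (0, 0),
    ("walk", "defect"): (0, 0),
    ("trust", "coop"): (1, 1),
    ("trust", "defect"): (-1, 2),
}


def p1_values(p_coop, cost):
    """Expected payoff of each P1 option when P2 cooperates with prob. p_coop."""
    def blind(a1):
        # Acting without simulating: average over P2's mixed strategy.
        return (p_coop * PAYOFF[(a1, "coop")][0]
                + (1 - p_coop) * PAYOFF[(a1, "defect")][0])

    # Assumed simulation model: P1 learns P2's action, best-responds to it,
    # and pays the simulation cost.
    best_vs_coop = max(PAYOFF[(a, "coop")][0] for a in ("walk", "trust"))
    best_vs_defect = max(PAYOFF[(a, "defect")][0] for a in ("walk", "trust"))
    simulate = p_coop * best_vs_coop + (1 - p_coop) * best_vs_defect - cost
    return {"walk": blind("walk"), "trust": blind("trust"), "simulate": simulate}


def p2_values(p_sim, p_trust):
    """Expected payoff of each P2 action given P1's mix (the rest is 'walk')."""
    p_walk = 1 - p_sim - p_trust
    # With the payoffs above, a simulating P1 trusts a cooperator and walks
    # out on a defector; that response is hard-coded here.
    coop = ((p_sim + p_trust) * PAYOFF[("trust", "coop")][1]
            + p_walk * PAYOFF[("walk", "coop")][1])
    defect = (p_sim * PAYOFF[("walk", "defect")][1]
              + p_trust * PAYOFF[("trust", "defect")][1]
              + p_walk * PAYOFF[("walk", "defect")][1])
    return {"coop": coop, "defect": defect}


cost = Fraction(1, 10)
# Candidate equilibrium: P2 cooperates with probability 1 - cost, and P1
# simulates or trusts blindly with probability 1/2 each.
v1 = p1_values(1 - cost, cost)
v2 = p2_values(Fraction(1, 2), Fraction(1, 2))
print(v1)
print(v2)
```

Under these assumed payoffs, trusting blindly and simulating both give player 1 a payoff of 4/5, walking out gives 0, and both of player 2's actions give a payoff of 1, so neither player gains by deviating. Without simulation, player 2 would defect whenever trusted and player 1 would walk out, leaving both at 0, so in this toy instance the simulation option improves the outcome for both players.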

Related content

Better · Data acquisition · Agent · Machine Learning · Learning

July 6, 2023

Effects of data time lag in a decision-making system using machine learning for pork price prediction

Mario Suaza-Medina,F. Javier Zarazaga-Soria,Jorge Pinilla-Lopez,Francisco J. López-Pellicer,Javier Lacasta

from arxiv, Published in "Neural Computing and Applications"

Spain is the third-largest producer of pork meat in the world, and many farms in several regions depend on the evolution of this market. However, the current pricing system is unfair, as some actors have better market information than others. In this context, historical pricing is an easy-to-find and affordable data source that can help all agents to be better informed. However, the time lag in data acquisition can affect their pricing decisions. In this paper, we study the effect that data acquisition delay has on a price prediction system using multiple prediction algorithms. We describe the integration of the best proposal into a decision support system prototype and test it in a real-case scenario. Specifically, we use public data from the most important regional pork meat markets in Spain published by the Ministry of Agriculture with a two-week delay and subscription-based data of the same markets obtained on the same day. The results show that the error difference between the best public and data subscription models is 0.6 Euro cents in favor of the data without delay. The market dimension makes these differences significant in the supply chain, giving pricing agents a better tool to negotiate market prices.

INFORMS · Statistic · Origin · Projection · Code

July 5, 2023

Replicability of Simulation Studies for the Investigation of Statistical Methods: The RepliSims Project

K. Luijken,A. Lohmann,U. Alter,J. Claramunt Gonzalez,F. J. Clouth,J. L. Fossum,L. Hesen,A. H. J. Huizing,J. Ketelaar,A. K. Montoya,L. Nab,R. C. C. Nijman,B. B. L. Penning de Vries,T. D. Tibbe,Y. A. Wang,R. H. H. Groenwold

from arxiv, 36 pages, 0 figures

Results of simulation studies evaluating the performance of statistical methods are often considered actionable and thus can have a major impact on the way empirical research is implemented. However, so far there is limited evidence about the reproducibility and replicability of statistical simulation studies. Therefore, eight highly cited statistical simulation studies were selected, and their replicability was assessed by teams of replicators with formal training in quantitative methodology. The teams found relevant information in the original publications and used it to write simulation code with the aim of replicating the results. The primary outcome was the feasibility of replicability based on reported information in the original publications. Replicability varied greatly: Some original studies provided detailed information leading to almost perfect replication of results, whereas other studies did not provide enough information to implement any of the reported simulations. Replicators had to make choices regarding missing or ambiguous information in the original studies, error handling, and software environment. Factors facilitating replication included public availability of code, and descriptions of the data-generating procedure and methods in graphs, formulas, structured text, and publicly accessible additional resources such as technical reports. Replicability of statistical simulation studies was mainly impeded by lack of information and sustainability of information sources. Reproducibility could be achieved for simulation studies by providing open code and data as a supplement to the publication. Additionally, simulation studies should be transparently reported with all relevant information either in the research paper itself or in easily accessible supplementary material to allow for replicability.

Oracle · Setting · Learning · Concept class · Online

July 4, 2023

Online Learning and Solving Infinite Games with an ERM Oracle

Angelos Assos,Idan Attias,Yuval Dagan,Constantinos Daskalakis,Maxwell Fishelson

from arxiv, In COLT2023

While ERM suffices to attain near-optimal generalization error in the stochastic learning setting, this is not known to be the case in the online learning setting, where algorithms for general concept classes rely on computationally inefficient oracles such as the Standard Optimal Algorithm (SOA). In this work, we propose an algorithm for the online binary classification setting that relies solely on ERM oracle calls, and show that it has finite regret in the realizable setting and sublinearly growing regret in the agnostic setting. We bound the regret in terms of the Littlestone and threshold dimensions of the underlying concept class. We obtain similar results for nonparametric games, where the ERM oracle can be interpreted as a best response oracle, finding the best response of a player to a given history of play of the other players. In this setting, we provide learning algorithms that only rely on best response oracles and converge to approximate-minimax equilibria in two-player zero-sum games and approximate coarse correlated equilibria in multi-player general-sum games, as long as the game has a bounded fat-threshold dimension. Our algorithms apply to both binary-valued and real-valued games and can be viewed as providing justification for the wide use of double oracle and multiple oracle algorithms in the practice of solving large games.

Heuristic algorithm · Approximation · SAT · MoDELS · Example

July 4, 2023

Heuristic Algorithms for the Approximation of Mutual Coherence

Gregor Betz,Vera Chekan,Tamara Mchedlidze

from arxiv, Results from 2021

Mutual coherence is a measure of similarity between two opinions. Although the notion comes from philosophy, it is essential for a wide range of technologies, e.g., the Wahl-O-Mat system. In Germany, this system helps voters to find candidates that are the closest to their political preferences. The exact computation of mutual coherence is highly time-consuming due to the iteration over all subsets of an opinion. Moreover, for every subset, an instance of the SAT model counting problem has to be solved which is known to be a hard problem in computer science. This work is the first study to accelerate this computation. We model the distribution of the so-called confirmation values as a mixture of three Gaussians and present efficient heuristics to estimate its model parameters. The mutual coherence is then approximated with the expected value of the distribution. Some of the presented algorithms are fully polynomial-time, others only require solving a small number of instances of the SAT model counting problem. The average squared error of our best algorithm lies below 0.0035 which is insignificant if the efficiency is taken into account. Furthermore, the accuracy is precise enough to be used in Wahl-O-Mat-like systems.

Noise · INTERACT · Understandability · Setting · Lecture notes

July 4, 2023

Noisy Games: A Study on the Effect of Noise on Game Specifications

Constantinos Varsos,Giorgos Flouris,Marina Bitsaki

We consider misinformation games, i.e., multi-agent interactions where the players are misinformed with regard to the game that they play, essentially having an incorrect understanding of the game setting, without being aware of their misinformation. In this paper, we introduce and study a new family of misinformation games, called Noisy games, where misinformation is due to structured (white) noise that additively affects the payoff values of players. We analyse the general properties of Noisy games and derive theoretical formulas related to "behavioural consistency", i.e., the probability that the players' behaviour will not be significantly affected by the noise. We show several properties of these formulas, and present an experimental evaluation that validates and visualises these results.

Estimation/Estimator · anchor · INFORMS · Agent · Performer

June 30, 2023

On the Limits of Single Anchor Localization: Near-Field vs Far-Field

Don-Roberts Emenonye,Harpreet S. Dhillon,R. Michael Buehrer

It is well known that a single anchor can be used to determine the position and orientation of an agent communicating with it. However, it is not clear what information about the anchor or the agent is necessary to perform this localization, especially when the agent is in the near-field of the anchor. Hence, in this paper, to investigate the limits of localizing an agent with some uncertainty in the anchor location, we consider a wireless link consisting of source and destination nodes. More specifically, we present a Fisher information theoretic investigation of the possibility of estimating different combinations of the source and destination's position and orientation from the signal received at the destination. To present a comprehensive study, we perform this Fisher information theoretic investigation under both the near-field and far-field propagation models. One of the key insights is that while the source or destination's 3D orientation can be jointly estimated with the source or destination's 3D position in the near-field propagation regime, only the source or destination's 2D orientation can be jointly estimated with the source or destination's 2D position in the far-field propagation regime. Also, a simulation of the FIM indicates that in the near-field, we can estimate the source's 3D orientation angles with no beamforming, but in the far-field, we cannot estimate the source's 2D orientation angles when no beamforming is employed.

Learning · Imperfect information · Agent · Reinforcement learning · Self-Play

June 30, 2022

Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning

Julien Perolat,Bart de Vylder,Daniel Hennes,Eugene Tarassov,Florian Strub,Vincent de Boer,Paul Muller,Jerome T. Connor,Neil Burch,Thomas Anthony,Stephen McAleer,Romuald Elie,Sarah H. Cen,Zhe Wang,Audrunas Gruslys,Aleksandra Malysheva,Mina Khan,Sherjil Ozair,Finbarr Timbers,Toby Pohlen,Tom Eccles,Mark Rowland,Marc Lanctot,Jean-Baptiste Lespiau,Bilal Piot,Shayegan Omidshafiei,Edward Lockhart,Laurent Sifre,Nathalie Beauguerlange,Remi Munos,David Silver,Satinder Singh,Demis Hassabis,Karl Tuyls

We introduce DeepNash, an autonomous agent capable of learning to play the imperfect information game Stratego from scratch, up to a human expert level. Stratego is one of the few iconic board games that Artificial Intelligence (AI) has not yet mastered. This popular game has an enormous game tree on the order of $10^{535}$ nodes, i.e., $10^{175}$ times larger than that of Go. It has the additional complexity of requiring decision-making under imperfect information, similar to Texas hold'em poker, which has a significantly smaller game tree (on the order of $10^{164}$ nodes). Decisions in Stratego are made over a large number of discrete actions with no obvious link between action and outcome. Episodes are long, often with hundreds of moves before a player wins, and situations in Stratego cannot easily be broken down into manageably-sized sub-problems as in poker. For these reasons, Stratego has been a grand challenge for the field of AI for decades, and existing AI methods barely reach an amateur level of play. DeepNash uses a game-theoretic, model-free deep reinforcement learning method, without search, that learns to master Stratego via self-play. The Regularised Nash Dynamics (R-NaD) algorithm, a key component of DeepNash, converges to an approximate Nash equilibrium, instead of 'cycling' around it, by directly modifying the underlying multi-agent learning dynamics. DeepNash beats existing state-of-the-art AI methods in Stratego and achieved a yearly (2022) and all-time top-3 rank on the Gravon games platform, competing with human expert players.

Optimizer · INTERACT · Networking · Knowledge · Performer

May 11, 2022

Dynamic neighbourhood optimisation for task allocation using multi-agent

Niall Creech,Natalia Criado Pacheco,Simon Miles

from arxiv, 28 pages

In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation. The number of interactions is limited by resource constraints such as on computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communications and synchronisation, and is difficult to scale. In this paper we present four algorithms to solve these problems. The combination of these algorithms enables each agent to improve their task allocation strategy through reinforcement learning, while changing how much they explore the system in response to how optimal they believe their current strategy is, given their past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource usage limits, limiting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, to then carry out those tasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task allocation problem to within 6.7% of the theoretical optimum within the system configurations considered. It provides 5x better performance recovery over no-knowledge retention approaches when system connectivity is impacted, and is tested against systems up to 100 agents with less than a 9% impact on the algorithms' performance.

Graph · Networking · Processing (programming language) · Graph convolution · Graph convolutional neural network/Graph convolutional network

December 27, 2021

Powerful Graph Convolutional Networks with Adaptive Propagation Mechanism for Homophily and Heterophily

Tao Wang,Rui Wang,Di Jin,Dongxiao He,Yuxiao Huang

Graph Convolutional Networks (GCNs) have been widely applied in various fields due to their significant power in processing graph-structured data. A typical GCN and its variants work under a homophily assumption (i.e., nodes of the same class are prone to connect to each other), while ignoring the heterophily that exists in many real-world networks (i.e., nodes of different classes tend to form edges). Existing methods deal with heterophily mainly by aggregating higher-order neighborhoods or combining the immediate representations, which leads to noise and irrelevant information in the result. However, these methods do not change the propagation mechanism, a fundamental part of GCNs, which works under the homophily assumption. This makes it difficult to distinguish the representations of nodes from different classes. To address this problem, in this paper we design a novel propagation mechanism, which can automatically change the propagation and aggregation process according to homophily or heterophily between node pairs. To adaptively learn the propagation process, we introduce two measurements of homophily degree between node pairs, which are learned based on topological and attribute information, respectively. Then we incorporate the learnable homophily degree into the graph convolution framework, which is trained in an end-to-end schema, enabling it to go beyond the assumption of homophily. More importantly, we theoretically prove that our model can constrain the similarity of representations between nodes according to their homophily degree. Experiments on seven real-world datasets demonstrate that this new approach outperforms the state-of-the-art methods under heterophily or low homophily, and gains competitive performance under homophily.

Networking · Learned · Principle · MoDELS · Networks

June 18, 2021

The Principles of Deep Learning Theory

Daniel A. Roberts,Sho Yaida,Boris Hanin

from arxiv, 451 pages, to be published by Cambridge University Press

This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio of the network controlling the deviations from the infinite-width Gaussian description. We explain how these effectively-deep networks learn nontrivial representations from training and more broadly analyze the mechanism of representation learning for nonlinear models. From a nearly-kernel-methods perspective, we find that the dependence of such models' predictions on the underlying learning algorithm can be expressed in a simple and universal way. To obtain these results, we develop the notion of representation group flow (RG flow) to characterize the propagation of signals through the network. By tuning networks to criticality, we give a practical solution to the exploding and vanishing gradient problem. We further explain how RG flow leads to near-universal behavior and lets us categorize networks built from different activation functions into universality classes. Altogether, we show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks. By using information-theoretic techniques, we estimate the optimal aspect ratio at which we expect the network to be practically most useful and show how residual connections can be used to push this scale to arbitrary depths. With these tools, we can learn in detail about the inductive bias of architectures, hyperparameters, and optimizers.