黄片一级在线视频播放,人人爽人人看人人硬,国产深夜激情一区二区,亚洲欧美高清激情精品一区

Systems of competing agents can often be modeled as games. Assuming rationality, the most likely outcomes are given by an equilibrium (e.g. a Nash equilibrium). In many practical settings, games are influenced by context, i.e. additional data beyond the control of any agent (e.g. weather for traffic and fiscal policy for market economies). Often the exact game mechanics are unknown, yet vast amounts of historical data consisting of (context, equilibrium) pairs are available, raising the possibility of learning a solver which predicts the equilibria given only the context. We introduce Nash Fixed Point Networks (N-FPNs), a class of neural networks that naturally output equilibria. Crucially, N- FPNs employ a constraint decoupling scheme to handle complicated agent action sets while avoiding expensive projections. Empirically, we find N-FPNs are compatible with the recently developed Jacobian-Free Backpropagation technique for training implicit networks, making them significantly faster and easier to train than prior models. Our experiments show N-FPNs are capable of scaling to problems orders of magnitude larger than existing learned game solvers.

相關內容

Agent

關注 15

近似 · 模型評估 · Learning · 設計 · 強化學習 ·

2023 年 12 月 29 日

Design Space Exploration of Approximate Computing Techniques with a Reinforcement Learning Approach

Sepide Saeedi,Alessandro Savino,Stefano Di Carlo

Approximate Computing (AxC) techniques have become increasingly popular in trading off accuracy for performance gains in various applications. Selecting the best AxC techniques for a given application is challenging. Among proposed approaches for exploring the design space, Machine Learning approaches such as Reinforcement Learning (RL) show promising results. In this paper, we proposed an RL-based multi-objective Design Space Exploration strategy to find the approximate versions of the application that balance accuracy degradation and power and computation time reduction. Our experimental results show a good trade-off between accuracy degradation and decreased power and computation time for some benchmarks.

state-of-the-art · 樣本 · 線性的 · 博弈論 ·

2023 年 12 月 28 日

Randomness Requirements and Asymmetries in Nash Equilibria

Edan Orzech,Martin Rinard

In general, Nash equilibria in normal-form games may require players to play (probabilistically) mixed strategies. We define a measure of the complexity of finite probability distributions and study the complexity required to play NEs in finite two player $n\times n$ games with rational payoffs. Our central results show that there exist games in which there is an exponential vs. linear gap in the complexity of the mixed distributions that the two players play at (the unique) NE. This gap induces gaps in the amount of space required to represent and sample from the corresponding distributions using known state-of-the-art sampling algorithms. We also establish upper and lower bounds on the complexity of any NE in the games that we study. These results highlight (i) the nontriviality of the assumption that players can any mixed strategy and (ii) the disparities in resources that players may require to play NEs in the games that we study.

Learning · Agent · 全 · 優化器 · INFORMS ·

2023 年 12 月 28 日

Achieving Maximum Utilization in Optimal Time for Learning or Convergence in the Kolkata Paise Restaurant Problem

Aniruddha Biswas,Antika Sinha,Bikas K. Chakrabarti

from arxiv, 9 pages, 6 figures included in manuscript; submitted to Indian Journal of Physics

The objective of the KPR agents are to learn themselves in the minimum (learning) time to have maximum success or utilization probability ($f$). A dictator can easily solve the problem with $f = 1$ in no time, by asking every one to form a queue and go to the respective restaurant, resulting in no fluctuation and full utilization from the first day (convergence time $\tau = 0$). It has already been shown that if each agent chooses randomly the restaurants, $f = 1 - e^{-1} \simeq 0.63$ (where $e \simeq 2.718$ denotes the Euler number) in zero time ($\tau = 0$). With the only available information about yesterday's crowd size in the restaurant visited by the agent (as assumed for the rest of the strategies studied here), the crowd avoiding (CA) strategies can give higher values of $f$ but also of $\tau$. Several numerical studies of modified learning strategies actually indicated increased value of $f = 1 - \alpha$ for $\alpha \to 0$, with $\tau \sim 1/\alpha$. We show here using Monte Carlo technique, a modified Greedy Crowd Avoiding (GCA) Strategy can assure full utilization ($f = 1$) in convergence time $\tau \simeq eN$, with of course non-zero probability for an even larger convergence time. All these observations suggest that the strategies with single step memory of the individuals can never collectively achieve full utilization ($f = 1$) in finite convergence time and perhaps the maximum possible utilization that can be achieved is about eighty percent ($f \simeq 0.80$) in an optimal time $\tau$ of order ten, even when $N$ the number of customers or of the restaurants goes to infinity.

道德化 · 極小點 · Agent · Continuity · MoDELS ·

2023 年 7 月 2 日

Minimum Levels of Interpretability for Artificial Moral Agents

Avish Vijayaraghavan,Cosmin Badea

As artificial intelligence (AI) models continue to scale up, they are becoming more capable and integrated into various forms of decision-making systems. For models involved in moral decision-making, also known as artificial moral agents (AMA), interpretability provides a way to trust and understand the agent's internal reasoning mechanisms for effective use and error correction. In this paper, we provide an overview of this rapidly-evolving sub-field of AI interpretability, introduce the concept of the Minimum Level of Interpretability (MLI) and recommend an MLI for various types of agents, to aid their safe deployment in real-world settings.

MoDELS · AIM · 評論員 · 語言模型化 · 知識 (knowledge) ·

2022 年 12 月 20 日

Towards Reasoning in Large Language Models: A Survey

Jie Huang,Kevin Chen-Chuan Chang

Reasoning is a fundamental aspect of human intelligence that plays a crucial role in activities such as problem solving, decision making, and critical thinking. In recent years, large language models (LLMs) have made significant progress in natural language processing, and there is observation that these models may exhibit reasoning abilities when they are sufficiently large. However, it is not yet clear to what extent LLMs are capable of reasoning. This paper provides a comprehensive overview of the current state of knowledge on reasoning in LLMs, including techniques for improving and eliciting reasoning in these models, methods and benchmarks for evaluating reasoning abilities, findings and implications of previous research in this field, and suggestions on future directions. Our aim is to provide a detailed and up-to-date review of this topic and stimulate meaningful discussion and future work.

MoDELS · 優化器 · Analysis · 推斷 · 估計/估計量 ·

2022 年 9 月 19 日

A Survey of Deep Causal Model

Zongyu Li,Zhenfeng Zhu

The concept of causality plays an important role in human cognition . In the past few decades, causal inference has been well developed in many fields, such as computer science, medicine, economics, and education. With the advancement of deep learning techniques, it has been increasingly used in causal inference against counterfactual data. Typically, deep causal models map the characteristics of covariates to a representation space and then design various objective optimization functions to estimate counterfactual data unbiasedly based on the different optimization methods. This paper focuses on the survey of the deep causal models, and its core contributions are as follows: 1) we provide relevant metrics under multiple treatments and continuous-dose treatment; 2) we incorporate a comprehensive overview of deep causal models from both temporal development and method classification perspectives; 3) we assist a detailed and comprehensive classification and analysis of relevant datasets and source code.

多樣性 · 學成 · state-of-the-art · MoDELS · 張成子空間 ·

2021 年 3 月 14 日

Modelling Behavioural Diversity for Learning in Open-Ended Games

Nicolas Perez Nieves,Yaodong Yang,Oliver Slumbers,David Henry Mguni,Jun Wang

from arxiv, corresponds to <[email protected]>

Promoting behavioural diversity is critical for solving games with non-transitive dynamics where strategic cycles exist, and there is no consistent winner (e.g., Rock-Paper-Scissors). Yet, there is a lack of rigorous treatment for defining diversity and constructing diversity-aware learning dynamics. In this work, we offer a geometric interpretation of behavioural diversity in games and introduce a novel diversity metric based on \emph{determinantal point processes} (DPP). By incorporating the diversity metric into best-response dynamics, we develop \emph{diverse fictitious play} and \emph{diverse policy-space response oracle} for solving normal-form games and open-ended games. We prove the uniqueness of the diverse best response and the convergence of our algorithms on two-player games. Importantly, we show that maximising the DPP-based diversity metric guarantees to enlarge the \emph{gamescape} -- convex polytopes spanned by agents' mixtures of strategies. To validate our diversity-aware solvers, we test on tens of games that show strong non-transitivity. Results suggest that our methods achieve much lower exploitability than state-of-the-art solvers by finding effective and diverse strategies.

Extensibility · GM · MoDELS · 類別 · 多代理人模型 ·

2021 年 2 月 9 日

Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice

Lewis Hammond,James Fox,Tom Everitt,Alessandro Abate,Michael Wooldridge

from arxiv, Accepted to the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-21)

Multi-agent influence diagrams (MAIDs) are a popular form of graphical model that, for certain classes of games, have been shown to offer key complexity and explainability advantages over traditional extensive form game (EFG) representations. In this paper, we extend previous work on MAIDs by introducing the concept of a MAID subgame, as well as subgame perfect and trembling hand perfect equilibrium refinements. We then prove several equivalence results between MAIDs and EFGs. Finally, we describe an open source implementation for reasoning about MAIDs and computing their equilibria.

視覺問答 · 自動問答 · MoDELS · 可辨認的 · 注意力機制 ·

2018 年 2 月 15 日

Learning to Count Objects in Natural Images for Visual Question Answering

Yan Zhang,Jonathon Hare,Adam Prügel-Bennett

from arxiv, Published in ICLR 2018

Visual Question Answering (VQA) models have struggled with counting objects in natural images so far. We identify a fundamental problem due to soft attention in these models as a cause. To circumvent this problem, we propose a neural network component that allows robust counting from object proposals. Experiments on a toy task show the effectiveness of this component and we obtain state-of-the-art accuracy on the number category of the VQA v2 dataset without negatively affecting other categories, even outperforming ensemble models with our single model. On a difficult balanced pair metric, the component gives a substantial improvement in counting over a strong baseline by 6.6%.

INFORMS · FM · 推薦系統 · CF · 分解的 ·

2018 年 1 月 8 日

Learning with Heterogeneous Side Information Fusion for Recommender Systems

Huan Zhao,Quanming Yao,Yangqiu Song,James Kwok,Dik Lun Lee

from arxiv, 35 pages, 12 figures

Recommender System (RS) is a hot area where artificial intelligence (AI) techniques can be effectively applied to improve performance. Since the well-known Netflix Challenge, collaborative filtering (CF) has become the most popular and effective recommendation method. Despite their success in CF, various AI techniques still have to face the data sparsity and cold start problems. Previous works tried to solve these two problems by utilizing auxiliary information, such as social connections among users and meta-data of items. However, they process different types of information separately, leading to information loss. In this work, we propose to utilize Heterogeneous Information Network (HIN), which is a natural and general representation of different types of data, to enhance CF-based recommending methods. HIN-based recommender systems face two problems: how to represent high-level semantics for recommendation and how to fuse the heterogeneous information to recommend. To address these problems, we propose to applying meta-graph to HIN-based RS and solve the information fusion problem with a "matrix factorization (MF) + factorization machine (FM)" framework. For the "MF" part, we obtain user-item similarity matrices from each meta-graph and adopt low-rank matrix approximation to get latent features for both users and items. For the "FM" part, we propose to apply FM with Group lasso (FMG) on the obtained features to simultaneously predict missing ratings and select useful meta-graphs. Experimental results on two large real-world datasets, i.e., Amazon and Yelp, show that our proposed approach is better than that of the state-of-the-art FM and other HIN-based recommending methods.