五月丁香四月婷婷激情综合_理伦片在线观看理伦片_免费午夜小视频在线观看_国产欧美久久久久久精品四区_欧美一区二区在线_亚洲国产K久久久久精品_亚洲欧美日韩动漫

In repeated games, such as auctions, players typically use learning algorithms to choose their actions. The use of such autonomous learning agents has become widespread on online platforms. In this paper, we explore the impact of players incorporating monetary transfers into their agents' algorithms, aiming to incentivize behavior in their favor. Our focus is on understanding when players have incentives to make use of monetary transfers, how these payments affect learning dynamics, and what the implications are for welfare and its distribution among the players. We propose a simple game-theoretic model to capture such scenarios. Our results on general games show that in a broad class of games, players benefit from letting their learning agents make payments to other learners during the game dynamics, and that in many cases, this kind of behavior improves welfare for all players. Our results on first- and second-price auctions show that in equilibria of the ``payment policy game,'' the agents' dynamics can reach strong collusive outcomes with low revenue for the auctioneer. These results highlight a challenge for mechanism design in systems where automated learning agents can benefit from interacting with their peers outside the boundaries of the mechanism.

相關內容

Agent

關注 16

穩健性 · 多峰值 · 輸出 · MoDELS · AIM ·

2024 年 7 月 12 日

Improving Alignment and Robustness with Circuit Breakers

Andy Zou,Long Phan,Justin Wang,Derek Duenas,Maxwell Lin,Maksym Andriushchenko,Rowan Wang,Zico Kolter,Matt Fredrikson,Dan Hendrycks

from arxiv, Code and models are available at //github.com/GraySwanAI/circuit-breakers

AI systems can take harmful actions and are highly vulnerable to adversarial attacks. We present an approach, inspired by recent advances in representation engineering, that interrupts the models as they respond with harmful outputs with "circuit breakers." Existing techniques aimed at improving alignment, such as refusal training, are often bypassed. Techniques such as adversarial training try to plug these holes by countering specific attacks. As an alternative to refusal training and adversarial training, circuit-breaking directly controls the representations that are responsible for harmful outputs in the first place. Our technique can be applied to both text-only and multimodal language models to prevent the generation of harmful outputs without sacrificing utility -- even in the presence of powerful unseen attacks. Notably, while adversarial robustness in standalone image recognition remains an open challenge, circuit breakers allow the larger multimodal system to reliably withstand image "hijacks" that aim to produce harmful content. Finally, we extend our approach to AI agents, demonstrating considerable reductions in the rate of harmful actions when they are under attack. Our approach represents a significant step forward in the development of reliable safeguards to harmful behavior and adversarial attacks.

MoDELS · 損失 · Boosting（一種模型訓練加速方式） · 可辨認的 · INFORMS ·

2024 年 7 月 11 日

Boosting Adversarial Transferability for Skeleton-based Action Recognition via Exploring the Model Posterior Space

Yunfeng Diao,Baiqi Wu,Ruixuan Zhang,Xun Yang,Meng Wang,He Wang

Skeletal motion plays a pivotal role in human activity recognition (HAR). Recently, attack methods have been proposed to identify the universal vulnerability of skeleton-based HAR(S-HAR). However, the research of adversarial transferability on S-HAR is largely missing. More importantly, existing attacks all struggle in transfer across unknown S-HAR models. We observed that the key reason is that the loss landscape of the action recognizers is rugged and sharp. Given the established correlation in prior studies~\cite{qin2022boosting,wu2020towards} between loss landscape and adversarial transferability, we assume and empirically validate that smoothing the loss landscape could potentially improve adversarial transferability on S-HAR. This is achieved by proposing a new post-train Dual Bayesian strategy, which can effectively explore the model posterior space for a collection of surrogates without the need for re-training. Furthermore, to craft adversarial examples along the motion manifold, we incorporate the attack gradient with information of the motion dynamics in a Bayesian manner. Evaluated on benchmark datasets, e.g. HDM05 and NTU 60, the average transfer success rate can reach as high as 35.9\% and 45.5\% respectively. In comparison, current state-of-the-art skeletal attacks achieve only 3.6\% and 9.8\%. The high adversarial transferability remains consistent across various surrogate, victim, and even defense models. Through a comprehensive analysis of the results, we provide insights on what surrogates are more likely to exhibit transferability, to shed light on future research.

Oracle · AI · 情景 · Analysis · MoDELS ·

2024 年 7 月 10 日

Playing Large Games with Oracles and AI Debate

Xinyi Chen,Angelica Chen,Dean Foster,Elad Hazan

We consider regret minimization in repeated games with a very large number of actions. Such games are inherent in the setting of AI Safety via Debate \cite{irving2018ai}, and more generally games whose actions are language-based. Existing algorithms for online game playing require per-iteration computation polynomial in the number of actions, which can be prohibitive for large games. We thus consider oracle-based algorithms, as oracles naturally model access to AI agents. With oracle access, we characterize when internal and external regret can be minimized efficiently. We give a novel efficient algorithm for simultaneous external and internal regret minimization whose regret depends logarithmically on the number of actions. We conclude with experiments in the setting of AI Safety via Debate that shows the benefit of insights from our algorithmic analysis.

Extensibility · Analysis · 大語言模型 · 語言模型化 · Claude ·

2024 年 7 月 10 日

Evaluating Large Language Models with Grid-Based Game Competitions: An Extensible LLM Benchmark and Leaderboard

Oguzhan Topsakal,Colby Jacob Edell,Jackson Bailey Harper

We introduce a novel and extensible benchmark for large language models (LLMs) through grid-based games such as Tic-Tac-Toe, Connect-Four, and Gomoku. The open-source game simulation code, available on GitHub, allows LLMs to compete and generates detailed data files in JSON, CSV, TXT, and PNG formats for leaderboard rankings and further analysis. We present the results of games among leading LLMs, including Claude 3.5 Sonnet and Claude 3 Sonnet by Anthropic, Gemini 1.5 Pro and Gemini 1.5 Flash by Google, GPT-4 Turbo and GPT-4o by OpenAI, and Llama3-70B by Meta. We also encourage submissions of results from other LLMs. In total, we simulated 2,310 matches (5 sessions for each pair among 7 LLMs and a random player) across three types of games, using three distinct prompt types: list, illustration, and image. The results revealed significant variations in LLM performance across different games and prompt types, with analysis covering win and disqualification rates, missed opportunity analysis, and invalid move analysis. The details of the leaderboard and result matrix data are available as open-access data on GitHub. This study enhances our understanding of LLMs' capabilities in playing games they were not specifically trained for, helping to assess their rule comprehension and strategic thinking. On the path to Artificial General Intelligence (AGI), this study lays the groundwork for future exploration into their utility in complex decision-making scenarios, illuminating their strategic thinking abilities and offering directions for further inquiry into the limits of LLMs within game-based frameworks.

分解的 · TEAM · AI · AIM · HCI ·

2024 年 7 月 10 日

The Human Factor in AI Red Teaming: Perspectives from Social and Collaborative Computing

Alice Qian Zhang,Ryland Shaw,Jacy Reese Anthis,Ashlee Milton,Emily Tseng,Jina Suh,Lama Ahmad,Ram Shankar Siva Kumar,Julian Posada,Benjamin Shestakofsky,Sarah T. Roberts,Mary L. Gray

from arxiv, Workshop proposal accepted to CSCW 2024

Rapid progress in general-purpose AI has sparked significant interest in "red teaming," a practice of adversarial testing originating in military and cybersecurity applications. AI red teaming raises many questions about the human factor, such as how red teamers are selected, biases and blindspots in how tests are conducted, and harmful content's psychological effects on red teamers. A growing body of HCI and CSCW literature examines related practices-including data labeling, content moderation, and algorithmic auditing. However, few, if any, have investigated red teaming itself. This workshop seeks to consider the conceptual and empirical challenges associated with this practice, often rendered opaque by non-disclosure agreements. Future studies may explore topics ranging from fairness to mental health and other areas of potential harm. We aim to facilitate a community of researchers and practitioners who can begin to meet these challenges with creativity, innovation, and thoughtful reflection.

MoDELS · Pivotal（公司） · 通用智能 · 語言模型化 · 多峰值 ·

2024 年 1 月 25 日

A Survey of Reasoning with Foundation Models

Jiankai Sun,Chuanyang Zheng,Enze Xie,Zhengying Liu,Ruihang Chu,Jianing Qiu,Jiaqi Xu,Mingyu Ding,Hongyang Li,Mengzhe Geng,Yue Wu,Wenhai Wang,Junsong Chen,Zhangyue Yin,Xiaozhe Ren,Jie Fu,Junxian He,Wu Yuan,Qi Liu,Xihui Liu,Yu Li,Hao Dong,Yu Cheng,Ming Zhang,Pheng Ann Heng,Jifeng Dai,Ping Luo,Jingdong Wang,Ji-Rong Wen,Xipeng Qiu,Yike Guo,Hui Xiong,Qun Liu,Zhenguo Li

from arxiv, 20 Figures, 160 Pages, 750+ References, Project Page //github.com/reasoning-survey/Awesome-Reasoning-Foundation-Models

Reasoning, a crucial ability for complex problem-solving, plays a pivotal role in various real-world settings such as negotiation, medical diagnosis, and criminal investigation. It serves as a fundamental methodology in the field of Artificial General Intelligence (AGI). With the ongoing development of foundation models, e.g., Large Language Models (LLMs), there is a growing interest in exploring their abilities in reasoning tasks. In this paper, we introduce seminal foundation models proposed or adaptable for reasoning, highlighting the latest advancements in various reasoning tasks, methods, and benchmarks. We then delve into the potential future directions behind the emergence of reasoning abilities within foundation models. We also discuss the relevance of multimodal learning, autonomous agents, and super alignment in the context of reasoning. By discussing these future research directions, we hope to inspire researchers in their exploration of this field, stimulate further advancements in reasoning with foundation models, and contribute to the development of AGI.

MoDELS · AIM · 評論員 · 語言模型化 · 知識 (knowledge) ·

2022 年 12 月 20 日

Towards Reasoning in Large Language Models: A Survey

Jie Huang,Kevin Chen-Chuan Chang

Reasoning is a fundamental aspect of human intelligence that plays a crucial role in activities such as problem solving, decision making, and critical thinking. In recent years, large language models (LLMs) have made significant progress in natural language processing, and there is observation that these models may exhibit reasoning abilities when they are sufficiently large. However, it is not yet clear to what extent LLMs are capable of reasoning. This paper provides a comprehensive overview of the current state of knowledge on reasoning in LLMs, including techniques for improving and eliciting reasoning in these models, methods and benchmarks for evaluating reasoning abilities, findings and implications of previous research in this field, and suggestions on future directions. Our aim is to provide a detailed and up-to-date review of this topic and stimulate meaningful discussion and future work.

MoDELS · 優化器 · Analysis · 推斷 · 估計/估計量 ·

2022 年 9 月 19 日

A Survey of Deep Causal Model

Zongyu Li,Zhenfeng Zhu

The concept of causality plays an important role in human cognition . In the past few decades, causal inference has been well developed in many fields, such as computer science, medicine, economics, and education. With the advancement of deep learning techniques, it has been increasingly used in causal inference against counterfactual data. Typically, deep causal models map the characteristics of covariates to a representation space and then design various objective optimization functions to estimate counterfactual data unbiasedly based on the different optimization methods. This paper focuses on the survey of the deep causal models, and its core contributions are as follows: 1) we provide relevant metrics under multiple treatments and continuous-dose treatment; 2) we incorporate a comprehensive overview of deep causal models from both temporal development and method classification perspectives; 3) we assist a detailed and comprehensive classification and analysis of relevant datasets and source code.

Extensibility · GM · MoDELS · 類別 · 多代理人模型 ·

2021 年 2 月 9 日

Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice

Lewis Hammond,James Fox,Tom Everitt,Alessandro Abate,Michael Wooldridge

from arxiv, Accepted to the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-21)

Multi-agent influence diagrams (MAIDs) are a popular form of graphical model that, for certain classes of games, have been shown to offer key complexity and explainability advantages over traditional extensive form game (EFG) representations. In this paper, we extend previous work on MAIDs by introducing the concept of a MAID subgame, as well as subgame perfect and trembling hand perfect equilibrium refinements. We then prove several equivalence results between MAIDs and EFGs. Finally, we describe an open source implementation for reasoning about MAIDs and computing their equilibria.

Machine Learning · 學成 · 可辨認的 · MoDELS · 可理解性 ·

2020 年 10 月 20 日

Counterfactual Explanations for Machine Learning: A Review

Sahil Verma,John Dickerson,Keegan Hines

from arxiv, 10 pages

Machine learning plays a role in many deployed decision systems, often in ways that are difficult or impossible to understand by human stakeholders. Explaining, in a human-understandable way, the relationship between the input and output of machine learning models is essential to the development of trustworthy machine-learning-based systems. A burgeoning body of research seeks to define the goals and methods of explainability in machine learning. In this paper, we seek to review and categorize research on counterfactual explanations, a specific class of explanation that provides a link between what could have happened had input to a model been changed in a particular way. Modern approaches to counterfactual explainability in machine learning draw connections to the established legal doctrine in many countries, making them appealing to fielded systems in high-impact areas such as finance and healthcare. Thus, we design a rubric with desirable properties of counterfactual explanation algorithms and comprehensively evaluate all currently-proposed algorithms against that rubric. Our rubric provides easy comparison and comprehension of the advantages and disadvantages of different approaches and serves as an introduction to major research themes in this field. We also identify gaps and discuss promising research directions in the space of counterfactual explainability.