国产乱理伦片A级在线看-99久久久无码国产精品69

Reaching consensus is key to multi-agent coordination. To accomplish a cooperative task, agents need to coherently select optimal joint actions to maximize the team reward. However, current cooperative multi-agent reinforcement learning (MARL) methods usually do not explicitly take consensus into consideration, which may cause miscoordination problem. In this paper, we propose a model-based consensus mechanism to explicitly coordinate multiple agents. The proposed Multi-agent Goal Imagination (MAGI) framework guides agents to reach consensus with an Imagined common goal. The common goal is an achievable state with high value, which is obtained by sampling from the distribution of future states. We directly model this distribution with a self-supervised generative model, thus alleviating the "curse of dimensinality" problem induced by multi-agent multi-step policy rollout commonly used in model-based methods. We show that such efficient consensus mechanism can guide all agents cooperatively reaching valuable future states. Results on Multi-agent Particle-Environments and Google Research Football environment demonstrate the superiority of MAGI in both sample efficiency and performance.

相關內容

Agent

關注 15

約束 · 單純形 · Learning · ASSETS · 優化器 ·

2024 年 4 月 16 日

Simplex Decomposition for Portfolio Allocation Constraints in Reinforcement Learning

David Winkel,Niklas Strau?,Matthias Schubert,Thomas Seidl

Portfolio optimization tasks describe sequential decision problems in which the investor's wealth is distributed across a set of assets. Allocation constraints are used to enforce minimal or maximal investments into particular subsets of assets to control for objectives such as limiting the portfolio's exposure to a certain sector due to environmental concerns. Although methods for constrained Reinforcement Learning (CRL) can optimize policies while considering allocation constraints, it can be observed that these general methods yield suboptimal results. In this paper, we propose a novel approach to handle allocation constraints based on a decomposition of the constraint action space into a set of unconstrained allocation problems. In particular, we examine this approach for the case of two constraints. For example, an investor may wish to invest at least a certain percentage of the portfolio into green technologies while limiting the investment in the fossil energy sector. We show that the action space of the task is equivalent to the decomposed action space, and introduce a new reinforcement learning (RL) approach CAOSD, which is built on top of the decomposition. The experimental evaluation on real-world Nasdaq-100 data demonstrates that our approach consistently outperforms state-of-the-art CRL benchmarks for portfolio optimization.

Microsoft Surface · 優化器 · 機器人 · binary · Performer ·

2024 年 4 月 12 日

Collective Bayesian Decision-Making in a Swarm of Miniaturized Robots for Surface Inspection

Thiemen Siemensma,Darren Chiu,Sneha Ramshanker,Radhika Nagpal,Bahar Haghighat

Robot swarms can effectively serve a variety of sensing and inspection applications. Certain inspection tasks require a binary classification decision. This work presents an experimental setup for a surface inspection task based on vibration sensing and studies a Bayesian two-outcome decision-making algorithm in a swarm of miniaturized wheeled robots. The robots are tasked with individually inspecting and collectively classifying a 1mx1m tiled surface consisting of vibrating and non-vibrating tiles based on the majority type of tiles. The robots sense vibrations using onboard IMUs and perform collision avoidance using a set of IR sensors. We develop a simulation and optimization framework leveraging the Webots robotic simulator and a Particle Swarm Optimization (PSO) method. We consider two existing information sharing strategies and propose a new one that allows the swarm to rapidly reach accurate classification decisions. We first find optimal parameters that allow efficient sampling in simulation and then evaluate our proposed strategy against the two existing ones using 100 randomized simulation and 10 real experiments. We find that our proposed method compels the swarm to make decisions at an accelerated rate, with an improvement of up to 20.52% in mean decision time at only 0.78% loss in accuracy.

Agent · 路徑 · state-of-the-art · 極小點 · 優化器 ·

2024 年 4 月 11 日

Introducing Delays in Multi-Agent Path Finding

Justin Kottinger,Tzvika Geft,Shaull Almagor,Oren Salzman,Morteza Lahijanian

from arxiv, 9 pages, 6 figures, and 3 tables

We consider a Multi-Agent Path Finding (MAPF) setting where agents have been assigned a plan, but during its execution some agents are delayed. Instead of replanning from scratch when such a delay occurs, we propose delay introduction, whereby we delay some additional agents so that the remainder of the plan can be executed safely. We show that finding the minimum number of additional delays is APX-Hard, i.e., it is NP-Hard to find a $(1+\varepsilon)$-approximation for some $\varepsilon>0$. However, in practice we can find optimal delay-introductions using Conflict-Based Search for very large numbers of agents, and both planning time and the resulting length of the plan are comparable, and sometimes outperform the state-of-the-art heuristics for replanning.

Networking · Agent · 同質 · Networks · 類別 ·

2024 年 4 月 10 日

Altruism Improves Congestion in Series-Parallel Nonatomic Congestion Games

Colton Hill,Philip N. Brown

from arxiv, 7 pages, 2 figures

Self-interested routing polices from individual users in a system can collectively lead to poor aggregate congestion in routing networks. The introduction of altruistic agents, whose goal is to benefit other agents in the system, can seemingly improve aggregate congestion. However, it is known in that in some network routing problems, altruistic agents can actually worsen congestion compared to that which would arise in the presence of a homogeneously selfish population. This paper provides a thorough investigation into the necessary conditions for altruists to be guaranteed to improve total congestion. In particular, we study the class of series-parallel non-atomic congestion games, where one sub-population is altruistic and the other is selfish. We find that a game is guaranteed to have improved congestion in the presence of altruistic agents (even if only a small part of the total population) compared to the homogeneously selfish version of the game, provided the network is symmetric, where all agents are given access to all paths in the network, and the series-parallel network for the game does not have sub-networks which emulate Braess's paradox -- a phenomenon we refer to as a Braess-resistant network. Our results appear to be the most complete characterization of when behavior that is designed to improve total congestion (which we refer to as altruism) is actually guaranteed to do so.

語言模型化 · Agent · 大語言模型 · 可理解性 · MoDELS ·

2024 年 4 月 6 日

Challenges Faced by Large Language Models in Solving Multi-Agent Flocking

Peihan Li,Vishnu Menon,Bhavanaraj Gudiguntla,Daniel Ting,Lifeng Zhou

Flocking is a behavior where multiple agents in a system attempt to stay close to each other while avoiding collision and maintaining a desired formation. This is observed in the natural world and has applications in robotics, including natural disaster search and rescue, wild animal tracking, and perimeter surveillance and patrol. Recently, large language models (LLMs) have displayed an impressive ability to solve various collaboration tasks as individual decision-makers. Solving multi-agent flocking with LLMs would demonstrate their usefulness in situations requiring spatial and decentralized decision-making. Yet, when LLM-powered agents are tasked with implementing multi-agent flocking, they fall short of the desired behavior. After extensive testing, we find that agents with LLMs as individual decision-makers typically opt to converge on the average of their initial positions or diverge from each other. After breaking the problem down, we discover that LLMs cannot understand maintaining a shape or keeping a distance in a meaningful way. Solving multi-agent flocking with LLMs would enhance their ability to understand collaborative spatial reasoning and lay a foundation for addressing more complex multi-agent tasks. This paper discusses the challenges LLMs face in multi-agent flocking and suggests areas for future improvement and research.

優化器 · Learning · 可辨認的 · Agent · 相互獨立的 ·

2024 年 4 月 5 日

ROMA-iQSS: An Objective Alignment Approach via State-Based Value Learning and ROund-Robin Multi-Agent Scheduling

Chi-Hui Lin,Joewie J. Koh,Alessandro Roncone,Lijun Chen

from arxiv, 10 pages, 3 figures, extended version of our 2024 American Control Conference publication

Effective multi-agent collaboration is imperative for solving complex, distributed problems. In this context, two key challenges must be addressed: first, autonomously identifying optimal objectives for collective outcomes; second, aligning these objectives among agents. Traditional frameworks, often reliant on centralized learning, struggle with scalability and efficiency in large multi-agent systems. To overcome these issues, we introduce a decentralized state-based value learning algorithm that enables agents to independently discover optimal states. Furthermore, we introduce a novel mechanism for multi-agent interaction, wherein less proficient agents follow and adopt policies from more experienced ones, thereby indirectly guiding their learning process. Our theoretical analysis shows that our approach leads decentralized agents to an optimal collective policy. Empirical experiments further demonstrate that our method outperforms existing decentralized state-based and action-based value learning strategies by effectively identifying and aligning optimal objectives.

Learning · 路徑 · 優化器 · GROUP · Extensibility ·

2024 年 4 月 4 日

No Panacea in Planning: Algorithm Selection for Suboptimal Multi-Agent Path Finding

Weizhe Chen,Zhihan Wang,Jiaoyang Li,Sven Koenig,Bistra Dilkina

Since more and more algorithms are proposed for multi-agent path finding (MAPF) and each of them has its strengths, choosing the correct one for a specific scenario that fulfills some specified requirements is an important task. Previous research in algorithm selection for MAPF built a standard workflow and showed that machine learning can help. In this paper, we study general solvers for MAPF, which further include suboptimal algorithms. We propose different groups of optimization objectives and learning tasks to handle the new tradeoff between runtime and solution quality. We conduct extensive experiments to show that the same loss can not be used for different groups of optimization objectives, and that standard computer vision models are no worse than customized architecture. We also provide insightful discussions on how feature-sensitive pre-processing is needed for learning for MAPF, and how different learning metrics are correlated to different learning tasks.

動力系統 · 極小點 · 離散化 · Networking · 可辨認的 ·

2024 年 3 月 29 日

Finding Nontrivial Minimum Fixed Points in Discrete Dynamical Systems

Zirou Qiu,Chen Chen,Madhav V. Marathe,S. S. Ravi,Daniel J. Rosenkrantz,Richard E. Stearns,Anil Vullikanti

from arxiv, Accepted at AAAI-22

Networked discrete dynamical systems are often used to model the spread of contagions and decision-making by agents in coordination games. Fixed points of such dynamical systems represent configurations to which the system converges. In the dissemination of undesirable contagions (such as rumors and misinformation), convergence to fixed points with a small number of affected nodes is a desirable goal. Motivated by such considerations, we formulate a novel optimization problem of finding a nontrivial fixed point of the system with the minimum number of affected nodes. We establish that, unless P = NP, there is no polynomial time algorithm for approximating a solution to this problem to within the factor n^1-\epsilon for any constant epsilon > 0. To cope with this computational intractability, we identify several special cases for which the problem can be solved efficiently. Further, we introduce an integer linear program to address the problem for networks of reasonable sizes. For solving the problem on larger networks, we propose a general heuristic framework along with greedy selection methods. Extensive experimental results on real-world networks demonstrate the effectiveness of the proposed heuristics.

Agent · 語言模型化 · MoDELS · 知識 (knowledge) · Continuity ·

2023 年 8 月 22 日

A Survey on Large Language Model based Autonomous Agents

Lei Wang,Chen Ma,Xueyang Feng,Zeyu Zhang,Hao Yang,Jingsen Zhang,Zhiyuan Chen,Jiakai Tang,Xu Chen,Yankai Lin,Wayne Xin Zhao,Zhewei Wei,Ji-Rong Wen

from arxiv, 32 pages, 3 figures

Autonomous agents have long been a prominent research topic in the academic community. Previous research in this field often focuses on training agents with limited knowledge within isolated environments, which diverges significantly from the human learning processes, and thus makes the agents hard to achieve human-like decisions. Recently, through the acquisition of vast amounts of web knowledge, large language models (LLMs) have demonstrated remarkable potential in achieving human-level intelligence. This has sparked an upsurge in studies investigating autonomous agents based on LLMs. To harness the full potential of LLMs, researchers have devised diverse agent architectures tailored to different applications. In this paper, we present a comprehensive survey of these studies, delivering a systematic review of the field of autonomous agents from a holistic perspective. More specifically, our focus lies in the construction of LLM-based agents, for which we propose a unified framework that encompasses a majority of the previous work. Additionally, we provide a summary of the various applications of LLM-based AI agents in the domains of social science, natural science, and engineering. Lastly, we discuss the commonly employed evaluation strategies for LLM-based AI agents. Based on the previous studies, we also present several challenges and future directions in this field. To keep track of this field and continuously update our survey, we maintain a repository for the related references at //github.com/Paitesanshi/LLM-Agent-Survey.

知識 (knowledge) · Machine Learning · MoDELS · 學成 · Conformer ·

2022 年 5 月 10 日

Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

Julian W?rmann,Daniel Bogdoll,Etienne Bührle,Han Chen,Evaristus Fuh Chuo,Kostadin Cvejoski,Ludger van Elst,Tobias Glei?ner,Philip Gottschall,Stefan Griesche,Christian Hellert,Christian Hesels,Sebastian Houben,Tim Joseph,Niklas Keil,Johann Kelsch,Hendrik K?nigshof,Erwin Kraft,Leonie Kreuser,Kevin Krone,Tobias Latka,Denny Mattern,Stefan Matthes,Mohsin Munir,Moritz Nekolla,Adrian Paschke,Maximilian Alexander Pintz,Tianming Qiu,Faraz Qureishi,Syed Tahseen Raza Rizvi,J?rg Reichardt,Laura von Rueden,Stefan Rudolph,Alexander Sagel,Gerhard Schunk,Hao Shen,Hendrik Stapelbroek,Vera Stehr,Gurucharan Srinivas,Anh Tuan Tran,Abhishek Vivekanandan,Ya Wang,Florian Wasserrab,Tino Werner,Christian Wirth,Stefan Zwicklbauer

from arxiv, 93 pages

The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches, and eventually to increase the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.