夏娃韩剧电视剧在剧免费韩剧TV,欧美日韩一区二区视频在线观看,好吊视频一区二区三区国产

In recent years, empirical game-theoretic analysis (EGTA) has emerged as a powerful tool for analyzing games in which an exact specification of the utilities is unavailable. Instead, EGTA assumes access to an oracle, i.e., a simulator, which can generate unbiased noisy samples of players' unknown utilities, given a strategy profile. Utilities can thus be empirically estimated by repeatedly querying the simulator. Recently, various progressive sampling (PS) algorithms have been proposed, which aim to produce PAC-style learning guarantees (e.g., approximate Nash equilibria with high probability) using as few simulator queries as possible. One recent work introduces a pruning technique called regret-pruning which further minimizes the number of simulator queries placed in PS algorithms which aim to learn pure Nash equilibria. In this paper, we address a serious limitation of this original regret pruning approach -- it is only able to guarantee that true pure Nash equilibria of the empirical game are approximate equilibria of the true game, and is unable to provide any strong guarantees regarding the efficacy of approximate pure Nash equilibria. This is a significant limitation since in many games, pure Nash equilibria are computationally intractable to find, or even non-existent. We introduce three novel regret pruning variations. The first two variations generalize the original regret pruning approach to yield guarantees for approximate pure Nash equilibria of the empirical game. The third variation goes further to even yield strong guarantees for all approximate mixed Nash equilibria of the empirical game. We use these regret pruning variations to design two novel progressive sampling algorithms, PS-REG+ and PS-REG-M, which experimentally outperform the previous state-of-the-art algorithms for learning pure and mixed equilibria, respectively, of simulation-based games.

相關內容

剪枝

關注 2

Networking · Neural Networks · 可約的 · 神經元 · 近似誤差 ·

2023 年 2 月 1 日

SPIDE: A Purely Spike-based Method for Training Feedback Spiking Neural Networks

Mingqing Xiao,Qingyan Meng,Zongpeng Zhang,Yisen Wang,Zhouchen Lin

from arxiv, Accepted by Neural Networks

Spiking neural networks (SNNs) with event-based computation are promising brain-inspired models for energy-efficient applications on neuromorphic hardware. However, most supervised SNN training methods, such as conversion from artificial neural networks or direct training with surrogate gradients, require complex computation rather than spike-based operations of spiking neurons during training. In this paper, we study spike-based implicit differentiation on the equilibrium state (SPIDE) that extends the recently proposed training method, implicit differentiation on the equilibrium state (IDE), for supervised learning with purely spike-based computation, which demonstrates the potential for energy-efficient training of SNNs. Specifically, we introduce ternary spiking neuron couples and prove that implicit differentiation can be solved by spikes based on this design, so the whole training procedure, including both forward and backward passes, is made as event-driven spike computation, and weights are updated locally with two-stage average firing rates. Then we propose to modify the reset membrane potential to reduce the approximation error of spikes. With these key components, we can train SNNs with flexible structures in a small number of time steps and with firing sparsity during training, and the theoretical estimation of energy costs demonstrates the potential for high efficiency. Meanwhile, experiments show that even with these constraints, our trained models can still achieve competitive results on MNIST, CIFAR-10, CIFAR-100, and CIFAR10-DVS. Our code is available at //github.com/pkuxmq/SPIDE-FSNN.

賭博機/老虎機 · Learning · 情景 · Principle · Agent ·

2023 年 1 月 31 日

Learning Equilibria in Matching Markets from Bandit Feedback

Meena Jagadeesan,Alexander Wei,Yixin Wang,Michael I. Jordan,Jacob Steinhardt

from arxiv, Accepted to the Journal of the ACM; conference version appeared at NeurIPS 2021

Large-scale, two-sided matching platforms must find market outcomes that align with user preferences while simultaneously learning these preferences from data. Classical notions of stability (Gale and Shapley, 1962; Shapley and Shubik, 1971) are unfortunately of limited value in the learning setting, given that preferences are inherently uncertain and destabilizing while they are being learned. To bridge this gap, we develop a framework and algorithms for learning stable market outcomes under uncertainty. Our primary setting is matching with transferable utilities, where the platform both matches agents and sets monetary transfers between them. We design an incentive-aware learning objective that captures the distance of a market outcome from equilibrium. Using this objective, we analyze the complexity of learning as a function of preference structure, casting learning as a stochastic multi-armed bandit problem. Algorithmically, we show that "optimism in the face of uncertainty," the principle underlying many bandit algorithms, applies to a primal-dual formulation of matching with transfers and leads to near-optimal regret bounds. Our work takes a first step toward elucidating when and how stable matchings arise in large, data-driven marketplaces.

Boosting（一種模型訓練加速方式） · 方陣 · Learning · 類別 · Performer ·

2023 年 1 月 31 日

Multicalibration as Boosting for Regression

Ira Globus-Harris,Declan Harrison,Michael Kearns,Aaron Roth,Jessica Sorrell

from arxiv, Code available here: //github.com/Declancharrison/Level-Set-Boosting

We study the connection between multicalibration and boosting for squared error regression. First we prove a useful characterization of multicalibration in terms of a ``swap regret'' like condition on squared error. Using this characterization, we give an exceedingly simple algorithm that can be analyzed both as a boosting algorithm for regression and as a multicalibration algorithm for a class H that makes use only of a standard squared error regression oracle for H. We give a weak learning assumption on H that ensures convergence to Bayes optimality without the need to make any realizability assumptions -- giving us an agnostic boosting algorithm for regression. We then show that our weak learning assumption on H is both necessary and sufficient for multicalibration with respect to H to imply Bayes optimality. We also show that if H satisfies our weak learning condition relative to another class C then multicalibration with respect to H implies multicalibration with respect to C. Finally we investigate the empirical performance of our algorithm experimentally using an open source implementation that we make available. Our code repository can be found at //github.com/Declancharrison/Level-Set-Boosting.

Facebook AI Research · 近似 · SimPLe · Agent · 泛函 ·

2023 年 1 月 31 日

Round-Robin Beyond Additive Agents: Existence and Fairness of Approximate Equilibria

Georgios Amanatidis,Georgios Birmpas,Philip Lazos,Stefano Leonardi,Rebecca Reiffenh?user

Fair allocation of indivisible goods has attracted extensive attention over the last two decades, yielding numerous elegant algorithmic results and producing challenging open questions. The problem becomes much harder in the presence of strategic agents. Ideally, one would want to design truthful mechanisms that produce allocations with fairness guarantees. However, in the standard setting without monetary transfers, it is generally impossible to have truthful mechanisms that provide non-trivial fairness guarantees. Recently, Amanatidis et al. [2021] suggested the study of mechanisms that produce fair allocations in their equilibria. Specifically, when the agents have additive valuation functions, the simple Round-Robin algorithm always has pure Nash equilibria and the corresponding allocations are envy-free up to one good (EF1) with respect to the agents' true valuation functions. Following this agenda, we show that this outstanding property of the Round-Robin mechanism extends much beyond the above default assumption of additivity. In particular, we prove that for agents with cancelable valuation functions (a natural class that contains, e.g., additive and budget-additive functions), this simple mechanism always has equilibria and even its approximate equilibria correspond to approximately EF1 allocations with respect to the agents' true valuation functions. Further, we show that the approximate EF1 fairness of approximate equilibria surprisingly holds for the important class of submodular valuation functions as well, even though exact equilibria fail to exist!

約束 · CES · 相關系數 · 確切的 · 納什均衡 ·

2023 年 1 月 31 日

Constrained Phi-Equilibria

Martino Bernasconi,Matteo Castiglioni,Alberto Marchesi,Francesco Trovò,Nicola Gatti

The computational study of equilibria involving constraints on players' strategies has been largely neglected. However, in real-world applications, players are usually subject to constraints ruling out the feasibility of some of their strategies, such as, e.g., safety requirements and budget caps. Computational studies on constrained versions of the Nash equilibrium have lead to some results under very stringent assumptions, while finding constrained versions of the correlated equilibrium (CE) is still unexplored. In this paper, we introduce and computationally characterize constrained Phi-equilibria -- a more general notion than constrained CEs -- in normal-form games. We show that computing such equilibria is in general computationally intractable, and also that the set of the equilibria may not be convex, providing a sharp divide with unconstrained CEs. Nevertheless, we provide a polynomial-time algorithm for computing a constrained (approximate) Phi-equilibrium maximizing a given linear function, when either the number of constraints or that of players' actions is fixed. Moreover, in the special case in which a player's constraints do not depend on other players' strategies, we show that an exact, function-maximizing equilibrium can be computed in polynomial time, while one (approximate) equilibrium can be found with an efficient decentralized no-regret learning algorithm.

可交換的 · 泛函 · Integration · ASSETS · Agent ·

2023 年 1 月 31 日

Batch Exchanges with Constant Function Market Makers: Axioms, Equilibria, and Computation

Geoffrey Ramseyer,Mohak Goyal,Ashish Goel,David Mazières

from arxiv, 24 pages

Batch trading systems and constant function market makers (CFMMs) are two distinct market design innovations that have recently come to prominence as ways to address some of the shortcomings of decentralized trading systems. However, different deployments have chosen substantially different methods for integrating the two innovations. We show here from a minimal set of axioms describing the beneficial properties of each innovation that there is in fact only one, unique method for integrating CFMMs into batch trading schemes that preserves all the beneficial properties of both. Deployment of a batch trading schemes trading many assets simultaneously requires a reliable algorithm for approximating equilibria in Arrow-Debreu exchange markets. We study this problem when batches contain limit orders and CFMMs. Specifically, we find that CFMM design affects the asymptotic complexity of the problem, give an easily-checkable criterion to validate that a user-submitted CFMM is computationally tractable in a batch, and give a convex program that computes equilibria on batches of limit orders and CFMMs. Equivalently, this convex program computes equilibria of Arrow-Debreu exchange markets when every agent's demand response satisfies weak gross substitutability and every agent has utility for only two types of assets. This convex program has rational solutions when run on many (but not all) natural classes of widely-deployed CFMMs.

概念偏移 · 正則化項 · Weight · 樣本 · 穩健性 ·

2023 年 1 月 31 日

Adaptively Weighted Data Augmentation Consistency Regularization for Robust Optimization under Concept Shift

Yijun Dong,Yuege Xie,Rachel Ward

Concept shift is a prevailing problem in natural tasks like medical image segmentation where samples usually come from different subpopulations with variant correlations between features and labels. One common type of concept shift in medical image segmentation is the "information imbalance" between label-sparse samples with few (if any) segmentation labels and label-dense samples with plentiful labeled pixels. Existing distributionally robust algorithms have focused on adaptively truncating/down-weighting the "less informative" (i.e., label-sparse in our context) samples. To exploit data features of label-sparse samples more efficiently, we propose an adaptively weighted online optimization algorithm -- AdaWAC -- to incorporate data augmentation consistency regularization in sample reweighting. Our method introduces a set of trainable weights to balance the supervised loss and unsupervised consistency regularization of each sample separately. At the saddle point of the underlying objective, the weights assign label-dense samples to the supervised loss and label-sparse samples to the unsupervised consistency regularization. We provide a convergence guarantee by recasting the optimization as online mirror descent on a saddle point problem. Our empirical results demonstrate that AdaWAC not only enhances the segmentation performance and sample efficiency but also improves the robustness to concept shift on various medical image segmentation tasks with different UNet-style backbones.

圖像配準 · 學成 · 監督 · surge · 講稿 ·

2022 年 4 月 24 日

Deep Learning for Medical Image Registration: A Comprehensive Review

Subrato Bharati,M. Rubaiyat Hossain Mondal,Prajoy Podder,V. B. Surya Prasath

from arxiv, 18 pages, 7 figures

Image registration is a critical component in the applications of various medical image analyses. In recent years, there has been a tremendous surge in the development of deep learning (DL)-based medical image registration models. This paper provides a comprehensive review of medical image registration. Firstly, a discussion is provided for supervised registration categories, for example, fully supervised, dual supervised, and weakly supervised registration. Next, similarity-based as well as generative adversarial network (GAN)-based registration are presented as part of unsupervised registration. Deep iterative registration is then described with emphasis on deep similarity-based and reinforcement learning-based registration. Moreover, the application areas of medical image registration are reviewed. This review focuses on monomodal and multimodal registration and associated imaging, for instance, X-ray, CT scan, ultrasound, and MRI. The existing challenges are highlighted in this review, where it is shown that a major challenge is the absence of a training dataset with known transformations. Finally, a discussion is provided on the promising future research areas in the field of DL-based medical image registration.

博弈論 · Performance · MoDELS · 學成 · 平滑 ·

2020 年 12 月 15 日

Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory

Stefanos Leonardos,Georgios Piliouras

from arxiv, Appears in the 35th AAAI Conference on Artificial Intelligence

Exploration-exploitation is a powerful and practical tool in multi-agent learning (MAL), however, its effects are far from understood. To make progress in this direction, we study a smooth analogue of Q-learning. We start by showing that our learning model has strong theoretical justification as an optimal model for studying exploration-exploitation. Specifically, we prove that smooth Q-learning has bounded regret in arbitrary games for a cost model that explicitly captures the balance between game and exploration costs and that it always converges to the set of quantal-response equilibria (QRE), the standard solution concept for games under bounded rationality, in weighted potential games with heterogeneous learning agents. In our main task, we then turn to measure the effect of exploration in collective system performance. We characterize the geometry of the QRE surface in low-dimensional MAL systems and link our findings with catastrophe (bifurcation) theory. In particular, as the exploration hyperparameter evolves over-time, the system undergoes phase transitions where the number and stability of equilibria can change radically given an infinitesimal change to the exploration parameter. Based on this, we provide a formal theoretical treatment of how tuning the exploration parameter can provably lead to equilibrium selection with both positive as well as negative (and potentially unbounded) effects to system performance.

條件隨機場 · 隨機場 · INFORMS · 圖像分割 · 卷積神經網絡 ·

2017 年 12 月 27 日

Conditional Random Field and Deep Feature Learning for Hyperspectral Image Segmentation

Fahim Irfan Alam,Jun Zhou,Alan Wee-Chung Liew,Xiuping Jia,Jocelyn Chanussot,Yongsheng Gao

from arxiv, Submitted for Journal (Version 2)

Image segmentation is considered to be one of the critical tasks in hyperspectral remote sensing image processing. Recently, convolutional neural network (CNN) has established itself as a powerful model in segmentation and classification by demonstrating excellent performances. The use of a graphical model such as a conditional random field (CRF) contributes further in capturing contextual information and thus improving the segmentation performance. In this paper, we propose a method to segment hyperspectral images by considering both spectral and spatial information via a combined framework consisting of CNN and CRF. We use multiple spectral cubes to learn deep features using CNN, and then formulate deep CRF with CNN-based unary and pairwise potential functions to effectively extract the semantic correlations between patches consisting of three-dimensional data cubes. Effective piecewise training is applied in order to avoid the computationally expensive iterative CRF inference. Furthermore, we introduce a deep deconvolution network that improves the segmentation masks. We also introduce a new dataset and experimented our proposed method on it along with several widely adopted benchmark datasets to evaluate the effectiveness of our method. By comparing our results with those from several state-of-the-art models, we show the promising potential of our method.