
The works of Daskalakis et al. (2009, 2022), Jin et al. (2022), and Deng et al. (2023) indicate that computing Nash equilibria in multi-player Markov games is computationally hard. This raises the question of whether the intractability can be circumvented by focusing on specific classes of Markov games. One such example is two-player zero-sum Markov games, for which efficient ways to compute a Nash equilibrium are known. Inspired by zero-sum polymatrix normal-form games (Cai et al., 2016), we define a class of zero-sum multi-agent Markov games in which there are only pairwise interactions, described by a graph that changes per state. For this class of Markov games, we show that an $\epsilon$-approximate Nash equilibrium can be found efficiently. To do so, we generalize the techniques of Cai et al. (2016) by showing that the set of coarse-correlated equilibria collapses to the set of Nash equilibria. Afterwards, any algorithm from the literature that computes approximate coarse-correlated equilibria in Markovian policies can be used to obtain an approximate Nash equilibrium.
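
To make the per-state structure concrete, here is a minimal Python sketch (our illustration, not code from the paper) that checks the zero-sum polymatrix property at a single state: the state's interaction graph carries a pair of payoff matrices per edge, each player's payoff is the sum over its incident edges, and for every joint action the payoffs of all players must sum to zero.

```python
import itertools
import numpy as np

def is_zero_sum_polymatrix(n_players, n_actions, edge_payoffs, tol=1e-9):
    """edge_payoffs[(i, j)] = (A_ij, A_ji): payoff matrices of i and j on edge {i, j}."""
    for joint in itertools.product(range(n_actions), repeat=n_players):
        total = 0.0
        for (i, j), (A_ij, A_ji) in edge_payoffs.items():
            total += A_ij[joint[i], joint[j]] + A_ji[joint[j], joint[i]]
        if abs(total) > tol:          # the global zero-sum condition must hold for every joint action
            return False
    return True

# Toy state: a 3-player triangle in which every edge game is itself zero-sum
# (a special case that clearly satisfies the global property).
rng = np.random.default_rng(0)
edges = {}
for (i, j) in [(0, 1), (1, 2), (0, 2)]:
    A = rng.standard_normal((2, 2))
    edges[(i, j)] = (A, -A.T)         # player j's payoffs are the negation of i's
print(is_zero_sum_polymatrix(3, 2, edges))   # True
```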

Related content

We assume we are given structural equations over discrete variables inducing a directed acyclic graph, namely a structural causal model, together with data about its internal nodes. The question we want to answer is how to compute bounds for partially identifiable counterfactual queries from such an input. We start by giving a map from structural causal models to credal networks. This allows us to compute exact counterfactual bounds via algorithms for credal nets on a subclass of structural causal models. Exact computation is inefficient in general, however, given that, as we show, causal inference is NP-hard even on polytrees. We then target approximate bounds via a causal EM scheme. We evaluate their accuracy by providing credible intervals on the quality of the approximation; we show through a synthetic benchmark that the EM scheme delivers accurate results in a fair number of runs. In the course of the discussion, we also point out what seems to be a neglected limitation of the trending idea that counterfactual bounds can be computed without knowledge of the structural equations. We also present a real case study on palliative care to show how our algorithms can readily be used for practical purposes.
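
As a concrete (and deliberately tiny) illustration of the EM idea, not the paper's code, the sketch below fits the distribution of a latent exogenous variable by EM from many random starts on a toy SCM with binary treatment $X$ and binary outcome $Y = f(X, U)$, where $U$ ranges over the four response types; the min and max of a partially identifiable query (here the probability of necessity) across the restarts then serve as approximate bounds.

```python
import numpy as np

f = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])  # f[u, x]: never, complier, defier, always

def em_fit(p_xy, iters=200, rng=None):
    """Fit P(U) by EM to the observed joint p_xy[x, y] (X treated as exogenous)."""
    rng = rng or np.random.default_rng()
    p_u = rng.dirichlet(np.ones(4))
    for _ in range(iters):
        post = np.zeros(4)
        for x in range(2):
            for y in range(2):
                lik = p_u * (f[:, x] == y)              # P(u) * 1[f(x, u) = y]
                if lik.sum() > 0:
                    post += p_xy[x, y] * lik / lik.sum()  # E-step, weighted by the data
        p_u = post / post.sum()                          # M-step
    return p_u

def prob_of_necessity(p_u):
    """PN = P(Y_{X=0}=0 | X=1, Y=1) = P(complier) / (P(complier) + P(always))."""
    return p_u[1] / (p_u[1] + p_u[3])

rng = np.random.default_rng(0)
p_xy = np.array([[0.3, 0.2], [0.1, 0.4]])                # observed P(X, Y)
queries = [prob_of_necessity(em_fit(p_xy, rng=rng)) for _ in range(30)]
print("approximate bounds:", round(min(queries), 3), round(max(queries), 3))
```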

Motivated by bid recommendation in online ad auctions, this paper considers a general class of multi-level and multi-agent games, with two major characteristics: one is a large number of anonymous agents, and the other is the intricate interplay between competition and cooperation. To model such complex systems, we propose a novel and tractable bi-objective optimization formulation with mean-field approximation, called MESOB (Mean-field Equilibria & Social Optimality Balancing), as well as an associated occupation measure optimization (OMO) method called MESOB-OMO to solve it. MESOB-OMO enables obtaining approximately Pareto efficient solutions in terms of the dual objectives of competition and cooperation in MESOB, and in particular allows for Nash equilibrium selection and social equalization in an asymptotic manner. We apply MESOB-OMO to bid recommendation in a simulated pay-per-click ad auction. Experiments demonstrate its efficacy in balancing the interests of different parties and in handling the competitive nature of bidders, as well as its advantages over baselines that only consider either the competitive or the cooperative aspects.

Lawmakers around the country are crafting new laws to target "dark patterns" -- user interface designs that trick or coerce users into enabling cell phone location tracking, sharing browsing data, initiating automatic billing, or making whatever other choices their designers prefer. Dark patterns pose a serious problem. In their most aggressive forms, they interfere with human autonomy, undermine customers' evaluation and selection of products, and distort online markets for goods and services. Yet crafting legislation is a major challenge: Persuasion and deception are difficult to distinguish, and shifting tech trends present an ever-moving target. To address these challenges, this Article proposes leveraging state private law to define and track dark patterns as they evolve. Judge-crafted decisional law can respond quickly to new techniques, flexibly define the boundary between permissible and impermissible designs, and bolster state and federal regulatory enforcement efforts by quickly identifying those designs that most undermine user autonomy.

We describe a simple deterministic near-linear time approximation scheme for uncapacitated minimum cost flow in undirected graphs with real edge weights, a problem also known as transshipment. Specifically, our algorithm takes as input a (connected) undirected graph $G = (V, E)$, vertex demands $b \in \mathbb{R}^V$ such that $\sum_{v \in V} b(v) = 0$, positive edge costs $c \in \mathbb{R}_{>0}^E$, and a parameter $\varepsilon > 0$. In $O(\varepsilon^{-2} m \log^{O(1)} n)$ time, it returns a flow $f$ such that the net flow out of each vertex is equal to the vertex's demand and the cost of the flow is within a $(1 + \varepsilon)$ factor of optimal. Our algorithm is combinatorial and has no running time dependency on the demands or edge costs. With the exception of a recent result presented at STOC 2022 for polynomially bounded edge weights, all almost- and near-linear time approximation schemes for transshipment relied on randomization in two main ways: 1) to embed the problem instance into low-dimensional space and 2) to randomly pick target locations to send flow so nearby opposing demands can be satisfied. Our algorithm instead deterministically approximates the cost of routing decisions that would be made if the input were subject to a random tree embedding. To avoid computing the $\Omega(n^2)$ vertex-vertex distances that an approximation of this kind suggests, we also limit the available routing decisions using distances explicitly stored in the well-known Thorup-Zwick distance oracle.
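
For reference, the input/output contract stated above can be checked in a few lines of Python; this is only a feasibility-and-cost harness under our own representation of a candidate flow (a signed value per undirected edge), not the near-linear time algorithm itself.

```python
from collections import defaultdict

def is_feasible(demands, flow, tol=1e-9):
    """Check that the net flow out of every vertex equals its demand."""
    net = defaultdict(float)
    for (u, v), f in flow.items():                 # f > 0 means flow from u to v
        net[u] += f
        net[v] -= f
    return all(abs(net[v] - demands.get(v, 0.0)) <= tol for v in set(demands) | set(net))

def flow_cost(costs, flow):
    """Cost of an undirected flow: sum over edges of c(e) * |f(e)|."""
    return sum(costs[e] * abs(f) for e, f in flow.items())

# Toy instance: path a - b - c with unit costs; one unit is shipped from a to c.
costs = {("a", "b"): 1.0, ("b", "c"): 1.0}
demands = {"a": 1.0, "b": 0.0, "c": -1.0}          # demands sum to zero
flow = {("a", "b"): 1.0, ("b", "c"): 1.0}
print(is_feasible(demands, flow), flow_cost(costs, flow))   # True 2.0
```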

We study two natural problems about rational behaviors in multiplayer non-zero-sum sequential infinite-duration games played on graphs: checking problems, which consist in deciding whether a strategy profile, defined by a Mealy machine, is rational; and rational verification, which consists in deciding whether all rational answers to a given strategy satisfy some specification. We give the complexity of these problems for two major concepts of rationality, Nash equilibria and subgame-perfect equilibria, and for five major classes of payoff functions: parity, mean-payoff, quantitative reachability, energy, and discounted-sum.

We introduce a game model called "customer attraction game" to demonstrate the competition among online content providers. In this model, customers exhibit interest in various topics. Each content provider selects one topic and benefits from the attracted customers. We investigate both symmetric and asymmetric settings involving agents and customers. In the symmetric setting, the existence of pure Nash equilibrium (PNE) is guaranteed, but finding a PNE is PLS-complete. To address this, we propose a fully polynomial time approximation scheme to identify an approximate PNE. Moreover, the tight Price of Anarchy (PoA) is established. In the asymmetric setting, we show the nonexistence of PNE in certain instances and establish that determining its existence is NP-hard. Nevertheless, we prove the existence of an approximate PNE. Additionally, when agents select topics sequentially, we demonstrate that finding a subgame-perfect equilibrium is PSPACE-hard. Furthermore, we present the sequential PoA for the two-agent setting.
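
The following toy harness is our own construction, with a simplifying assumption the paper does not make: each customer cares about a single topic and is split evenly among the providers that pick it. It only illustrates what checking a pure Nash equilibrium by enumerating unilateral deviations looks like in this kind of game.

```python
def payoffs(profile, topic_weight):
    """profile[i] = topic chosen by provider i; topic_weight[t] = customer mass on topic t."""
    counts = {t: sum(1 for p in profile if p == t) for t in set(profile)}
    return [topic_weight[t] / counts[t] for t in profile]

def is_pure_nash(profile, topic_weight, topics):
    base = payoffs(profile, topic_weight)
    for i, _ in enumerate(profile):
        for t in topics:                       # try every unilateral deviation of provider i
            dev = list(profile)
            dev[i] = t
            if payoffs(dev, topic_weight)[i] > base[i] + 1e-12:
                return False
    return True

topics = ["news", "sports", "music"]
topic_weight = {"news": 6.0, "sports": 3.0, "music": 1.0}
print(is_pure_nash(("news", "news", "sports"), topic_weight, topics))   # True for this toy instance
```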

The automatic classification of 3D medical data is memory-intensive, and variation in the number of slices between samples is common. Naive solutions such as subsampling can address these problems, but at the cost of potentially eliminating relevant diagnostic information. Transformers have shown promising performance for sequential data analysis; however, their application to long sequences is demanding in terms of data, computation, and memory. In this paper, we propose an end-to-end Transformer-based framework that classifies volumetric data of variable length in an efficient fashion. In particular, by randomizing the input slice-wise resolution during training, we enhance the capacity of the learnable positional embedding assigned to each volume slice. Consequently, the accumulated positional information in each positional embedding can be generalized to the neighbouring slices, even for high-resolution volumes at test time. As a result, the model becomes more robust to variable volume length and amenable to different computational budgets. We evaluated the proposed approach on retinal OCT volume classification and achieved a 21.96% average improvement in balanced accuracy on a 9-class diagnostic task compared to state-of-the-art video transformers. Our findings show that varying the slice-wise resolution of the input during training yields a more informative volume representation than training with a fixed number of slices per volume. Our code is available at: //github.com/marziehoghbaie/VLFAT.
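
A minimal sketch of the training-time randomization described above, under our own assumptions (this is not the released VLFAT code): each volume is resampled along the slice axis to a randomly drawn number of slices, so the positional embeddings are exposed to many slice counts during training.

```python
import torch
import torch.nn.functional as F

def random_slice_resolution(volume, min_slices=8, max_slices=64):
    """volume: (S, C, H, W) tensor; return the same volume with a random number of slices."""
    target = int(torch.randint(min_slices, max_slices + 1, (1,)))
    # Interpolate along the slice axis only, treating it as the depth of a 3D volume;
    # plain subsampling of slices would serve the same purpose.
    v = volume.permute(1, 0, 2, 3).unsqueeze(0)               # (1, C, S, H, W)
    v = F.interpolate(v, size=(target, volume.shape[-2], volume.shape[-1]),
                      mode="trilinear", align_corners=False)
    return v.squeeze(0).permute(1, 0, 2, 3)                   # (target, C, H, W)

vol = torch.randn(40, 1, 224, 224)                            # a 40-slice OCT volume
print(random_slice_resolution(vol).shape)                     # e.g. torch.Size([23, 1, 224, 224])
```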

Game theory has by now found numerous applications in various fields, including economics, industry, jurisprudence, and artificial intelligence, where each player cares only about its own interest, in a noncooperative or cooperative manner, but without obvious malice toward other players. However, in many practical applications, such as poker, chess, pursuit-evasion, drug interdiction, coast guard, cyber-security, and national defense, players often have apparently adversarial stances; that is, the selfish actions of each player inevitably or intentionally inflict loss or wreak havoc on other players. Along this line, this paper provides a systematic survey of the three main game models widely employed in adversarial games, i.e., zero-sum normal-form and extensive-form games, Stackelberg (security) games, and zero-sum differential games, from an array of perspectives, including basic knowledge of the game models, (approximate) equilibrium concepts, problem classifications, research frontiers, (approximate) optimal strategy seeking techniques, prevailing algorithms, and practical applications. Finally, promising future research directions are discussed for relevant adversarial games.

We introduce DeepNash, an autonomous agent capable of learning to play the imperfect-information game Stratego from scratch, up to a human expert level. Stratego is one of the few iconic board games that Artificial Intelligence (AI) has not yet mastered. This popular game has an enormous game tree on the order of $10^{535}$ nodes, i.e., $10^{175}$ times larger than that of Go. It has the additional complexity of requiring decision-making under imperfect information, similar to Texas hold'em poker, which has a significantly smaller game tree (on the order of $10^{164}$ nodes). Decisions in Stratego are made over a large number of discrete actions with no obvious link between action and outcome. Episodes are long, often with hundreds of moves before a player wins, and situations in Stratego cannot easily be broken down into manageably sized sub-problems as in poker. For these reasons, Stratego has been a grand challenge for the field of AI for decades, and existing AI methods barely reach an amateur level of play. DeepNash uses a game-theoretic, model-free deep reinforcement learning method, without search, that learns to master Stratego via self-play. The Regularised Nash Dynamics (R-NaD) algorithm, a key component of DeepNash, converges to an approximate Nash equilibrium, instead of 'cycling' around it, by directly modifying the underlying multi-agent learning dynamics. DeepNash beat existing state-of-the-art AI methods in Stratego and achieved a yearly (2022) and all-time top-3 rank on the Gravon games platform, competing with human expert players.
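
To give a feel for the reward-regularization idea behind R-NaD (this is only an illustrative sketch on rock-paper-scissors, not DeepNash's implementation), the snippet below penalizes each player's payoff by the log-ratio of its policy to a reference policy and periodically resets that reference, which damps the cycling of naive self-play dynamics.

```python
import numpy as np

A = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]], dtype=float)  # row player's payoff matrix
eta, lr, n_outer, n_inner = 0.2, 0.1, 20, 500

x = np.array([0.6, 0.3, 0.1])      # row player's mixed strategy
y = np.array([0.2, 0.5, 0.3])      # column player's mixed strategy
x_ref, y_ref = x.copy(), y.copy()  # regularization targets

for _ in range(n_outer):
    for _ in range(n_inner):
        # Regularized payoff gradients: original payoff minus eta * log(pi / pi_ref).
        gx = A @ y - eta * np.log(x / x_ref)
        gy = -A.T @ x - eta * np.log(y / y_ref)
        # Multiplicative-weights (replicator-style) simultaneous update.
        x = x * np.exp(lr * gx); x /= x.sum()
        y = y * np.exp(lr * gy); y /= y.sum()
    x_ref, y_ref = x.copy(), y.copy()   # reset the regularization target and repeat

print("approximate equilibrium:", np.round(x, 3), np.round(y, 3))  # close to uniform play
```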

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
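
The re-weighting scheme follows directly from the formula above: with $n_c$ samples in class $c$, the effective number is $E_{n_c} = (1-\beta^{n_c})/(1-\beta)$ and the class weight is proportional to $1/E_{n_c}$. The sketch below implements exactly that; normalizing the weights to sum to the number of classes is a common convention we assume, not something stated in the abstract.

```python
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class weights 1 / E_n with E_n = (1 - beta**n) / (1 - beta)."""
    n = np.asarray(samples_per_class, dtype=float)
    effective_num = (1.0 - np.power(beta, n)) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights * len(n) / weights.sum()      # normalize so weights sum to the number of classes

print(class_balanced_weights([5000, 500, 50]))   # rarer classes receive larger weights
```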
