亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We consider an atomic congestion game in which each player participates in the game with an exogenous and known probability $p_{i}\in[0,1]$, independently of everybody else, or stays out and incurs no cost. We first prove that the resulting game is potential. Then, we compute the parameterized price of anarchy to characterize the impact of demand uncertainty on the efficiency of selfish behavior. It turns out that the price of anarchy as a function of the maximum participation probability $p=\max_{i} p_{i}$ is a nondecreasing function. The worst case is attained when players have the same participation probabilities $p_{i}=p$. For the case of affine costs, we provide an analytic expression for the parameterized price of anarchy as a function of $p$. This function is continuous on $(0,1]$, is equal to $4/3$ for $0<p\leq 1/4$, and increases towards $5/2$ when $p\to 1$. Our work can be interpreted as providing a continuous transition between the price of anarchy of nonatomic and atomic games, which are the extremes of the price of anarchy function we characterize. We show that these bounds are tight and are attained on routing games -- as opposed to general congestion games -- with purely linear costs (i.e., with no constant terms).

相關內容

讓 iOS 8 和 OS X Yosemite 無縫切換的一個新特性。 > Apple products have always been designed to work together beautifully. But now they may really surprise you. With iOS 8 and OS X Yosemite, you’ll be able to do more wonderful things than ever before.

Source:

In this paper, we study learning in probabilistic domains where the learner may receive incorrect labels but can improve the reliability of labels by repeatedly sampling them. In such a setting, one faces the problem of whether the fixed budget for obtaining training examples should rather be used for obtaining all different examples or for improving the label quality of a smaller number of examples by re-sampling their labels. We motivate this problem in an application to compare the strength of poker hands where the training signal depends on the hidden community cards, and then study it in depth in an artificial setting where we insert controlled noise levels into the MNIST database. Our results show that with increasing levels of noise, resampling previous examples becomes increasingly more important than obtaining new examples, as classifier performance deteriorates when the number of incorrect labels is too high. In addition, we propose two different validation strategies; switching from lower to higher validations over the course of training and using chi-square statistics to approximate the confidence in obtained labels.

In this perspective paper we study the effect of non independent and identically distributed (non-IID) data on federated online learning to rank (FOLTR) and chart directions for future work in this new and largely unexplored research area of Information Retrieval. In the FOLTR process, clients join a federation to jointly create an effective ranker from the implicit click signal originating in each client, without the need to share data (documents, queries, clicks). A well-known factor that affects the performance of federated learning systems, and that poses serious challenges to these approaches, is the fact that there may be some type of bias in the way the data is distributed across clients. While FOLTR systems are on their own rights a type of federated learning system, the presence and effect of non-IID data in FOLTR has not been studied. To this aim, we first enumerate possible data distribution settings that may showcase data bias across clients and thus give rise to the non-IID problem. Then, we study the impact of each of these settings on the performance of the current state-of-the-art FOLTR approach, the Federated Pairwise Differentiable Gradient Descent (FPDGD), and we highlight which data distributions may pose a problem for FOLTR methods. We also explore how common approaches proposed in the federated learning literature address non-IID issues in FOLTR. This allows us to unveil new research gaps that, we argue, future research in FOLTR should consider. This is an important contribution to the current state of the field of FOLTR because, for FOLTR systems to be deployed, the factors affecting their performance, including the impact of non-IID data, need to thoroughly be understood.

We study reinforcement learning for two-player zero-sum Markov games with simultaneous moves in the finite-horizon setting, where the transition kernel of the underlying Markov games can be parameterized by a linear function over the current state, both players' actions and the next state. In particular, we assume that we can control both players and aim to find the Nash Equilibrium by minimizing the duality gap. We propose an algorithm Nash-UCRL based on the principle "Optimism-in-Face-of-Uncertainty". Our algorithm only needs to find a Coarse Correlated Equilibrium (CCE), which is computationally efficient. Specifically, we show that Nash-UCRL can provably achieve an $\tilde{O}(dH\sqrt{T})$ regret, where $d$ is the linear function dimension, $H$ is the length of the game and $T$ is the total number of steps in the game. To assess the optimality of our algorithm, we also prove an $\tilde{\Omega}( dH\sqrt{T})$ lower bound on the regret. Our upper bound matches the lower bound up to logarithmic factors, which suggests the optimality of our algorithm.

The monotone variational inequality is a central problem in mathematical programming that unifies and generalizes many important settings such as smooth convex optimization, two-player zero-sum games, convex-concave saddle point problems, etc. The extragradient method by Korpelevich [1976] is one of the most popular methods for solving monotone variational inequalities. Despite its long history and intensive attention from the optimization and machine learning community, the following major problem remains open. What is the last-iterate convergence rate of the extragradient method for monotone and Lipschitz variational inequalities with constraints? We resolve this open problem by showing a tight $O\left(\frac{1}{\sqrt{T}}\right)$ last-iterate convergence rate for arbitrary convex feasible sets, which matches the lower bound by Golowich et al. [2020]. Our rate is measured in terms of the standard gap function. The technical core of our result is the monotonicity of a new performance measure -- the tangent residual, which can be viewed as an adaptation of the norm of the operator that takes the local constraints into account. To establish the monotonicity, we develop a new approach that combines the power of the sum-of-squares programming with the low dimensionality of the update rule of the extragradient method. We believe our approach has many additional applications in the analysis of iterative methods.

Emerging distributed cloud architectures, e.g., fog and mobile edge computing, are playing an increasingly important role in the efficient delivery of real-time stream-processing applications such as augmented reality, multiplayer gaming, and industrial automation. While such applications require processed streams to be shared and simultaneously consumed by multiple users/devices, existing technologies lack efficient mechanisms to deal with their inherent multicast nature, leading to unnecessary traffic redundancy and network congestion. In this paper, we establish a unified framework for distributed cloud network control with generalized (mixed-cast) traffic flows that allows optimizing the distributed execution of the required packet processing, forwarding, and replication operations. We first characterize the enlarged multicast network stability region under the new control framework (with respect to its unicast counterpart). We then design a novel queuing system that allows scheduling data packets according to their current destination sets, and leverage Lyapunov drift-plus-penalty theory to develop the first fully decentralized, throughput- and cost-optimal algorithm for multicast cloud network flow control. Numerical experiments validate analytical results and demonstrate the performance gain of the proposed design over existing cloud network control techniques.

The problem of Byzantine consensus has been key to designing secure distributed systems. However, it is particularly difficult, mainly due to the presence of Byzantine processes that act arbitrarily and the unknown message delays in general networks. Although it is well known that both safety and liveness are at risk as soon as $n/3$ Byzantine processes fail, very few works attempted to characterize precisely the faults that produce safety violations from the faults that produce termination violations. In this paper, we present a new lower bound on the solvability of the consensus problem by distinguishing deceitful faults violating safety and benign faults violating termination from the more general Byzantine faults, in what we call the Byzantine-deceitful-benign fault model. We show that one cannot solve consensus if $n\leq 3t+d+2q$ with $t$ Byzantine processes, $d$ deceitful processes, and $q$ benign processes. In addition, we show that this bound is tight by presenting the Basilic class of consensus protocols that solve consensus when $n > 3t+d+2q$. These protocols differ in the number of processes from which they wait to receive messages before progressing. Each of these protocols is thus better suited for some applications depending on the predominance of benign or deceitful faults. Finally, we study the fault tolerance of the Basilic class of consensus protocols in the context of blockchains that need to solve the weaker problem of eventual consensus. We demonstrate that Basilic solves this problem with only $n > 2t+d+q$, hence demonstrating how it can strengthen blockchain security.

We study the problem of testing whether a function $f: \mathbb{R}^n \to \mathbb{R}$ is a polynomial of degree at most $d$ in the \emph{distribution-free} testing model. Here, the distance between functions is measured with respect to an unknown distribution $\mathcal{D}$ over $\mathbb{R}^n$ from which we can draw samples. In contrast to previous work, we do not assume that $\mathcal{D}$ has finite support. We design a tester that given query access to $f$, and sample access to $\mathcal{D}$, makes $(d/\varepsilon)^{O(1)}$ many queries to $f$, accepts with probability $1$ if $f$ is a polynomial of degree $d$, and rejects with probability at least $2/3$ if every degree-$d$ polynomial $P$ disagrees with $f$ on a set of mass at least $\varepsilon$ with respect to $\mathcal{D}$. Our result also holds under mild assumptions when we receive only a polynomial number of bits of precision for each query to $f$, or when $f$ can only be queried on rational points representable using a logarithmic number of bits. Along the way, we prove a new stability theorem for multivariate polynomials that may be of independent interest.

We consider networks of small, autonomous devices that communicate with each other wirelessly. Minimizing energy usage is an important consideration in designing algorithms for such networks, as battery life is a crucial and limited resource. Working in a model where both sending and listening for messages deplete energy, we consider the problem of finding a maximal matching of the nodes in a radio network of arbitrary and unknown topology. We present a distributed randomized algorithm that produces, with high probability, a maximal matching. The maximum energy cost per node is $O(\log^2 n)$, where $n$ is the size of the network. The total latency of our algorithm is $O(n \log n)$ time steps. We observe that there exist families of network topologies for which both of these bounds are simultaneously optimal up to polylog factors, so any significant improvement will require additional assumptions about the network topology. We also consider the related problem of assigning, for each node in the network, a neighbor to back up its data in case of node failure. Here, a key goal is to minimize the maximum load, defined as the number of nodes assigned to a single node. We present a decentralized low-energy algorithm that finds a neighbor assignment whose maximum load is at most a polylog($n$) factor bigger that the optimum.

The minimum energy path (MEP) describes the mechanism of reaction, and the energy barrier along the path can be used to calculate the reaction rate in thermal systems. The nudged elastic band (NEB) method is one of the most commonly used schemes to compute MEPs numerically. It approximates an MEP by a discrete set of configuration images, where the discretization size determines both computational cost and accuracy of the simulations. In this paper, we consider a discrete MEP to be a stationary state of the NEB method and prove an optimal convergence rate of the discrete MEP with respect to the number of images. Numerical simulations for the transitions of some several proto-typical model systems are performed to support the theory.

Exploration-exploitation is a powerful and practical tool in multi-agent learning (MAL), however, its effects are far from understood. To make progress in this direction, we study a smooth analogue of Q-learning. We start by showing that our learning model has strong theoretical justification as an optimal model for studying exploration-exploitation. Specifically, we prove that smooth Q-learning has bounded regret in arbitrary games for a cost model that explicitly captures the balance between game and exploration costs and that it always converges to the set of quantal-response equilibria (QRE), the standard solution concept for games under bounded rationality, in weighted potential games with heterogeneous learning agents. In our main task, we then turn to measure the effect of exploration in collective system performance. We characterize the geometry of the QRE surface in low-dimensional MAL systems and link our findings with catastrophe (bifurcation) theory. In particular, as the exploration hyperparameter evolves over-time, the system undergoes phase transitions where the number and stability of equilibria can change radically given an infinitesimal change to the exploration parameter. Based on this, we provide a formal theoretical treatment of how tuning the exploration parameter can provably lead to equilibrium selection with both positive as well as negative (and potentially unbounded) effects to system performance.

北京阿比特科技有限公司