We consider the problem of computing mixed Nash equilibria of two-player zero-sum games with continuous sets of pure strategies and with first-order access to the payoff function. This problem arises for example in game-theory-inspired machine learning applications, such as distributionally-robust learning. In those applications, the strategy sets are high-dimensional and thus methods based on discretization cannot tractably return high-accuracy solutions. In this paper, we introduce and analyze a particle-based method that enjoys guaranteed local convergence for this problem. This method consists of parametrizing the mixed strategies as atomic measures and applying proximal point updates to both the atoms' weights and positions. It can be interpreted as a time-implicit discretization of the "interacting" Wasserstein-Fisher-Rao gradient flow. We prove that, under non-degeneracy assumptions, this method converges at an exponential rate to the exact mixed Nash equilibrium from any initialization satisfying a natural notion of closeness to optimality. We illustrate our results with numerical experiments and discuss applications to max-margin and distributionally-robust classification using two-layer neural networks, where our method has a natural interpretation as a simultaneous training of the network's weights and of the adversarial distribution.
We explore the features of two methods of stabilization, the aggregation and supremizer methods, for reduced-order modeling of parametrized optimal control problems. In both methods, the reduced basis spaces are augmented to guarantee stability. For the aggregation method, the reduced basis approximation spaces for the state and adjoint variables are augmented in such a way that the spaces are identical. For the supremizer method, the reduced basis approximation space for the state-control product space is augmented with the solutions of a supremizer equation. We implement both of these methods for solving several parametrized control problems and assess their performance. Results indicate that the number of reduced basis vectors needed to approximate the solution space to some tolerance with the supremizer method is much larger, possibly double, than that for aggregation. There are also some cases where the supremizer method fails to produce a converged solution. We present results comparing the accuracy, efficiency, and computational costs associated with both methods of stabilization, which suggest that stabilization by aggregation is a superior stabilization method for control problems.
The main focus of this paper is the study of efficient multigrid methods for large linear systems with a particular saddle-point structure. Indeed, when the system matrix is symmetric, but indefinite, the variational convergence theory that is usually used to prove multigrid convergence cannot be directly applied. However, different algebraic approaches analyze properly preconditioned saddle-point problems, proving convergence of the Two-Grid method. In particular, this is efficient when the blocks of the coefficient matrix possess a Toeplitz or circulant structure. Indeed, it is possible to derive sufficient conditions for convergence and provide optimal parameters for the preconditioning of the saddle-point problem in terms of the associated generating symbols. In this paper, we propose a symbol-based convergence analysis for problems that have a hidden block Toeplitz structure. Then, they can be investigated focusing on the properties of the associated generating function f, which consequently is a matrix-valued function with dimension depending on the block size of the problem. As numerical tests we focus on the matrix sequence stemming from the finite element approximation of the Stokes problem. We show the efficiency of the methods studying the hidden 9-by-9 block multilevel structure of the obtained matrix sequence. Moreover, we propose an efficient algebraic multigrid method with convergence rate independent of the matrix size. Finally, we present several numerical tests comparing the results with state-of-the-art strategies.
In this paper, we investigate the power of {\it regularization}, a common technique in reinforcement learning and optimization, in solving extensive-form games (EFGs). We propose a series of new algorithms based on regularizing the payoff functions of the game, and establish a set of convergence results that strictly improve over the existing ones, with either weaker assumptions or stronger convergence guarantees. In particular, we first show that dilated optimistic mirror descent (DOMD), an efficient variant of OMD for solving EFGs, with adaptive regularization can achieve a fast $\tilde O(1/T)$ last-iterate convergence in terms of duality gap and distance to the set of Nash equilibria (NE) without assuming uniqueness of the NE. Second, we show that regularized counterfactual regret minimization (\texttt{Reg-CFR}), with a variant of the optimistic mirror descent algorithm as regret-minimizer, can achieve $O(1/T^{1/4})$ best-iterate, and $O(1/T^{3/4})$ average-iterate convergence rate for finding NE in EFGs. Finally, we show that \texttt{Reg-CFR} can achieve asymptotic last-iterate convergence, and optimal $O(1/T)$ average-iterate convergence rate, for finding the NE of perturbed EFGs, which is useful for finding approximate extensive-form perfect equilibria (EFPE). To the best of our knowledge, these constitute the first last-iterate convergence results for CFR-type algorithms, while matching the state-of-the-art average-iterate convergence rate in finding NE for non-perturbed EFGs. We also provide numerical results to corroborate the advantages of our algorithms.
We introduce a class of networked Markov potential games where agents are associated with nodes in a network. Each agent has its own local potential function, and the reward of each agent depends only on the states and actions of agents within a $\kappa$-hop neighborhood. In this context, we propose a localized actor-critic algorithm. The algorithm is scalable since each agent uses only local information and does not need access to the global state. Further, the algorithm overcomes the curse of dimensionality through the use of function approximation. Our main results provide finite-sample guarantees up to a localization error and a function approximation error. Specifically, we achieve an $\tilde{\mathcal{O}}(\epsilon^{-4})$ sample complexity measured by the averaged Nash regret. This is the first finite-sample bound for multi-agent competitive games that does not depend on the number of agents.
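As a small illustration of the locality notion used above, the $\kappa$-hop neighborhood of an agent can be computed by a breadth-first search truncated at depth $\kappa$. The sketch below is not taken from the paper: the function name `k_hop_neighborhood` and the adjacency-dictionary representation of the network are assumptions made for the example.

```python
from collections import deque


def k_hop_neighborhood(adjacency, source, kappa):
    """Return the set of nodes within `kappa` hops of `source` (source included).

    `adjacency` is a hypothetical dict mapping each node to an iterable of
    its neighbours; it stands in for the agents' network.
    """
    dist = {source: 0}          # hop distance of every node discovered so far
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == kappa:  # do not expand beyond kappa hops
            continue
        for neighbour in adjacency.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return set(dist)


# Path graph 0 - 1 - 2 - 3 - 4
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
local_agents = k_hop_neighborhood(path, 2, 1)
```

In a localized scheme of this kind, an agent would read only the states and actions of the agents returned by such a query, which is why its per-agent cost does not grow with the total number of agents.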
We address differential privacy for fully distributed aggregative games with shared coupling constraints. By co-designing the generalized Nash equilibrium (GNE) seeking mechanism and the differential-privacy noise injection mechanism, we propose the first GNE seeking algorithm that can ensure both provable convergence to the GNE and rigorous epsilon-differential privacy, even with the number of iterations tending to infinity. As a basis of the co-design, we also propose a new consensus-tracking algorithm that can achieve rigorous epsilon-differential privacy while maintaining accurate tracking performance, which, to our knowledge, has not been achieved before. To facilitate the convergence analysis, we also establish a general convergence result for stochastically-perturbed nonstationary fixed-point iteration processes, which lie at the core of numerous optimization and variational problems. Numerical simulation results confirm the effectiveness of the proposed approach.
In this paper, we exploit a result in point process theory that gives the expected value of the $K$-function weighted by the true first-order intensity function. This theoretical result can serve as an estimation method for obtaining the parameter estimates of a specific model assumed for the data. The motivation is to avoid dealing with the complex likelihoods of some point process models and their maximization. This advantage is more evident when considering local second-order characteristics, since the proposed method can estimate the vector of local parameters, one for each point of the analysed point pattern. We illustrate the method through simulation studies for both purely spatial and spatio-temporal point processes.
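For orientation, one standard intensity-weighted form is the inhomogeneous $K$-function estimator. The notation below is an assumption made for illustration ($W$ is the observation window, $\lambda$ the first-order intensity, $x_i$ the observed points, and no edge correction is applied); it is not necessarily the exact estimator used in the paper.

```latex
\hat{K}(r) \;=\; \frac{1}{|W|} \sum_{i} \sum_{j \neq i}
  \frac{\mathbf{1}\{\lVert x_i - x_j \rVert \le r\}}{\lambda(x_i)\,\lambda(x_j)},
\qquad
\mathbb{E}\big[\hat{K}(r)\big] \approx \pi r^{2}
\quad \text{for a planar Poisson process, up to edge effects.}
```

Read this way, a candidate intensity model that is close to the truth should bring the weighted estimate close to the reference value $\pi r^2$, so the discrepancy between the two can plausibly act as a fitting criterion; the abstract does not specify the discrepancy measure.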
This work considers Gaussian process interpolation with a periodized version of the Mat{\'e}rn covariance function introduced by Stein (22, Section 6.7). Convergence rates are studied for the joint maximum likelihood estimation of the regularity and the amplitude parameters when the data is sampled according to the model. The mean integrated squared error is also analyzed with fixed and estimated parameters, showing that maximum likelihood estimation yields asymptotically the same error as if the ground truth was known. Finally, the case where the observed function is a fixed deterministic element of a Sobolev space of continuous functions is also considered, suggesting that bounding assumptions on some parameters can lead to different estimates.
Emotion recognition in conversation (ERC) aims to detect the emotion label for each utterance. Motivated by recent studies which have proven that feeding training examples in a meaningful order rather than considering them randomly can boost the performance of models, we propose an ERC-oriented hybrid curriculum learning framework. Our framework consists of two curricula: (1) conversation-level curriculum (CC); and (2) utterance-level curriculum (UC). In CC, we construct a difficulty measurer based on "emotion shift" frequency within a conversation, then the conversations are scheduled in an "easy to hard" schema according to the difficulty score returned by the difficulty measurer. For UC, it is implemented from an emotion-similarity perspective, which progressively strengthens the model's ability in identifying the confusing emotions. With the proposed model-agnostic hybrid curriculum learning strategy, we observe significant performance boosts over a wide range of existing ERC models and we are able to achieve new state-of-the-art results on four public ERC datasets.
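The conversation-level curriculum can be sketched in a few lines. The difficulty score below, the fraction of adjacent utterance pairs whose emotion labels differ, is a simplifying assumption chosen for illustration; the paper's actual measurer may normalise differently or take speakers into account, and the function names are hypothetical.

```python
def emotion_shift_frequency(labels):
    """Fraction of consecutive utterance pairs whose emotion label changes.

    `labels` is the sequence of emotion labels of one conversation.
    This normalisation is an assumption, not the paper's definition.
    """
    if len(labels) < 2:
        return 0.0
    shifts = sum(1 for prev, curr in zip(labels, labels[1:]) if prev != curr)
    return shifts / (len(labels) - 1)


def easy_to_hard(conversations):
    """Order conversations from fewest to most frequent emotion shifts."""
    return sorted(conversations, key=emotion_shift_frequency)


conversations = [
    ["joy", "anger", "joy", "sadness"],   # shifts at every step
    ["neutral", "neutral", "neutral"],    # no shift
    ["joy", "joy", "sadness"],            # one shift in two steps
]
schedule = easy_to_hard(conversations)
```

A training loop would then present batches drawn from progressively longer prefixes of `schedule`, so that conversations with stable emotions are seen before those with frequent shifts.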
As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.
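To make the quantization problem concrete, the following is a minimal sketch of uniform (affine) quantization, one of the simplest schemes a survey of this kind would cover. It is an illustrative assumption, not a method attributed to the article: the function names are hypothetical, and the range is taken from the minimum and maximum of the input.

```python
def quantize_uniform(values, num_bits=4):
    """Map real values onto the integer codes {0, ..., 2**num_bits - 1}.

    Returns the codes together with the scale and offset needed to map
    them back to real values.
    """
    levels = 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0  # avoid division by zero
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, scale, lo


def dequantize_uniform(codes, scale, lo):
    """Reconstruct approximate real values from integer codes."""
    return [lo + c * scale for c in codes]


weights = [-1.0, -0.37, 0.0, 0.42, 1.0]
codes, scale, lo = quantize_uniform(weights, num_bits=4)
restored = dequantize_uniform(codes, scale, lo)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the accuracy cost paid for storing every value in `num_bits` bits.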
Bid optimization for online advertising from a single advertiser's perspective has been thoroughly investigated in both academic research and industrial practice. However, existing work typically assumes that competitors do not change their bids, i.e., the winning price is fixed, leading to poor performance of the derived solution. Although a few studies use multi-agent reinforcement learning to set up a cooperative game, they still suffer from the following drawbacks: (1) They fail to avoid collusion solutions where all the advertisers involved in an auction collude to bid an extremely low price on purpose. (2) They cannot handle the underlying complex bidding environment well, leading to poor model convergence. This problem could be amplified when handling multiple objectives of advertisers, which are practical demands but are not considered by previous work. In this paper, we propose a novel multi-objective cooperative bid optimization formulation called Multi-Agent Cooperative bidding Games (MACG). MACG sets up a carefully designed multi-objective optimization framework where different objectives of advertisers are incorporated. A global objective to maximize the overall profit of all advertisements is added in order to encourage better cooperation and also to protect self-bidding advertisers. To avoid collusion, we also introduce an extra platform revenue constraint. We analyze the optimal functional form of the bidding formula theoretically and design a policy network accordingly to generate auction-level bids. Then we design an efficient multi-agent evolutionary strategy for model optimization. Offline experiments and online A/B tests conducted on the Taobao platform indicate that both the single advertiser's objective and the global profit have been significantly improved compared to state-of-the-art methods.