In this discussion draft, we explore several duopoly games between players with quadratic costs, where the market is assumed to have isoelastic demand. Unlike the usual approaches based on numerical computation, the methods used in the present work are built on symbolic computation, which can produce analytical and rigorous results. Our investigation shows that the stability regions of the games considered in this work are enlarged compared to their counterparts with linear costs.
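The abstract does not reproduce the specific adjustment maps or stability analysis, so the following is only a minimal sympy sketch of the stated setting: isoelastic inverse demand $p = 1/(q_1+q_2)$ and quadratic costs $c_i q_i^2$, solved symbolically for the symmetric Nash output. All symbol names are illustrative.

```python
import sympy as sp

q1, q2, q, c = sp.symbols('q1 q2 q c', positive=True)

# Isoelastic inverse demand p = 1/(q1 + q2); quadratic cost c*q1**2.
profit1 = q1 / (q1 + q2) - c * q1**2

# First-order condition of player 1.
foc1 = sp.diff(profit1, q1)

# Symmetric equilibrium: substitute q1 = q2 = q and solve symbolically.
q_star = sp.solve(foc1.subs({q1: q, q2: q}), q)
print(q_star)  # [sqrt(2)/(4*sqrt(c))], i.e. q* = 1/(2*sqrt(2*c))
```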
Motivated by the problem of exploring discrete but very complex state spaces in Bayesian models, we propose a novel Markov chain Monte Carlo search algorithm: the taxicab sampler. We describe the construction of this sampler and discuss how its interpretation and usage differ from those of standard Metropolis-Hastings and of the related Hamming ball sampler. The proposed sampling algorithm is then shown to substantially improve computation time, without any loss of efficiency, relative to a na\"ive Metropolis-Hastings search in a motivating Bayesian regression tree count model. In this model, we leverage the discrete state space assumption to construct a novel likelihood function that flexibly describes different mean-variance relationships while preserving parameter interpretability compared to existing likelihood functions for count data.
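The construction of the taxicab sampler itself is not given in the abstract; for orientation, here is a minimal sketch of the na\"ive Metropolis-Hastings search over a discrete state space that the sampler is benchmarked against. The function names and the proposal interface are assumptions.

```python
import numpy as np

def naive_mh(log_post, propose, x0, n_iter=10_000, seed=0):
    """Generic Metropolis-Hastings over a discrete state space.

    log_post(x)     -> log posterior of state x
    propose(x, rng) -> (x_new, log_q_ratio), where log_q_ratio is
                       log q(x | x_new) - log q(x_new | x)
    """
    rng = np.random.default_rng(seed)
    x, lp = x0, log_post(x0)
    chain = [x]
    for _ in range(n_iter):
        x_new, log_q_ratio = propose(x, rng)
        lp_new = log_post(x_new)
        # Accept with probability min(1, posterior ratio * proposal ratio).
        if np.log(rng.uniform()) < lp_new - lp + log_q_ratio:
            x, lp = x_new, lp_new
        chain.append(x)
    return chain
```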
We study algorithmic applications of a natural discretization for the hard-sphere model and the Widom-Rowlinson model in a region $\mathbb{V}\subset\mathbb{R}^d$. These models are used in statistical physics to describe mixtures of one or multiple particle types subjected to hard-core interactions. For each type, particles follow a Poisson point process with a type-specific activity parameter (fugacity). The Gibbs distribution is characterized by the mixture of these point processes conditioned on no two particles being closer than a type-dependent distance threshold. A key part in better understanding the Gibbs distribution is its normalizing constant, called the partition function. We give sufficient conditions under which the partition function of a discrete hard-core model on a geometric graph based on a point set $X \subset \mathbb{V}$ closely approximates that of such continuous models. Previously, this was only shown for the hard-sphere model on cubic regions $\mathbb{V}=[0, \ell)^d$ when the size of $X$ is exponential in the volume of the region $\nu(\mathbb{V})$, limiting algorithmic applications. In the same setting, our refined analysis only requires a quadratic number of points, which we argue to be tight. We use our improved discretization results to approximate the partition functions of the hard-sphere model and the Widom-Rowlinson model efficiently in $\nu(\mathbb{V})$. For the hard-sphere model, we obtain the first quasi-polynomial deterministic approximation algorithm for the entire fugacity regime for which, so far, only randomized approximations were known. Furthermore, we simplify a recently introduced fully polynomial randomized approximation algorithm. Similarly, we obtain the best known deterministic and randomized approximation bounds for the Widom-Rowlinson model. Moreover, we obtain approximate sampling algorithms for the respective spin systems within the same fugacity regimes.
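As a point of reference for the discretization, the following sketch computes the partition function of a discrete hard-core model on a geometric graph over a small point set by brute-force enumeration of independent sets. It illustrates the object being approximated, not the paper's efficient algorithms, and the interface is hypothetical.

```python
import itertools
import numpy as np

def hard_core_partition(points, radius, fugacity):
    """Brute-force partition function of the discrete hard-core model on
    the geometric graph over `points`: sum over independent sets, each
    weighted by fugacity**|set|. Exponential time; illustration only."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Edge between any two points closer than the distance threshold.
    adj = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1) < radius
    np.fill_diagonal(adj, False)
    Z = 0.0
    for subset in itertools.product([0, 1], repeat=n):
        idx = [i for i, b in enumerate(subset) if b]
        if all(not adj[i, j] for i, j in itertools.combinations(idx, 2)):
            Z += fugacity ** len(idx)
    return Z
```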
In this work, we study stochastic non-cooperative games, where only noisy black-box function evaluations are available to estimate the cost function of each player. Since each player's cost function depends on both its own decision variables and its rivals' decision variables, most existing work on seeking the Nash equilibrium requires local information to be exchanged through a center or a network. We propose a new stochastic distributed learning algorithm that does not require communication among players. The proposed algorithm uses a simultaneous perturbation method to estimate the gradient of each cost function, and a mirror descent method to search for the Nash equilibrium. We provide asymptotic analysis of the bias and variance of the gradient estimates, and show that the proposed algorithm converges to the Nash equilibrium in mean square for the class of strictly monotone games at a rate faster than existing algorithms. The effectiveness of the proposed method is demonstrated in a numerical experiment.
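A minimal sketch of the two ingredients named in the abstract: a simultaneous-perturbation gradient estimate built from two noisy cost evaluations, and a mirror descent step. The entropic mirror map over the probability simplex is an assumption for illustration; the paper's actual decision sets, step sizes, and smoothing schedule are not specified in the abstract.

```python
import numpy as np

def spsa_gradient(cost, x, delta, rng):
    """Two-evaluation simultaneous-perturbation estimate of grad cost(x)."""
    d = rng.choice([-1.0, 1.0], size=x.shape)           # Rademacher directions
    return (cost(x + delta * d) - cost(x - delta * d)) / (2.0 * delta) * d

def mirror_descent_step(x, grad, eta):
    """Entropic mirror descent step on the probability simplex."""
    y = x * np.exp(-eta * grad)
    return y / y.sum()

# Hypothetical usage for one player with a noisy cost oracle `noisy_cost`:
# rng = np.random.default_rng(0)
# g = spsa_gradient(noisy_cost, x, delta=0.05, rng=rng)
# x = mirror_descent_step(x, g, eta=0.1)
```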
Modeling longitudinal biomedical data often requires dimensionality reduction as well as dynamic modeling in the resulting latent representation. This can be achieved by artificial neural networks for dimension reduction and by differential equations for dynamic modeling of individual-level trajectories. However, such approaches so far assume that the parameters of the individual-level dynamics are constant throughout the observation period. Motivated by an application from psychological resilience research, we propose an extension where different sets of differential equation parameters are allowed for observation sub-periods. Estimation is nevertheless coupled across intra-individual sub-periods, so that the model can also be fitted with a relatively small dataset. We subsequently derive prediction targets from the individual dynamic models of resilience in the application. These serve as interpretable resilience-related outcomes, to be predicted from characteristics of individuals measured at baseline and at a follow-up time point, while selecting a small set of important predictors. Our approach successfully identifies individual-level parameters of the dynamic models in a way that allows us to stably select predictors, i.e., resilience factors. Furthermore, we can identify those characteristics of individuals that are the most promising for updates at follow-up, which might inform future study design. This underlines the usefulness of our proposed deep dynamic modeling approach with changes in parameters between observation sub-periods.
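The paper's latent-ODE architecture is not detailed in the abstract; the sketch below only illustrates the core extension, integrating individual-level dynamics whose parameter set switches at sub-period boundaries. The names and the Euler discretization are illustrative assumptions.

```python
import numpy as np

def piecewise_trajectory(z0, f, param_sets, breakpoints, t_grid):
    """Euler-integrate dz/dt = f(z, theta), where theta switches at the
    sub-period boundaries in `breakpoints` (len(param_sets) must equal
    len(breakpoints) + 1)."""
    z = np.asarray(z0, dtype=float)
    traj = [z]
    for t0, t1 in zip(t_grid[:-1], t_grid[1:]):
        k = np.searchsorted(breakpoints, t0, side='right')  # active sub-period
        z = z + (t1 - t0) * f(z, param_sets[k])
        traj.append(z)
    return np.array(traj)
```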
This paper proposes a regularization of the Monge-Amp\`ere equation in planar convex domains through uniformly elliptic Hamilton-Jacobi-Bellman equations. The regularized problem possesses a unique strong solution $u_\varepsilon$ and is amenable to discretization with finite elements. This work establishes locally uniform convergence of $u_\varepsilon$ to the convex Alexandrov solution $u$ of the Monge-Amp\`ere equation as the regularization parameter $\varepsilon$ approaches $0$. A mixed finite element method for the approximation of $u_\varepsilon$ is proposed, and the regularized finite element scheme is shown to be locally uniformly convergent. Numerical experiments provide empirical evidence for the efficient approximation of singular solutions $u$.
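The regularization itself is not reproduced in the abstract; as background, a standard identity of Krylov type underlying such Hamilton-Jacobi-Bellman reformulations of the planar Monge-Amp\`ere equation reads as follows (stated here for orientation, not as the paper's exact formulation).

```latex
% Background identity (Krylov): for a symmetric positive semidefinite
% 2x2 matrix A and the control set
% S = { B symmetric positive definite : det B = 1 },
\[
  \sqrt{\det A} \;=\; \min_{B \in \mathcal{S}} \tfrac{1}{2}\operatorname{tr}(AB),
\]
% so that, for convex u and f >= 0, the equation det D^2 u = f is
% formally equivalent to the Bellman-type equation
\[
  \min_{B \in \mathcal{S}} \left( \tfrac{1}{2}\operatorname{tr}\bigl(B\,D^2 u\bigr) - \sqrt{f} \right) = 0 .
\]
```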
The inverse problem of supervised reconstruction of depth-variable (time-dependent) parameters in a neural ordinary differential equation (NODE) is considered; that is, finding the weights of a residual network with time-continuous layers. The NODE is treated as an isolated entity describing the full network, as opposed to earlier research, which embedded it between pre- and post-appended layers trained by conventional methods. The proposed parameter reconstruction is done for a general first-order differential equation by minimizing a cost functional covering a variety of loss functions and penalty terms. A nonlinear conjugate gradient (NCG) method is derived for the minimization. Mathematical properties are stated for the differential equation and the cost functional. The adjoint problem needed is derived together with a sensitivity problem. The sensitivity problem can estimate changes in the network output under perturbation of the trained parameters. To preserve smoothness during the iterations, the Sobolev gradient is calculated and incorporated. As a proof of concept, numerical results are included for a NODE and two synthetic datasets, and compared with standard gradient approaches (not based on NODEs). The results show that the proposed method works well for deep learning with an infinite number of layers, and has built-in stability and smoothness.
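A minimal sketch of the adjoint computation behind such gradient-based training, for an Euler-discretized NODE with time-dependent parameters: the adjoint state is propagated backwards and yields the gradient of a terminal loss with respect to each layer's parameters. The discretization and the interface are assumptions; the paper's continuous formulation, NCG method, and Sobolev gradient are not reproduced here.

```python
import numpy as np

def node_adjoint_gradients(z0, thetas, dt, f, dfdz, dfdth, dLdz):
    """Discrete adjoint for an Euler-discretized NODE
    z_{k+1} = z_k + dt * f(z_k, theta_k), with terminal loss L(z_N).

    dfdz(z, th)  -> Jacobian of f w.r.t. z, shape (n, n)
    dfdth(z, th) -> Jacobian of f w.r.t. theta, shape (n, p)
    dLdz(z)      -> gradient of the loss at the final state, shape (n,)
    Returns dL/dtheta_k for every layer k.
    """
    # Forward pass: store the trajectory.
    zs = [np.asarray(z0, dtype=float)]
    for th in thetas:
        zs.append(zs[-1] + dt * f(zs[-1], th))
    # Backward pass: a_k = a_{k+1} (I + dt * df/dz(z_k, theta_k)).
    a = dLdz(zs[-1])
    grads = [None] * len(thetas)
    for k in reversed(range(len(thetas))):
        grads[k] = dt * a @ dfdth(zs[k], thetas[k])
        a = a + dt * a @ dfdz(zs[k], thetas[k])
    return grads
```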
The existence of simple, uncoupled no-regret dynamics that converge to correlated equilibria in normal-form games is a celebrated result in the theory of multi-agent systems. Specifically, it has been known for more than 20 years that when all players seek to minimize their internal regret in a repeated normal-form game, the empirical frequency of play converges to a normal-form correlated equilibrium. Extensive-form (that is, tree-form) games generalize normal-form games by modeling both sequential and simultaneous moves, as well as private information. Because of the sequential nature and the presence of partial information in the game, extensive-form correlation has significantly different properties from its normal-form counterpart, many of which are still open research directions. The extensive-form correlated equilibrium (EFCE) has been proposed as the natural extensive-form counterpart to the normal-form correlated equilibrium. However, it was previously unknown whether EFCE emerges as the result of uncoupled agent dynamics. In this paper, we give the first uncoupled no-regret dynamics that converge to the set of EFCEs in $n$-player general-sum extensive-form games with perfect recall. First, we introduce a notion of trigger regret in extensive-form games, which extends that of internal regret in normal-form games. When each player has low trigger regret, the empirical frequency of play is close to an EFCE. Then, we give an efficient no-trigger-regret algorithm. Our algorithm decomposes trigger regret into local subproblems at each decision point for the player, and constructs a global strategy for the player from the local solutions at each decision point.
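The trigger-regret minimizer itself is not specified in the abstract; as background for the local-subproblem decomposition, here is a sketch of regret matching, a standard local regret minimizer of the kind typically attached to each decision point. Class and method names are illustrative.

```python
import numpy as np

class RegretMatcher:
    """Local regret minimizer (regret matching) for one decision point."""

    def __init__(self, n_actions):
        self.regrets = np.zeros(n_actions)

    def strategy(self):
        # Play actions proportionally to their positive cumulative regret.
        pos = np.maximum(self.regrets, 0.0)
        total = pos.sum()
        return pos / total if total > 0 else np.full(len(pos), 1.0 / len(pos))

    def observe(self, utilities):
        # Accumulate regret of each action vs. the mixed strategy played.
        self.regrets += utilities - self.strategy() @ utilities
```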
Human conversation is a complex mechanism with subtle nuances. It is hence an ambitious goal to develop artificial intelligence agents that can participate fluently in a conversation. While we are still far from achieving this goal, recent progress in visual question answering, image captioning, and visual question generation shows that dialog systems may be realizable in the not too distant future. To this end, a novel dataset was introduced recently and encouraging results were demonstrated, particularly for question answering. In this paper, we demonstrate a simple symmetric discriminative baseline that can be applied both to predicting an answer and to predicting a question. We show that this method performs on par with the state of the art, including memory-network-based methods. In addition, for the first time on the visual dialog dataset, we assess the performance of a system asking questions, and demonstrate how a visual dialog can be generated from discriminative question generation and question answering.
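The architecture is not detailed in the abstract; the following schematic only illustrates what "symmetric discriminative" suggests: the same similarity-based ranking applies whether the candidates are answers or questions. The embedding inputs are assumed to be given.

```python
import numpy as np

def rank_candidates(context_emb, candidate_embs):
    """Symmetric discriminative scoring: rank candidate embeddings
    (answers or questions alike) by similarity to the fused context
    embedding (e.g. image + dialog history [+ question])."""
    scores = candidate_embs @ context_emb        # dot-product similarity
    return np.argsort(-scores)                   # best candidate first
```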
Options in reinforcement learning allow agents to hierarchically decompose a task into subtasks, having the potential to speed up learning and planning. However, autonomously learning effective sets of options is still a major challenge in the field. In this paper we focus on the recently introduced idea of using representation learning methods to guide the option discovery process. Specifically, we look at eigenoptions, options obtained from representations that encode diffusive information flow in the environment. We extend the existing algorithms for eigenoption discovery to settings with stochastic transitions and in which handcrafted features are not available. We propose an algorithm that discovers eigenoptions while learning non-linear state representations from raw pixels. It exploits recent successes in the deep reinforcement learning literature and the equivalence between proto-value functions and the successor representation. We use traditional tabular domains to provide intuition about our approach and Atari 2600 games to demonstrate its potential.
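A minimal tabular sketch of the eigenoption idea referenced above: compute the successor representation, take a non-trivial eigenvector, and define the intrinsic reward of the corresponding option as the change in that eigenvector between states (following the eigenoption literature; the paper's non-linear, pixel-based extension is not reproduced here). Function names are illustrative.

```python
import numpy as np

def eigenoption_rewards(P, gamma=0.9, k=1):
    """Intrinsic rewards from the successor representation (tabular).

    P : (n, n) row-stochastic transition matrix under a uniform policy.
    Returns the k-th SR eigenvector e (k=0 is the trivial one) and the
    reward matrix r[s, s'] = e[s'] - e[s] used to learn the option.
    """
    n = P.shape[0]
    sr = np.linalg.inv(np.eye(n) - gamma * P)   # successor representation
    vals, vecs = np.linalg.eig(sr)
    order = np.argsort(-vals.real)
    e = vecs[:, order[k]].real                  # k-th eigenvector
    return e, e[None, :] - e[:, None]
```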
We explore deep reinforcement learning methods for multi-agent domains. We begin by analyzing the difficulty of traditional algorithms in the multi-agent case: Q-learning is challenged by an inherent non-stationarity of the environment, while policy gradient suffers from a variance that increases as the number of agents grows. We then present an adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination. Additionally, we introduce a training regimen utilizing an ensemble of policies for each agent that leads to more robust multi-agent policies. We show the strength of our approach compared to existing methods in cooperative as well as competitive scenarios, where agent populations are able to discover various physical and informational coordination strategies.
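A minimal PyTorch sketch of the centralized-critic idea described above: during training the critic conditions on all agents' observations and actions, while each actor remains decentralized. Layer sizes and names are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class CentralizedCritic(nn.Module):
    """Critic conditioned on all agents' observations and actions, as in
    centralized-training / decentralized-execution actor-critic methods."""

    def __init__(self, obs_dims, act_dims, hidden=64):
        super().__init__()
        in_dim = sum(obs_dims) + sum(act_dims)
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, all_obs, all_acts):
        # Each agent's policy stays local; only the critic sees everything.
        return self.net(torch.cat(all_obs + all_acts, dim=-1))
```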