We consider a Bayesian persuasion (information design) problem in which the sender tries to persuade the receiver to take a particular action via a sequence of signals. We model this as a multi-phase trial in which different experiments are conducted based on the outcomes of prior experiments. In contrast to most of the literature, we impose constraints on the signals available to the sender: some of the experiments are fixed exogenously, and we call these determined experiments. This modeling captures real-world situations where such constraints arise: e.g., multi-phase drug trials where the FDA determines some of the experiments, funding of a startup by a venture capital firm, startup acquisition by large firms where late-stage assessments are determined by the potential acquirer, and multi-round job interviews where candidates signal initially by presenting their qualifications but the remaining screening procedures are determined by the interviewer. The non-determined experiments (signals) in the multi-phase trial are chosen by the sender to best persuade the receiver. With a binary state of the world, we first derive the optimal signaling policy in the only non-trivial configuration of a two-phase trial with binary-outcome experiments. We then generalize to multi-phase trials with binary-outcome experiments where the determined experiments can be placed at any chosen node of the trial tree, and present a dynamic programming algorithm that derives the optimal signaling policy using the structural insights of the two-phase solution. We also contrast the structure of the optimal signaling policy with classical Bayesian persuasion strategies to highlight the impact of the signaling constraints on the sender.
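To make the constrained-signaling setup concrete, the following toy sketch (all numbers, the receiver's threshold rule, and the grid search are illustrative assumptions, not the paper's dynamic program) computes a sender-optimal first-phase binary experiment in a two-phase trial whose second experiment is determined exogenously:

```python
import itertools
import numpy as np

# A minimal sketch: two-phase binary trial. The sender chooses the first
# (binary-outcome) experiment; the second experiment is "determined",
# i.e. its likelihoods q1, q0 are fixed exogenously. The receiver acts
# iff the final posterior on state=1 reaches a threshold t. All names
# (prior, t, q1, q0) are illustrative assumptions.

prior = 0.3          # P(state = 1)
t = 0.5              # receiver's action threshold on the posterior
q1, q0 = 0.8, 0.2    # determined phase-2 experiment: P(pass | state)

def posterior(p, l1, l0, outcome):
    """Bayes update of P(state=1)=p under likelihoods P(outcome=1|state)."""
    if outcome == 1:
        num, den = p * l1, p * l1 + (1 - p) * l0
    else:
        num, den = p * (1 - l1), p * (1 - l1) + (1 - p) * (1 - l0)
    return num / den if den > 0 else p

def prob_action(p1, p0):
    """P(receiver acts) when sender's phase-1 experiment has P(s=1|state)=p1,p0."""
    total = 0.0
    for s1, s2 in itertools.product([0, 1], repeat=2):  # phase-1, phase-2 outcomes
        pr_s1 = prior * (p1 if s1 else 1 - p1) + (1 - prior) * (p0 if s1 else 1 - p0)
        p_mid = posterior(prior, p1, p0, s1)
        pr_s2 = p_mid * (q1 if s2 else 1 - q1) + (1 - p_mid) * (q0 if s2 else 1 - q0)
        p_fin = posterior(p_mid, q1, q0, s2)
        if p_fin >= t:
            total += pr_s1 * pr_s2
    return total

grid = np.linspace(0, 1, 101)
best = max(((prob_action(a, b), a, b) for a in grid for b in grid))
print(f"P(act)={best[0]:.3f} at p1={best[1]:.2f}, p0={best[2]:.2f}")
```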
In recent work (Maierhofer & Huybrechs, 2022, Adv. Comput. Math.), the authors showed that least-squares oversampling can improve the convergence properties of collocation methods for boundary integral equations involving operators of certain pseudo-differential form. The underlying principle is that the discrete method approximates a Bubnov–Galerkin method in a suitable sense. In the present work, we extend this analysis to the case when the integral operator is perturbed by a compact operator $\mathcal{K}$ which is continuous as a map on Sobolev spaces on the boundary, i.e. $\mathcal{K}:H^{p}\rightarrow H^{q}$ for all $p,q\in\mathbb{R}$. This study is complicated by the fact that both the test and trial functions in the discrete Bubnov–Galerkin orthogonality conditions are modified relative to the unperturbed setting. Our analysis guarantees that previous results concerning optimal convergence rates and sufficient rates of oversampling are preserved in this more general setting. Indeed, for the first time, this analysis provides a complete explanation of the advantages of least-squares oversampled collocation for boundary integral formulations of the Laplace equation on arbitrary smooth Jordan curves in 2D. Our theoretical results are shown to be in very good agreement with numerical experiments.
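As a rough illustration of the kind of discretization being analyzed, the sketch below applies least-squares oversampled collocation to a simple second-kind equation $u - \mathcal{K}u = f$ with a smooth periodic kernel; the kernel, basis size, and oversampling factor are assumptions chosen for demonstration, not the operators treated in the paper:

```python
import numpy as np

# Illustrative sketch of least-squares oversampled collocation for a
# second-kind integral equation u - K u = f on the periodic interval
# [0, 2*pi), with a smooth kernel (a compact perturbation of the
# identity). All parameters below are demonstration choices.

N = 8                                  # Fourier modes -N..N  (2N+1 unknowns)
oversample = 2                         # collocation points per unknown
M = oversample * (2 * N + 1)           # number of collocation points
Q = 400                                # quadrature points for K

x = 2 * np.pi * np.arange(M) / M       # collocation points
y = 2 * np.pi * np.arange(Q) / Q       # quadrature points, weight 2*pi/Q
w = 2 * np.pi / Q

kernel = lambda s, t: np.exp(np.cos(s[:, None] - t[None, :])) / (2 * np.pi)

modes = np.arange(-N, N + 1)
trial_x = np.exp(1j * x[:, None] * modes[None, :])   # basis at collocation pts
trial_y = np.exp(1j * y[:, None] * modes[None, :])   # basis at quadrature pts

A = trial_x - w * kernel(x, y) @ trial_y             # (u - K u) applied to basis

u_exact = lambda s: np.cos(2 * s) + 0.5 * np.sin(s)  # manufactured solution
f = u_exact(x) - w * (kernel(x, y) @ u_exact(y))     # right-hand side

coeffs, *_ = np.linalg.lstsq(A, f, rcond=None)       # oversampled LS solve
u_approx = (trial_x @ coeffs).real
print("max collocation-point error:", np.abs(u_approx - u_exact(x)).max())
```

The rectangular system has more collocation conditions than unknowns; the least-squares solve is what makes the method behave like an approximate Bubnov–Galerkin method.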
Consider solving large sparse range-symmetric singular linear systems $A{\bf x} = {\bf b}$ which arise, for instance, in the discretization of convection-diffusion equations with periodic boundary conditions, and of partial differential equations for electromagnetic fields using the edge-based finite element method. In theory, the Generalized Minimal Residual (GMRES) method converges to the least-squares solution for inconsistent systems if the coefficient matrix $A$ is range symmetric, i.e. ${\rm R}(A)={\rm R}(A^{\rm T})$, where ${\rm R}(A)$ is the range space of $A$. In practice, however, GMRES may not converge due to numerical instability. In order to improve convergence, we propose using the pseudo-inverse to solve the severely ill-conditioned Hessenberg systems arising in GMRES. Numerical experiments on semi-definite inconsistent systems indicate that the method is efficient and robust. Finally, we further improve the convergence of the method by adding reorthogonalization to the Modified Gram–Schmidt procedure.
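The following sketch shows where the proposed modification sits inside a textbook GMRES loop: the small Hessenberg least-squares problem is solved with a pseudo-inverse, and an optional second Modified Gram–Schmidt pass provides the reorthogonalization. The example system is an illustrative singular symmetric (hence range-symmetric) matrix, not one of the paper's test problems:

```python
import numpy as np

# Minimal GMRES sketch (not the authors' code): the small Hessenberg
# least-squares problem min ||beta*e1 - H_k y|| is solved with the
# pseudo-inverse (np.linalg.pinv) instead of the usual QR update, for
# robustness when H_k is severely ill-conditioned. Modified Gram-Schmidt
# drives the Arnoldi process; `reorth` adds a second MGS pass.

def gmres_pinv(A, b, maxit=50, tol=1e-10, reorth=True):
    n = b.size
    beta = np.linalg.norm(b)
    V = np.zeros((n, maxit + 1)); V[:, 0] = b / beta
    H = np.zeros((maxit + 1, maxit))
    for k in range(maxit):
        w = A @ V[:, k]
        for i in range(k + 1):                    # modified Gram-Schmidt
            H[i, k] = V[:, i] @ w
            w -= H[i, k] * V[:, i]
        if reorth:                                # optional reorthogonalization
            for i in range(k + 1):
                c = V[:, i] @ w
                H[i, k] += c
                w -= c * V[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        e1 = np.zeros(k + 2); e1[0] = beta
        y = np.linalg.pinv(H[:k + 2, :k + 1]) @ e1   # pseudo-inverse LS solve
        res = np.linalg.norm(e1 - H[:k + 2, :k + 1] @ y)
        if H[k + 1, k] < 1e-14 or res < tol * beta:
            return V[:, :k + 1] @ y, res
        V[:, k + 1] = w / H[k + 1, k]
    return V[:, :maxit] @ y, res

# Example: a singular, symmetric (hence range-symmetric) semi-definite
# system; a random b is generically inconsistent.
n = 100
A = np.diag(np.arange(n, dtype=float))            # singular: A[0, 0] = 0
b = np.random.default_rng(0).standard_normal(n)
x, res = gmres_pinv(A, b)
print("residual of LS solution:", res)
```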
Iterative distributed optimization algorithms involve multiple agents that communicate with each other, over time, in order to minimize/maximize a global objective. In the presence of unreliable communication networks, the Age-of-Information (AoI), which measures the freshness of the data received, may be large and hence hinder algorithmic convergence. In this paper, we study the convergence of general distributed gradient-based optimization algorithms in the presence of communication that neither happens periodically nor at stochastically independent points in time. We show that convergence is guaranteed provided the random variables associated with the AoI processes are stochastically dominated by a random variable with finite first moment. This improves on previous requirements of boundedness of more than the first moment. We then introduce stochastically strongly connected (SSC) networks, a new stochastic form of strong connectedness for time-varying networks. We show that if, for any $p \ge 0$, the processes that describe the success of communication between agents in an SSC network are $\alpha$-mixing with $n^{p-1}\alpha(n)$ summable, then the associated AoI processes are stochastically dominated by a random variable with finite $p$-th moment. In combination with our first contribution, this implies that distributed stochastic gradient descent converges in the presence of AoI if $\alpha(n)$ is summable.
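A toy simulation of the setting (with assumed quadratic local objectives and i.i.d. Bernoulli link successes, so each link's AoI is geometric and hence dominated by a variable with finite first moment) illustrates gradient-based optimization under stale, aperiodic communication:

```python
import numpy as np

# Toy sketch, not the paper's algorithm or assumptions: distributed
# gradient descent where each agent mixes the *latest received* (possibly
# stale) neighbor iterates. Link successes are Bernoulli, so per-link AoI
# is geometric, matching the finite-first-moment dominance condition.

rng = np.random.default_rng(1)
m, T, step = 5, 2000, 0.05
targets = rng.standard_normal(m)            # f_i(x) = 0.5 * (x - targets[i])^2
x = np.zeros(m)                             # agents' current iterates
last_seen = np.zeros((m, m))                # last_seen[i, j]: j's iterate as known to i

for t in range(T):
    # each directed link (i, j) succeeds w.p. 0.3 -> geometric AoI
    success = rng.random((m, m)) < 0.3
    last_seen[success] = np.broadcast_to(x, (m, m))[success]
    np.fill_diagonal(last_seen, x)          # own state is always fresh
    mixed = last_seen.mean(axis=1)          # uniform mixing of stale info
    x = mixed - step * (x - targets)        # local gradient step

print("consensus value:", x.mean(), "optimum:", targets.mean())
print("disagreement:", np.abs(x - x.mean()).max())
```

With a constant step size the iterates settle in an $O(\text{step})$ neighborhood of the optimum, which suffices to visualize that staleness alone does not break convergence here.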
We develop an efficient Bayesian sequential inference framework for factor analysis models observed via various data types, such as continuous, binary and ordinal data. In the continuous-data case, where it is possible to marginalise over the latent factors, the proposed methodology tailors the Iterated Batch Importance Sampling (IBIS) of Chopin (2002) to handle such models, and we incorporate Hamiltonian Markov chain Monte Carlo. For binary and ordinal data, we develop an efficient IBIS scheme to handle the parameters and latent factors, combined with Laplace or Variational Bayes approximations. The methodology can be used in the context of sequential hypothesis testing via Bayes factors, which are known to have advantages over traditional null hypothesis testing. Moreover, the developed sequential framework offers multiple benefits even in non-sequential settings, by providing the posterior distribution, model evidence and scoring rules (under the prequential framework) in one run, and by offering a more robust alternative computational scheme to Markov chain Monte Carlo that can be useful for problematic target distributions.
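The skeleton below illustrates the IBIS mechanics on a deliberately simple toy model (unknown Gaussian mean) rather than a factor model: particles over the parameter are reweighted by each incremental likelihood, and a resample-move step with a random-walk Metropolis kernel targeting the partial posterior fires when the effective sample size degenerates:

```python
import numpy as np

# IBIS skeleton (Chopin, 2002) on an assumed toy model: theta ~ N(0, 9)
# prior, observations y_n ~ N(theta, 1). Not the paper's factor model.

rng = np.random.default_rng(2)
data = rng.normal(1.5, 1.0, size=200)            # stream of observations
P = 1000                                         # number of particles
theta = rng.normal(0, 3, size=P)                 # prior draws N(0, 9)
logw = np.zeros(P)

def loglik(th, y):                               # log-likelihood of y under each particle
    return -0.5 * ((y[:, None] - th[None, :]) ** 2).sum(axis=0)

for n, y_n in enumerate(data, start=1):
    logw += -0.5 * (y_n - theta) ** 2            # incremental log-likelihood
    w = np.exp(logw - logw.max()); w /= w.sum()
    if 1.0 / (w ** 2).sum() < P / 2:             # ESS below threshold
        idx = rng.choice(P, size=P, p=w)         # resample
        theta, logw = theta[idx], np.zeros(P)
        prop = theta + 0.5 * rng.standard_normal(P)  # move: RW Metropolis
        logacc = (loglik(prop, data[:n]) - prop ** 2 / 18) \
               - (loglik(theta, data[:n]) - theta ** 2 / 18)
        theta = np.where(np.log(rng.random(P)) < logacc, prop, theta)

w = np.exp(logw - logw.max()); w /= w.sum()
print("posterior mean of theta:", (w * theta).sum())
```

In the paper's binary/ordinal settings, the exact incremental likelihood above would be replaced by a Laplace or Variational Bayes approximation over the latent factors.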
Sender-receiver interactions, and specifically persuasion games, are widely researched in economic modeling and artificial intelligence. However, in the classic persuasion-games setting, the messages sent from the expert to the decision-maker (DM) are abstract or well-structured signals rather than natural language messages. This paper addresses the use of natural language in persuasion games. For this purpose, we conduct an online repeated interaction experiment. At each trial of the interaction, an informed expert aims to sell an uninformed decision-maker a vacation in a hotel by sending her a review that describes the hotel. While the expert is exposed to several scored reviews, the decision-maker observes only the single review sent by the expert, and her payoff, should she choose the hotel, is a random draw from the review score distribution available only to the expert. We also compare the behavioral patterns in this experiment to the equivalent patterns in similar experiments where the communication is based on the numerical values of the reviews rather than on their text, and observe substantial differences that can be explained through an equilibrium analysis of the game. We consider a number of modeling approaches for our verbal communication setup, differing from each other in the model type (deep neural network vs. linear classifier), the type of features used by the model (textual, behavioral or both) and the source of the textual features (DNN-based vs. hand-crafted). Our results demonstrate that given a prefix of the interaction sequence, our models can predict the future decisions of the decision-maker, particularly when a sequential modeling approach and hand-crafted textual features are applied. Further analysis of the hand-crafted textual features allows us to make initial observations about the aspects of text that drive decision making in our setup.
A Bayesian multivariate model with a structured covariance matrix for multi-way nested data is proposed. This flexible modeling framework allows for both positive and negative associations among clustered observations, and generalizes the well-known dependence structure implied by random effects. A conjugate shifted-inverse-gamma prior is proposed for the covariance parameters, which ensures that the covariance matrix remains positive definite under posterior analysis. A numerically efficient Gibbs sampling procedure is defined for balanced nested designs, and is validated using two simulation studies. For a top-layer unbalanced nested design, the procedure requires an additional data augmentation step. The proposed data augmentation procedure facilitates sampling latent variables from (truncated) univariate normal distributions, and avoids numerical computation of the inverse of the structured covariance matrix. The Bayesian multivariate (linear transformation) model is applied to two-way nested interval-censored event times to analyze differences in adverse events between three groups of patients who were randomly allocated to treatment with different stents (BIO-RESORT). The parameters of the structured covariance matrix represent unobserved heterogeneity in treatment effects and are examined to detect differential treatment effects.
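A small illustration of the key covariance structure (our simplified notation; the paper's multi-way model is richer): within a cluster of size $n$, a compound-symmetry matrix $\sigma^2 I_n + \tau J_n$ has eigenvalues $\sigma^2 + n\tau$ (once) and $\sigma^2$ ($n-1$ times), so it stays positive definite for negative $\tau$ down to $-\sigma^2/n$, which is exactly the region a shifted prior on $\tau$ must respect:

```python
import numpy as np

# Illustrative check, not the paper's Gibbs sampler: a random-intercept
# model forces the within-cluster covariance tau >= 0, whereas the
# structured matrix sigma2 * I + tau * J stays positive definite for any
# tau > -sigma2 / n, permitting negative associations.

def structured_cov(sigma2, tau, n):
    return sigma2 * np.eye(n) + tau * np.ones((n, n))

n, sigma2 = 4, 1.0
for tau in [0.5, 0.0, -0.2, -0.3]:      # boundary here is -sigma2/n = -0.25
    eig = np.linalg.eigvalsh(structured_cov(sigma2, tau, n))
    print(f"tau={tau:+.2f}: eigenvalues {np.round(eig, 3)}, PD={eig.min() > 0}")
```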
Most Deep Reinforcement Learning (Deep RL) algorithms require a prohibitively large number of training samples for learning complex tasks. Many recent works on speeding up Deep RL have focused on distributed training and simulation. While distributed training is often done on the GPU, simulation is not. In this work, we propose using GPU-accelerated RL simulations as an alternative to CPU-based ones. Using NVIDIA Flex, a GPU-based physics engine, we show promising speed-ups in learning various continuous-control locomotion tasks. With a single GPU and CPU core, we are able to train the Humanoid running task in less than 20 minutes, using 10-1000x fewer CPU cores than previous works. We also demonstrate the scalability of our simulator to multi-GPU settings for training more challenging locomotion tasks.
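The essence of GPU-accelerated simulation is stepping thousands of environments as one batched tensor operation rather than one process per CPU core. In the sketch below, numpy stands in for a GPU tensor backend and a toy point-mass replaces the Flex physics, so it conveys the structure rather than the actual engine:

```python
import numpy as np

# Sketch of batched simulation: all environments advance in a single
# vectorized step. Swapping the arrays to a GPU backend (the paper uses
# NVIDIA Flex) is what yields the reported speed-ups; the "physics"
# here is an assumed toy point-mass, not a locomotion task.

n_envs, dt = 4096, 0.01
pos = np.zeros((n_envs, 3))                      # batched state: positions
vel = np.zeros((n_envs, 3))                      # batched state: velocities

def step(pos, vel, actions):
    """One simulation step for all environments at once."""
    vel = vel + dt * (actions - 0.1 * vel)       # force + damping, batched
    pos = pos + dt * vel
    reward = -np.linalg.norm(pos, axis=1)        # stay near the origin
    return pos, vel, reward

rng = np.random.default_rng(3)
for t in range(100):
    actions = rng.standard_normal((n_envs, 3))   # placeholder random policy
    pos, vel, reward = step(pos, vel, actions)
print("mean reward over", n_envs, "parallel envs:", reward.mean())
```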
Machine learning models are becoming increasingly proficient at complex tasks. However, even for experts in the field, it can be difficult to understand what a model has learned. This hampers trust and acceptance, and obstructs the ability to correct the model. There is therefore a need for transparency in machine learning models. The development of transparent classification models has received much attention, but there are few developments aimed at achieving transparent Reinforcement Learning (RL) models. In this study we propose a method that enables an RL agent to explain its behavior in terms of the expected consequences of state transitions and outcomes. First, we define a translation of states and actions to a description that is easier for human users to understand. Second, we develop a procedure that enables the agent to obtain the consequences of a single action, as well as of its entire policy. The method calculates contrasts between the consequences of a policy derived from a user query and those of the agent's learned policy. Third, we construct a format for generating explanations. A pilot survey study was conducted to explore users' preferences for different explanation properties. Results indicate that human users tend to favor explanations about the policy rather than about single actions.
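A minimal sketch of consequence-based contrastive explanation (the chain world, outcome labels, and policies are invented stand-ins, not the study's agent): roll out the learned policy and a user-suggested foil from the same state, translate visited states into human-readable outcomes, and report the contrast:

```python
# Toy contrastive-explanation sketch on an assumed 4-state chain world.

outcome = {0: "start", 1: "mud", 2: "mud", 3: "goal"}    # state descriptions

def rollout(policy, state=0, horizon=10):
    """Follow a policy and record the human-readable outcome labels."""
    labels = []
    for _ in range(horizon):
        state = max(0, min(3, state + policy(state)))    # move left/right
        labels.append(outcome[state])
        if state == 3:
            break
    return labels

learned = lambda s: 1                        # agent: always move right
foil = lambda s: -1 if s == 1 else 1         # user query: "why not back up at mud?"

a, b = rollout(learned), rollout(foil)
print("Your suggestion leads to:", b)
print("My policy leads to:      ", a)
print(f"Contrast: my policy reaches the goal in {len(a)} steps; "
      f"yours visits 'mud' {b.count('mud')} times.")
```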
Although reinforcement learning methods can achieve impressive results in simulation, the real world presents two major challenges: generating samples is exceedingly expensive, and unexpected perturbations can cause proficient but narrowly-learned policies to fail at test time. In this work, we propose to learn how to quickly and effectively adapt online to new situations as well as to perturbations. To enable sample-efficient meta-learning, we consider learning online adaptation in the context of model-based reinforcement learning. Our approach trains a global model such that, when combined with recent data, the model can be rapidly adapted to the local context. Our experiments demonstrate that our approach enables simulated agents to adapt their behavior online to novel terrains, to a crippled leg, and in highly dynamic environments.
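The following sketch conveys the adapt-from-a-global-model idea with an assumed linear dynamics model and a ridge regression pulled toward the global weights; the paper's method is gradient-based meta-learning, so this is an analogy, not the algorithm itself:

```python
import numpy as np

# Minimal analogy: a global linear dynamics model is fit offline; online,
# it is rapidly adapted by re-solving a ridge regression on the K most
# recent transitions, regularized *toward the global weights* so a few
# samples suffice. A crippled-leg change is mimicked by altering the
# true dynamics. All dynamics and constants are assumptions.

rng = np.random.default_rng(4)
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])

def collect(A, n):                                # transitions s' = A s + noise
    S = rng.standard_normal((n, 2))
    return S, S @ A.T + 0.01 * rng.standard_normal((n, 2))

S, S1 = collect(A_true, 500)
W_global = np.linalg.lstsq(S, S1, rcond=None)[0]  # global model (offline)

A_new = A_true * np.array([[1.0, 1.0], [0.0, 0.3]])  # "perturbed" dynamics
S, S1 = collect(A_new, 16)                        # only K=16 recent samples
lam = 1.0                                         # pull toward the global model
W_adapt = np.linalg.solve(S.T @ S + lam * np.eye(2),
                          S.T @ S1 + lam * W_global)

print("global model error on new dynamics :", np.abs(W_global - A_new.T).max())
print("adapted model error on new dynamics:", np.abs(W_adapt - A_new.T).max())
```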
In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m} f_i(\mathbf{x})$ is (i) strongly convex and smooth, (ii) strongly convex, (iii) smooth, or (iv) just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvements in the condition-number dependence.
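For intuition, the sketch below runs accelerated gradient ascent on the dual of a consensus-constrained quadratic problem, where the affine constraint is $L\mathbf{x} = 0$ for a graph Laplacian $L$; each dual gradient evaluation only multiplies by $L$, i.e. requires neighbor-to-neighbor communication. The graph, step size, and objectives are illustrative assumptions:

```python
import numpy as np

# Toy instance: min sum_i 0.5*(x_i - a_i)^2 subject to L @ x = 0 (consensus
# on a connected graph). With x*(lam) = a - L @ lam minimizing the
# Lagrangian, the dual gradient is L @ x*(lam); multiplying by L is the
# only communication needed. Nesterov momentum is applied on top.

m = 5
L = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)   # path-graph Laplacian
L[0, 0] = L[-1, -1] = 1.0
a = np.arange(m, dtype=float)                          # local minimizers a_i

step = 1.0 / np.linalg.eigvalsh(L @ L).max()           # 1 / smoothness of dual
lam = y = np.zeros(m)
for k in range(200):
    grad = L @ (a - L @ y)                             # dual gradient (neighbor comms)
    lam_next = y + step * grad                         # ascent on the concave dual
    y = lam_next + k / (k + 3) * (lam_next - lam)      # Nesterov momentum
    lam = lam_next

x = a - L @ lam                                        # primal recovery
print("consensus solution:", np.round(x, 4), "vs mean(a):", a.mean())
```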