We study properties of confidence intervals (CIs) for the difference of two Bernoulli distributions' success parameters, $p_x - p_y$, where the goal is to obtain a CI of a given half-width at minimum sampling cost when the per-observation costs of the two distributions may differ. Assuming that we are provided with preliminary estimates of the success parameters, we propose three methods for constructing fixed-width CIs: (i) a two-stage sampling procedure, (ii) a sequential method that carries out sampling in batches, and (iii) an $\ell$-stage "look-ahead" procedure. We use Monte Carlo simulation to show that, under diverse success probability and observation cost scenarios, our proposed algorithms achieve significant cost savings over their baseline counterparts (up to 50\% for the two-stage procedure and up to 15\% for the sequential methods). Furthermore, for the battery of scenarios under study, our sequential-batches and $\ell$-stage "look-ahead" procedures approximately attain the nominal coverage while also meeting the desired width requirement. The sequential-batching method turned out to be more computationally efficient than the "look-ahead" method, with average running times at least an order of magnitude faster across all the scenarios tested.
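As a rough illustration of the cost-aware allocation such procedures build on, the sketch below computes first-stage sample sizes that minimize total sampling cost subject to a target Wald half-width, using preliminary estimates of the success probabilities. The function name and the plain Wald form are assumptions for illustration, not the paper's exact two-stage rule.

```python
import math
from scipy.stats import norm

def first_stage_allocation(p_x_hat, p_y_hat, c_x, c_y, half_width, alpha=0.05):
    """Cost-optimal sample sizes for a fixed-width Wald CI on p_x - p_y.

    Minimizes c_x*n_x + c_y*n_y subject to z * sqrt(v_x/n_x + v_y/n_y) <= half_width,
    where v_i = p_i*(1 - p_i) uses the preliminary estimates.
    """
    z = norm.ppf(1 - alpha / 2)
    v_x = p_x_hat * (1 - p_x_hat)
    v_y = p_y_hat * (1 - p_y_hat)
    budget = (half_width / z) ** 2                      # allowed variance of the estimator
    scale = math.sqrt(v_x * c_x) + math.sqrt(v_y * c_y)
    # Lagrangian solution: n_i proportional to sqrt(v_i / c_i).
    n_x = math.ceil(math.sqrt(v_x / c_x) * scale / budget)
    n_y = math.ceil(math.sqrt(v_y / c_y) * scale / budget)
    return n_x, n_y

# Example: observations from Y are cheaper, so more of the precision is bought there.
print(first_stage_allocation(p_x_hat=0.3, p_y_hat=0.5, c_x=4.0, c_y=1.0, half_width=0.05))
```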
We give an improved theoretical analysis of score-based generative modeling. Given a score estimate with small $L^2$ error (averaged across timesteps), we provide efficient convergence guarantees for any data distribution with a finite second moment, by either employing early stopping or assuming a smoothness condition on the score function of the data distribution. Our result does not rely on any log-concavity or functional inequality assumption, and its dependence on the smoothness is only logarithmic. In particular, we show that under only a finite second moment condition, approximating the following distributions to $\epsilon$-accuracy in reverse KL divergence can be done in $\tilde O\left(\frac{d \log (1/\delta)}{\epsilon}\right)$ steps: 1) the variance-$\delta$ Gaussian perturbation of any data distribution; 2) data distributions with $1/\delta$-smooth score functions. Our analysis also provides a quantitative comparison between different discrete approximations and may guide the choice of discretization points in practice.
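For context, the sketch below runs an Euler-Maruyama discretization of the reverse-time Ornstein-Uhlenbeck diffusion with early stopping at time $\delta$, the kind of discretized reverse process such step-count bounds refer to. A one-dimensional Gaussian is used so the exact score is available in closed form; the uniform time grid, parameters, and use of a known score are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def sample_reverse_ou(sigma0=2.0, T=5.0, n_steps=200, delta=1e-3, n_samples=5000, seed=0):
    """Euler-Maruyama discretization of the reverse-time OU diffusion, stopped early at delta,
    for a 1-D Gaussian data distribution whose score is known exactly."""
    rng = np.random.default_rng(seed)
    # Forward process dX = -X dt + sqrt(2) dW, so Var(X_t) = sigma0^2 e^{-2t} + 1 - e^{-2t}.
    var_t = lambda t: sigma0**2 * np.exp(-2 * t) + 1.0 - np.exp(-2 * t)
    score = lambda x, t: -x / var_t(t)                     # exact score of N(0, var_t)
    ts = np.linspace(T, delta, n_steps + 1)                # reverse-time grid, early-stopped at delta
    x = rng.standard_normal(n_samples) * np.sqrt(var_t(T)) # start near the stationary law
    for k in range(n_steps):
        t, h = ts[k], ts[k] - ts[k + 1]
        drift = x + 2 * score(x, t)                        # reverse-time drift for this forward SDE
        x = x + h * drift + np.sqrt(2 * h) * rng.standard_normal(n_samples)
    return x

samples = sample_reverse_ou()
print(np.var(samples))   # should be close to the early-stopped variance, here roughly 4
```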
This paper considers Bayesian persuasion for routing games in which information about the uncertain state of the network is provided by a traffic information system (TIS) using public signals. In this setup, the TIS commits to a signalling scheme, and participants form a posterior belief about the state of the network based on their prior beliefs and the received signal. They subsequently select routes minimizing their individual expected travel time under their posterior beliefs, giving rise to a Wardrop equilibrium. We investigate how the TIS can infer the prior beliefs held by the participants by designing suitable signalling schemes and observing the equilibrium flows under different signals. We show that, under mild conditions, a signalling scheme that allows exact inference of the prior exists. We then provide an iterative algorithm that finds such a scheme in a finite number of steps. We show that schemes designed by our algorithm are robust, in the sense that they can still identify the prior after a small enough perturbation. We also investigate the case where the population is divided among multiple priors and give conditions under which the fraction associated with each prior can be identified. Several examples illustrate our results.
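The sketch below shows the Bayes update by which participants form posterior beliefs from a prior and a committed public signalling scheme; the example state space, prior, and scheme are hypothetical.

```python
import numpy as np

def posterior_beliefs(prior, signal_scheme):
    """Posterior over network states for each public signal under a committed signalling scheme.

    prior: length-S array, prior belief over network states.
    signal_scheme: S x M array, signal_scheme[s, m] = P(signal m | state s).
    Returns an M x S array of posteriors (rows sum to 1 when the signal has positive probability).
    """
    joint = prior[:, None] * signal_scheme     # P(state, signal)
    marginal = joint.sum(axis=0)               # P(signal)
    post = np.zeros_like(joint.T)
    nz = marginal > 0
    post[nz] = (joint[:, nz] / marginal[nz]).T
    return post

# Two network states, two signals: signal 1 is sent far more often in state 1.
prior = np.array([0.7, 0.3])
scheme = np.array([[0.9, 0.1],
                   [0.2, 0.8]])
print(posterior_beliefs(prior, scheme))
```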
Purpose: To develop a novel interactive visualization approach for exploring radiotherapy treatment plans, focusing on overlap volumes with the aim of sparing healthy tissue. Methods: We propose a visualization approach that includes overlap volumes in the radiotherapy treatment plan evaluation process. Quantitative properties can be explored interactively to identify critical regions and used to steer the visualization for a detailed inspection of candidates. We evaluated our approach with a user study covering the individual visualizations and their interactions with regard to helpfulness, comprehensibility, intuitiveness, decision-making, and speed. Results: The user study was conducted with three domain experts, who used our software on five data sets, each representing a different cancer type and location, performed a set of tasks, and filled out a questionnaire. The results show that the visualizations and interactions help to identify and evaluate overlap volumes according to their physical and dose properties. Furthermore, the task of finding dose hot spots can also benefit from our approach. Conclusions: The results indicate the potential to enhance the current treatment plan evaluation process in terms of healthy tissue sparing.
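As a minimal sketch of the kind of quantitative overlap properties such a tool exposes, the code below computes the volume of the overlap between two segmented structures and summary dose statistics within it; the masks, dose grid, and structure names are toy assumptions, not the paper's data or pipeline.

```python
import numpy as np

def overlap_statistics(mask_a, mask_b, dose, voxel_volume_mm3):
    """Volume and dose summary of the overlap between two segmented structures.

    mask_a, mask_b: boolean 3-D arrays (e.g., target volume and an organ at risk).
    dose: 3-D array of planned dose (Gy) on the same grid.
    """
    overlap = mask_a & mask_b
    n_vox = int(overlap.sum())
    return {
        "volume_cm3": n_vox * voxel_volume_mm3 / 1000.0,
        "mean_dose_Gy": float(dose[overlap].mean()) if n_vox else 0.0,
        "max_dose_Gy": float(dose[overlap].max()) if n_vox else 0.0,
    }

# Toy 3-D grid: two overlapping spheres and a synthetic dose distribution.
z, y, x = np.mgrid[0:40, 0:40, 0:40]
ptv = (z - 20) ** 2 + (y - 20) ** 2 + (x - 18) ** 2 < 8 ** 2
oar = (z - 20) ** 2 + (y - 20) ** 2 + (x - 26) ** 2 < 8 ** 2
dose = 60.0 * np.exp(-((x - 18) ** 2 + (y - 20) ** 2 + (z - 20) ** 2) / 200.0)
print(overlap_statistics(ptv, oar, dose, voxel_volume_mm3=1.0))
```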
The ability to accurately predict an opponent's behavior is central to the safety and efficiency of robotic systems in interactive settings, such as human-robot interaction and multi-robot teaming tasks. Unfortunately, robots often lack access to key information on which these predictions may hinge, such as the opponent's goals, attention, and willingness to cooperate. Dual control theory addresses this challenge by treating unknown parameters of a predictive model as hidden states and inferring their values at runtime using information gathered during system operation. While able to optimally and automatically trade off exploration and exploitation, dual control is computationally intractable for general interactive motion planning. In this paper, we present a novel algorithmic approach that enables active uncertainty reduction for interactive motion planning based on the implicit dual control paradigm. Our approach relies on a sampling-based approximation of stochastic dynamic programming, leading to a model predictive control problem. The resulting policy is shown to preserve the dual control effect for a broad class of predictive models with both continuous and categorical uncertainty. To ensure the safe operation of the interacting agents, we leverage a supervisory control scheme, often referred to as ``shielding'', which overrides the ego agent's dual control policy with a safety fallback strategy when a safety-critical event is imminent. We then augment the dual control framework with an improved variant of the recently proposed shielding-aware robust planning scheme, which proactively balances nominal planning performance against the risk of high-cost emergency maneuvers triggered by low-probability opponent behaviors. We demonstrate the efficacy of our approach with both simulated driving examples and hardware experiments using 1/10-scale autonomous vehicles.
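A minimal sketch of the shielding idea described above: the dual control policy acts unless a safety monitor flags an imminent safety-critical event, in which case a fallback strategy overrides it. The monitor, policies, and the one-dimensional gap example are hypothetical placeholders, not the paper's implementation.

```python
def shielded_action(state, dual_control_policy, safety_monitor, fallback_policy):
    """Supervisory 'shielding' wrapper: use the dual control policy unless the safety
    monitor flags an imminent safety-critical event, in which case the fallback overrides it."""
    if safety_monitor(state):          # e.g., opponent reachable set intersects the ego plan
        return fallback_policy(state)  # certified evasive maneuver
    return dual_control_policy(state)  # nominal, information-gathering behavior

# Hypothetical 1-D example: brake hard if the gap to the opponent is too small.
policy = lambda s: {"accel": 1.0}
fallback = lambda s: {"accel": -5.0}
monitor = lambda s: s["gap_m"] < 5.0
print(shielded_action({"gap_m": 12.0}, policy, monitor, fallback))
print(shielded_action({"gap_m": 3.0}, policy, monitor, fallback))
```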
We consider two simple asynchronous opinion dynamics on arbitrary graphs where each node $u$ of the graph has an initial value $\xi_u(0)$. In the first process, the $NodeModel$, at each time step $t\ge 0$, a random node $u$ and a random sample of $k$ of its neighbours $v_1,v_2,\ldots,v_k$ are selected. Then $u$ updates its current value $\xi_u(t)$ to $\xi_u(t+1)=\alpha\xi_u(t)+\frac{(1-\alpha)}{k}\sum_{i=1}^k\xi_{v_i}(t)$, where $\alpha\in(0,1)$ and $k\ge1$ are parameters of the process. In the second process, the $EdgeModel$, at each step a random edge $(u,v)$ is selected, and node $u$ updates its value as in the $NodeModel$ with $k=1$ and $v$ as the selected neighbour. For both processes the values of all nodes converge to the same value $F$, which is a random variable depending on the random choices made in each step. For the $NodeModel$ on regular graphs, and for the $EdgeModel$ on arbitrary graphs, the expectation of $F$ is the average of the initial values, $\frac{1}{n}\sum_{u\in V}\xi_u(0)$. For the $NodeModel$ on non-regular graphs, the expectation of $F$ is the degree-weighted average of the initial values. Our results are two-fold. First, we consider the concentration of $F$ and show tight bounds on the variance of $F$ for regular graphs. We show that when the initial values do not depend on the number of nodes, the variance is negligible, and hence the nodes are able to estimate the initial average of the node values. Interestingly, this variance does not depend on the graph structure. For the proof we introduce a duality between our processes and a process of two correlated random walks. Second, we analyse the convergence time for both models on arbitrary graphs, showing bounds on the time $T_\varepsilon$ needed to make all node values `$\varepsilon$-close' to each other. Our bounds are asymptotically tight under some assumptions on the distribution of the starting values.
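A direct simulation of the $NodeModel$ update rule is straightforward; the sketch below applies it on a small cycle graph (the graph, $\alpha$, and $k$ are chosen only for illustration).

```python
import random

def node_model_step(values, neighbours, alpha, k, rng=random):
    """One asynchronous NodeModel update: a random node u mixes its value with
    the average of k randomly sampled neighbours."""
    u = rng.randrange(len(values))
    sampled = rng.sample(neighbours[u], k)   # sample k distinct neighbours (k <= degree)
    values[u] = alpha * values[u] + (1 - alpha) / k * sum(values[v] for v in sampled)

# Cycle on 6 nodes; all values converge to a random F with E[F] = average of initial values.
neighbours = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
values = [float(i) for i in range(6)]
for _ in range(20000):
    node_model_step(values, neighbours, alpha=0.5, k=1)
print(values)   # all entries close to a common random value F
```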
We propose a novel Bayesian inference framework for distributed differentially private linear regression. We consider a distributed setting where multiple parties hold parts of the data and share certain summary statistics of their portions perturbed with privacy-preserving noise. We develop a novel generative statistical model for the privately shared statistics, which exploits a useful distributional relation between the summary statistics of linear regression. Bayesian estimation of the regression coefficients is conducted mainly using Markov chain Monte Carlo algorithms, while we also provide a fast version that performs Bayesian estimation in a single iteration. The proposed methods have computational advantages over their competitors. We provide numerical results on both real and simulated data, which demonstrate that the proposed algorithms provide well-rounded estimation and prediction.
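As a rough sketch of the setting, the code below has each party release the linear-regression summary statistics $X^\top X$ and $X^\top y$ perturbed with Gaussian noise, and then aggregates the releases and solves the normal equations. The calibration of the noise to a formal privacy budget and the paper's Bayesian (MCMC) estimator are omitted; this is only a crude baseline under assumed parameters.

```python
import numpy as np

def noisy_summary_statistics(X, y, noise_scale, rng):
    """One party's privacy-preserving release: X'X and X'y with additive Gaussian noise."""
    S = X.T @ X
    noise = rng.normal(0, noise_scale, S.shape)
    S_noisy = S + (noise + noise.T) / 2                    # keep the released matrix symmetric
    z_noisy = X.T @ y + rng.normal(0, noise_scale, X.shape[1])
    return S_noisy, z_noisy

rng = np.random.default_rng(0)
beta_true = np.array([1.0, -2.0, 0.5])
releases = []
for _ in range(3):                                         # three data-holding parties
    X = rng.normal(size=(200, 3))
    y = X @ beta_true + rng.normal(0, 0.1, size=200)
    releases.append(noisy_summary_statistics(X, y, noise_scale=1.0, rng=rng))
S_tot = sum(S for S, _ in releases)
z_tot = sum(z for _, z in releases)
print(np.linalg.solve(S_tot, z_tot))                       # rough estimate of beta_true
```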
This paper presents a new convergent Plug-and-Play (PnP) algorithm. PnP methods are efficient iterative algorithms for solving image inverse problems formulated as the minimization of the sum of a data-fidelity term and a regularization term. PnP methods perform regularization by plugging a pre-trained denoiser into a proximal algorithm, such as Proximal Gradient Descent (PGD). To ensure convergence of PnP schemes, many works study specific parametrizations of deep denoisers. However, existing results require either unverifiable or suboptimal hypotheses on the denoiser, or assume restrictive conditions on the parameters of the inverse problem. Observing that these limitations may stem from the proximal algorithm in use, we study a relaxed version of the PGD algorithm for minimizing the sum of a convex function and a weakly convex one. When combined with a relaxed proximal denoiser, the proposed PnP-$\alpha$PGD algorithm is shown to converge for a wider range of regularization parameters, thus allowing more accurate image restoration.
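For intuition, the sketch below implements a generic relaxed plug-and-play proximal gradient iteration, with the relaxation applied as an averaging of the denoising step; the exact placement of the relaxation in the paper's PnP-$\alpha$PGD may differ, and the quadratic "denoiser" in the example is a toy stand-in for a pre-trained network.

```python
import numpy as np

def relaxed_pnp_pgd(y, A, denoiser, gamma, alpha, lam, n_iter=100, x0=None):
    """Relaxed PnP proximal gradient sketch for min_x 0.5*||A x - y||^2 + lam * phi(x),
    with the proximal step of phi replaced by a denoiser D:
        x_{k+1} = (1 - alpha) * x_k + alpha * D(x_k - gamma * A^T (A x_k - y))
    """
    x = np.zeros(A.shape[1]) if x0 is None else x0.copy()
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)                              # gradient of the data-fidelity term
        z = x - gamma * grad                                  # explicit gradient step
        x = (1 - alpha) * x + alpha * denoiser(z, lam * gamma)  # relaxed denoising step
    return x

# Toy example with a quadratic "denoiser" (the prox of s/2 * ||x||^2).
rng = np.random.default_rng(0)
A = rng.normal(size=(40, 20)); x_true = rng.normal(size=20)
y = A @ x_true + 0.05 * rng.normal(size=40)
denoiser = lambda z, s: z / (1.0 + s)
x_hat = relaxed_pnp_pgd(y, A, denoiser, gamma=1.0 / np.linalg.norm(A, 2) ** 2, alpha=0.9, lam=0.1)
print(np.linalg.norm(x_hat - x_true))
```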
We present an efficient robust value iteration for \texttt{s}-rectangular robust Markov Decision Processes (MDPs) with a time complexity comparable to that of standard (non-robust) MDPs, which is significantly faster than any existing method. We do so by deriving the optimal robust Bellman operator in concrete form using our $L_p$ water-filling lemma. We unveil the exact form of the optimal policies, which turn out to be novel threshold policies in which the probability of playing an action is proportional to its advantage.
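For contrast with the closed-form operator derived in the paper, the sketch below performs robust value-iteration sweeps for a toy MDP with $(s,a)$-rectangular $L_\infty$ uncertainty sets, solving each inner adversarial problem as a small linear program; this generic baseline is exactly the kind of per-state-action optimization that the $L_p$ water-filling approach avoids. All model parameters here are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def robust_bellman_backup(V, P0, R, gamma, beta):
    """One robust value-iteration sweep with (s,a)-rectangular L_inf uncertainty sets.

    V: length-S value vector, P0: nominal transitions of shape (S, A, S), R: rewards (S, A).
    """
    S, A, _ = P0.shape
    Q = np.zeros((S, A))
    for s in range(S):
        for a in range(A):
            # Adversary picks the worst transition law within an L_inf ball around the nominal.
            bounds = [(max(0.0, p - beta), min(1.0, p + beta)) for p in P0[s, a]]
            res = linprog(c=V, A_eq=np.ones((1, S)), b_eq=[1.0], bounds=bounds)
            Q[s, a] = R[s, a] + gamma * res.fun
    return Q.max(axis=1)

# Toy 3-state, 2-action MDP; iterate the robust backup to approximate convergence.
rng = np.random.default_rng(0)
P0 = rng.dirichlet(np.ones(3), size=(3, 2))
R = rng.uniform(size=(3, 2))
V = np.zeros(3)
for _ in range(200):
    V = robust_bellman_backup(V, P0, R, gamma=0.9, beta=0.1)
print(V)
```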
Given a dataset on actions and resulting long-term rewards, a direct estimation approach fits value functions that minimize prediction error on the training data. Temporal difference learning (TD) methods instead fit value functions by minimizing the degree of temporal inconsistency between estimates made at successive time steps. Focusing on finite state Markov chains, we provide a crisp asymptotic theory of the statistical advantages of this approach. First, we show that an intuitive inverse trajectory pooling coefficient completely characterizes the percent reduction in mean-squared error of value estimates. Depending on problem structure, the reduction could be enormous or nonexistent. Next, we prove that there can be dramatic improvements in estimates of the difference in value-to-go for two states: TD's errors are bounded in terms of a novel measure, the problem's trajectory crossing time, which can be much smaller than the problem's time horizon.
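The sketch below contrasts the two estimators on a toy finite-state chain: a direct (Monte Carlo) estimate that averages observed discounted returns, and a TD(0) estimate that reduces temporal inconsistency between successive states. The chain, rewards, and step size are illustrative assumptions, not the paper's setting.

```python
import numpy as np

def estimate_values(trajectories, n_states, gamma, td_lr=0.05, td_sweeps=50):
    """Direct (Monte Carlo) vs TD(0) value estimates from the same episodes.

    trajectories: list of episodes, each a list of (state, reward) pairs.
    """
    # Direct estimation: average the observed discounted return from each visited state.
    returns = [[] for _ in range(n_states)]
    for ep in trajectories:
        G = 0.0
        for s, r in reversed(ep):
            G = r + gamma * G
            returns[s].append(G)
    v_mc = np.array([np.mean(rs) if rs else 0.0 for rs in returns])

    # TD(0): repeatedly shrink the temporal inconsistency between successive estimates.
    v_td = np.zeros(n_states)
    for _ in range(td_sweeps):
        for ep in trajectories:
            for (s, r), (s_next, _) in zip(ep[:-1], ep[1:]):
                v_td[s] += td_lr * (r + gamma * v_td[s_next] - v_td[s])
            s_last, r_last = ep[-1]
            v_td[s_last] += td_lr * (r_last - v_td[s_last])   # terminal transition
    return v_mc, v_td

# Toy chain: states 0..4, move right w.p. 0.7 else stay; reward 1 only on reaching the end.
rng = np.random.default_rng(0)
episodes = []
for _ in range(200):
    s, ep = 0, []
    while s != 4:
        ep.append((s, 0.0))
        s = min(s + 1, 4) if rng.random() < 0.7 else s
    ep.append((4, 1.0))
    episodes.append(ep)
v_mc, v_td = estimate_values(episodes, n_states=5, gamma=0.95)
print(v_mc, v_td, sep="\n")
```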
A cyber range is an environment used for training security experts and for testing attack and defence tools and procedures. Usually, a cyber range simulates one or more critical infrastructures that attacking (red) and defending (blue) teams must compromise and protect, respectively. The infrastructure can be physically assembled, but it is much more convenient to rely on the Infrastructure as a Service (IaaS) paradigm. Although some modern technologies support IaaS, the design and deployment of scenarios of interest remain largely manual operations. As a consequence, it is common practice for a cyber range to host a few (sometimes only one) consolidated scenarios. However, reusing the same scenario may significantly reduce the effectiveness of training and testing sessions. In this paper, we propose a framework for automating the definition and deployment of arbitrarily complex cyber range scenarios. The framework relies on the virtual scenario description language (VSDL), a domain-specific language for defining high-level features of the desired infrastructure while hiding low-level details. The semantics of VSDL is given in terms of constraints that must be satisfied by the virtual infrastructure. These constraints are then submitted to an SMT solver to check the satisfiability of the specification. If satisfiable, the specification gives rise to a model that is automatically converted into a set of deployment scripts to be submitted to the IaaS provider.
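A minimal sketch of the satisfiability-checking step, using the z3 SMT solver from Python; the variables and constraints are hypothetical stand-ins for what a VSDL specification would generate, not the framework's actual encoding.

```python
# Reduce a (hypothetical) scenario specification to constraints and check it with z3.
from z3 import Int, Solver, And, sat

web_servers = Int("web_servers")
db_servers = Int("db_servers")
vcpus_per_host = Int("vcpus_per_host")

spec = And(
    web_servers >= 2,                                     # at least two web servers in the DMZ
    db_servers >= 1,                                      # at least one database host
    vcpus_per_host >= 2,
    (web_servers + db_servers) * vcpus_per_host <= 32,    # IaaS quota on total vCPUs
)

s = Solver()
s.add(spec)
if s.check() == sat:
    model = s.model()                                     # a concrete infrastructure satisfying the spec
    print({str(d): model[d] for d in model.decls()})      # would be turned into deployment scripts
else:
    print("specification is unsatisfiable")
```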