Differential privacy is typically ensured by perturbation with additive noise that is sampled from a known distribution. Conventionally, independent and identically distributed (i.i.d.) noise samples are added to each coordinate. In this work, we propose to add noise that is independent, but not identically distributed (i.n.i.d.), across the coordinates. In particular, we study the i.n.i.d. Gaussian and Laplace mechanisms and obtain the conditions under which these mechanisms guarantee privacy. The optimal choice of parameters that ensures these conditions is derived theoretically. Theoretical analyses and numerical simulations show that the i.n.i.d. mechanisms achieve higher utility for a given privacy requirement compared to their i.i.d. counterparts.
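As a concrete illustration (a minimal NumPy sketch, not the paper's exact construction), an i.n.i.d. Laplace release uses per-coordinate scales $b_i$; a simple sufficient condition for $\varepsilon$-DP, read off the standard Laplace-mechanism bound on the density ratio, is $\sum_i \Delta_i/b_i \le \varepsilon$, where $\Delta_i$ is the $i$-th coordinate's sensitivity. The paper derives sharper conditions and the optimal choice of scales.

```python
import numpy as np

def inid_laplace(query_output, coord_sensitivities, scales, epsilon, rng=None):
    """Release a d-dimensional query with i.n.i.d. Laplace noise, scale b_i per coordinate.

    Sufficient condition checked here: sum_i Delta_i / b_i <= epsilon (conservative;
    the paper's optimal parameter choice may be sharper).
    """
    rng = np.random.default_rng() if rng is None else rng
    q = np.asarray(query_output, dtype=float)
    delta = np.asarray(coord_sensitivities, dtype=float)
    b = np.asarray(scales, dtype=float)
    if np.sum(delta / b) > epsilon + 1e-12:
        raise ValueError("scales do not satisfy sum_i Delta_i / b_i <= epsilon")
    return q + rng.laplace(loc=0.0, scale=b, size=q.shape)

# i.i.d. baseline for comparison: b_i = (sum_i Delta_i) / epsilon for every coordinate.
```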
We address the problem of synthesizing distorting mechanisms that maximize infinite horizon privacy for Networked Control Systems (NCSs). We consider stochastic LTI systems where information about the system state is obtained through noisy sensor measurements and transmitted to a (possibly adversarial) remote station via unsecured/public communication networks to compute control actions (a remote LQR controller). Because the network/station is untrustworthy, adversaries might access sensor and control data and estimate the system state. To mitigate this risk, we pass sensor and control data through distorting (privacy-preserving) mechanisms before transmission and send the distorted data through the communication network. These mechanisms consist of a linear coordinate transformation and additive dependent Gaussian vectors. We formulate the synthesis of the distorting mechanisms as a convex program. In this convex program, we minimize the infinite horizon mutual information (our privacy metric) between the system state and its optimal estimate at the remote station for a desired upper bound on the control performance degradation (LQR cost) induced by the distortion mechanism.
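Schematically (our notation, for orientation only, not the paper's), the sensor-side mechanism and the synthesis problem take the form
\[
\tilde{y}_k \;=\; G\, y_k + v_k,\qquad v_k \sim \mathcal{N}(0,\Sigma_v),
\]
\[
\min_{G,\;\Sigma_v}\;\; \limsup_{N\to\infty}\frac{1}{N}\, I\big(x^N;\hat{x}^N\big)
\quad\text{subject to an upper bound on the induced LQR cost degradation},
\]
where $x^N$ collects the system states, $\hat{x}^N$ their optimal estimates at the remote station, and an analogous distortion is applied to the control data before transmission.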
We study the concurrent composition properties of interactive differentially private mechanisms, whereby an adversary can arbitrarily interleave its queries to the different mechanisms. We prove that all composition theorems for non-interactive differentially private mechanisms extend to the concurrent composition of interactive differentially private mechanisms, whenever differential privacy is measured using the hypothesis testing framework of $f$-DP, which captures standard $(\varepsilon,\delta)$-DP as a special case. We prove the concurrent composition theorem by showing that every interactive $f$-DP mechanism can be simulated by interactive post-processing of a non-interactive $f$-DP mechanism. In concurrent and independent work, Lyu~\cite{lyu2022composition} proves a similar result to ours for $(\varepsilon,\delta)$-DP, as well as a concurrent composition theorem for R\'enyi DP. We also provide a simple proof of Lyu's concurrent composition theorem for R\'enyi DP. Lyu leaves the general case of $f$-DP as an open problem, which we solve in this paper.
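For reference, a mechanism $M$ is $f$-DP if, for every pair of neighboring datasets $D \sim D'$, the trade-off function between the output distributions dominates $f$:
\[
T\big(M(D),M(D')\big)(\alpha)\;=\;\inf\{\beta_\phi : \alpha_\phi\le\alpha\}\;\ge\; f(\alpha)\qquad\text{for all }\alpha\in[0,1],
\]
where the infimum is over rejection rules $\phi$ with type I error $\alpha_\phi$ and type II error $\beta_\phi$; standard $(\varepsilon,\delta)$-DP is recovered with $f_{\varepsilon,\delta}(\alpha)=\max\{0,\,1-\delta-e^{\varepsilon}\alpha,\,e^{-\varepsilon}(1-\delta-\alpha)\}$.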
When training a machine learning model with differential privacy, one sets a privacy budget. This budget represents a maximal privacy violation that any user is willing to face by contributing their data to the training set. We argue that this approach is limited because different users may have different privacy expectations. Thus, setting a uniform privacy budget across all points may be overly conservative for some users or, conversely, not sufficiently protective for others. In this paper, we capture these preferences through individualized privacy budgets. To demonstrate their practicality, we introduce a variant of Differentially Private Stochastic Gradient Descent (DP-SGD) which supports such individualized budgets. DP-SGD is the canonical approach to training models with differential privacy. We modify its data sampling and gradient noising mechanisms to arrive at our approach, which we call Individualized DP-SGD (IDP-SGD). Because IDP-SGD provides privacy guarantees tailored to the preferences of individual users and their data points, we find it empirically improves privacy-utility trade-offs.
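One way to picture the individualization (a simplified sketch under our own assumptions, not the authors' exact mechanisms) is to give each example its own Poisson sampling probability while keeping the usual clipping and Gaussian noising of DP-SGD:

```python
import numpy as np

def individualized_dp_sgd_step(params, per_example_grads, sample_probs,
                               clip_norm, noise_multiplier, lr, rng=None):
    """One DP-SGD-style step with per-example (individualized) sampling rates.

    Sketch only: examples with stricter budgets get smaller inclusion probabilities
    q_i; IDP-SGD additionally individualizes the noising mechanism and calibrates
    (q_i, sigma) to each user's (eps_i, delta_i).
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = np.asarray(sample_probs, dtype=float)
    include = rng.random(len(per_example_grads)) < probs   # per-example Poisson sampling
    summed = np.zeros_like(params)
    for g, keep in zip(per_example_grads, include):
        if keep:
            g = np.asarray(g, dtype=float)
            summed += g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))  # clip to C
    noisy = summed + rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    return params - lr * noisy / max(float(probs.sum()), 1.0)  # normalize by expected batch size
```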
The Gaussian mixed-effects model driven by a stationary integrated Ornstein-Uhlenbeck process has been used for analyzing longitudinal data with an explicit and simple serial-correlation structure in each individual. However, the theoretical aspects of its asymptotic inference have yet to be elucidated. We prove local asymptotics for the associated log-likelihood function, which in particular guarantees the asymptotic optimality of a suitably chosen maximum-likelihood estimator. We illustrate the resulting asymptotic normality through simulations for both balanced and unbalanced datasets.
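In the Le Cam sense, local asymptotics of this type typically means a LAN expansion of the log-likelihood $\ell_n$: for a suitable rate matrix $r_n\to 0$ and Fisher information $\mathcal{I}(\theta_0)$,
\[
\ell_n(\theta_0 + r_n u)-\ell_n(\theta_0)\;=\;u^\top \Delta_n-\tfrac{1}{2}\,u^\top \mathcal{I}(\theta_0)\,u+o_p(1),
\qquad \Delta_n \xrightarrow{d} \mathcal{N}\big(0,\mathcal{I}(\theta_0)\big),
\]
which, by the Hájek-Le Cam convolution and minimax theorems, yields the asymptotic efficiency of the maximum-likelihood estimator.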
Long-run covariance matrix estimation is the building block of time series inference problems. The corresponding difference-based estimator, which avoids detrending, has attracted considerable interest due to its robustness to both smooth and abrupt structural breaks and its competitive finite sample performance. However, existing methods mainly focus on estimators for the univariate process, while their direct multivariate extensions for most linear models are asymptotically biased. We propose a novel difference-based and debiased long-run covariance matrix estimator for functional linear models with time-varying regression coefficients, allowing time series non-stationarity, long-range dependence, state-heteroscedasticity and their mixtures. We apply the new estimator to (i) the structural stability test, overcoming the notorious non-monotonic power phenomenon caused by piecewise smooth alternatives for regression coefficients, and (ii) the nonparametric residual-based tests for long memory, improving the performance via the residual-free formula of the proposed estimator. The effectiveness of the proposed method is justified theoretically and demonstrated by superior performance in simulation studies, while its usefulness is illustrated through real data analysis.
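For orientation, in the stationary case the target is the long-run covariance matrix of the error process $\{e_i\}$,
\[
\Sigma \;=\; \sum_{k=-\infty}^{\infty}\operatorname{Cov}(e_0, e_k),
\]
and difference-based estimators build it from local differences of the observations so that a (piecewise) smooth signal cancels without explicit detrending; the estimator proposed here extends and debiases this idea for functional linear models with time-varying coefficients.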
To accurately make adaptation decisions, a self-adaptive system needs precise means to analyze itself at runtime. To this end, runtime verification can be used in the feedback loop to check that the managed system satisfies its requirements formalized as temporal-logic properties. These requirements, however, may change due to system evolution or uncertainty in the environment, managed system, and requirements themselves. Thus, the properties under investigation by the runtime verification have to be dynamically adapted to represent the changing requirements while preserving the knowledge about requirements satisfaction gathered thus far, all with minimal latency. To address this need, we present a runtime verification approach for self-adaptive systems with changing requirements. Our approach uses property specification patterns to automatically obtain automata with precise semantics that are the basis for runtime verification. The automata can be safely adapted during runtime verification while preserving intermediate verification results to seamlessly reflect requirement changes and enable continuous verification. We evaluate our approach on an Arduino prototype of the Body Sensor Network and the Timescales benchmark. Results show that our approach is over five times faster than the typical approach of redeploying and restarting runtime monitors to reflect requirements changes, while improving the system's trustworthiness by avoiding interruptions of verification.
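For instance, the classical response pattern with global scope, "Globally, $S$ responds to $P$", corresponds to the LTL property
\[
\Box\,\big(P \rightarrow \Diamond S\big),
\]
from which a monitoring automaton with precise semantics can be generated and later adapted when the requirement changes (e.g., when $P$ or $S$ is replaced or the scope is narrowed).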
This tutorial studies relationships between differential privacy and various information-theoretic measures, drawing on a selection of representative articles. In particular, we present how these connections can provide new interpretations for the privacy guarantee in systems that deploy differential privacy in an information-theoretic framework. To this end, the tutorial provides an extensive summary of the existing literature that makes use of information-theoretic measures and tools such as mutual information, min-entropy, Kullback-Leibler divergence and the rate-distortion function for the quantification and characterization of differential privacy in various settings.
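A representative connection of this kind: $\varepsilon$-differential privacy is exactly a bound on the max-divergence between the output distributions on neighboring datasets,
\[
D_\infty\big(M(D)\,\|\,M(D')\big)\;=\;\sup_{S}\,\log\frac{\Pr[M(D)\in S]}{\Pr[M(D')\in S]}\;\le\;\varepsilon,
\]
and, since Rényi divergences are non-decreasing in their order, the Kullback-Leibler divergence between the same distributions is also at most $\varepsilon$.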
The estimation of the generalization error of classifiers often relies on a validation set. Such a set is rarely available in few-shot learning scenarios, a shortcoming that is largely overlooked in the field. In these scenarios, it is common to rely on features extracted from pre-trained neural networks combined with distance-based classifiers such as nearest class mean. In this work, we introduce a Gaussian model of the feature distribution. By estimating the parameters of this model, we are able to predict the generalization error on new classification tasks with few samples. We observe that accurate distance estimates between class-conditional densities are the key to accurate estimates of the generalization performance. Therefore, we propose an unbiased estimator for these distances and integrate it in our numerical analysis. We empirically show that our approach outperforms alternatives such as the leave-one-out cross-validation strategy.
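A standard bias correction for distances between class means illustrates the idea (a sketch under the Gaussian model; it is not claimed to coincide with the paper's estimator): since $\mathbb{E}\|\bar{x}_1-\bar{x}_2\|^2 = \|\mu_1-\mu_2\|^2 + \operatorname{tr}(\Sigma_1)/n_1 + \operatorname{tr}(\Sigma_2)/n_2$, subtracting unbiased trace estimates yields an unbiased estimate of the squared distance.

```python
import numpy as np

def unbiased_squared_distance(x1, x2):
    """Unbiased estimate of ||mu_1 - mu_2||^2 from two samples of shape (n_c, d), d >= 2.

    The plug-in estimate ||xbar_1 - xbar_2||^2 is biased upward by
    tr(Sigma_1)/n_1 + tr(Sigma_2)/n_2; subtracting unbiased trace estimates
    (sample covariances with ddof=1) removes the bias.
    """
    x1, x2 = np.asarray(x1, dtype=float), np.asarray(x2, dtype=float)
    n1, n2 = x1.shape[0], x2.shape[0]
    plug_in = float(np.sum((x1.mean(axis=0) - x2.mean(axis=0)) ** 2))
    correction = np.trace(np.cov(x1, rowvar=False)) / n1 + np.trace(np.cov(x2, rowvar=False)) / n2
    return plug_in - correction
```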
Preferential Bayesian optimization (PBO) is a framework for optimizing a decision maker's latent utility function using preference feedback. This work introduces the expected utility of the best option (qEUBO) as a novel acquisition function for PBO. When the decision maker's responses are noise-free, we show that qEUBO is one-step Bayes optimal and thus equivalent to the popular knowledge gradient acquisition function. We also show that qEUBO enjoys an additive constant approximation guarantee to the one-step Bayes-optimal policy when the decision maker's responses are corrupted by noise. We provide an extensive evaluation of qEUBO and demonstrate that it outperforms the state-of-the-art acquisition functions for PBO across many settings. Finally, we show that, under sufficient regularity conditions, qEUBO's Bayesian simple regret converges to zero at a rate $o(1/n)$ as the number of queries, $n$, goes to infinity. In contrast, we show that simple regret under qEI, a popular acquisition function for standard BO often used for PBO, can fail to converge to zero. Enjoying superior performance, simple computation, and a grounded decision-theoretic justification, qEUBO is a promising acquisition function for PBO.
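Assuming qEUBO of a query $X=\{x_1,\dots,x_q\}$ is the posterior expected utility of its best option, $\mathrm{qEUBO}(X)=\mathbb{E}_n[\max_i f(x_i)]$, it admits a straightforward Monte Carlo estimate from the joint GP posterior at $X$ (an illustrative sketch, not the authors' implementation):

```python
import numpy as np

def qeubo_monte_carlo(post_mean, post_cov, n_samples=4096, rng=None):
    """Monte Carlo estimate of E[max_i f(x_i)] under a Gaussian posterior at the query points.

    post_mean (shape (q,)) and post_cov (shape (q, q)) are the GP posterior moments
    of the latent utility at the q options in the query.
    """
    rng = np.random.default_rng() if rng is None else rng
    f = rng.multivariate_normal(np.asarray(post_mean, float),
                                np.asarray(post_cov, float),
                                size=n_samples)       # (n_samples, q) joint utility draws
    return float(f.max(axis=1).mean())                # average best-option utility
```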
Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in the neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation, or neither of these. These findings cast doubt on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.
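A schematic version of the link in question writes the residual update with an explicit depth scaling $\delta_L$,
\[
x_{k+1} \;=\; x_k \;+\; \delta_L\, f\big(x_k,\theta_k\big),\qquad k=0,\dots,L-1,
\]
so that the neural ODE limit $\dot{x}(s)=f\big(x(s),\theta(s)\big)$ arises as $L\to\infty$ when $\delta_L\sim 1/L$ and the trained weights behave like evaluations $\theta_k\approx\theta(k/L)$ of a smooth function of the layer index; the experiments here probe whether, and under what scaling, this actually holds for trained ResNets.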