Firms and statistical agencies must protect the privacy of the individuals whose data they collect, analyze, and publish. Increasingly, these organizations do so by using publication mechanisms that satisfy differential privacy. We consider the problem of choosing such a mechanism so as to maximize the value of its output to end users. We show that this is a constrained information design problem, and characterize its solution. When the underlying database is drawn from a symmetric distribution -- for instance, if individuals' data are i.i.d. -- we show that the problem's dimensionality can be reduced, and that its solution belongs to a simpler class of mechanisms. When, in addition, data users have supermodular payoffs, we show that the simple geometric mechanism is always optimal by using a novel comparative static that ranks information structures according to their usefulness in supermodular decision problems.
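As a minimal sketch of the geometric mechanism named above, the following assumes the standard two-sided geometric noise for an integer count with sensitivity 1, where the noise pmf is proportional to exp(-epsilon * |z|). The function names and the parameter `epsilon` are illustrative assumptions, not taken from the paper.

```python
import math
import random


def two_sided_geometric_pmf(z, epsilon):
    """P(Z = z) for two-sided geometric noise with alpha = exp(-epsilon)."""
    alpha = math.exp(-epsilon)
    return (1 - alpha) / (1 + alpha) * alpha ** abs(z)


def sample_two_sided_geometric(epsilon, rng):
    """Draw noise as the difference of two i.i.d. geometric variables."""
    alpha = math.exp(-epsilon)

    def failures():
        # Inverse-transform sampling: P(G >= k) = alpha**k for k = 0, 1, ...
        u = 1.0 - rng.random()  # u lies in (0, 1], so log(u) is defined
        return int(math.floor(math.log(u) / math.log(alpha)))

    return failures() - failures()


def geometric_mechanism(true_count, epsilon, rng=None):
    """Release an integer count perturbed by two-sided geometric noise."""
    rng = rng or random.Random()
    return true_count + sample_two_sided_geometric(epsilon, rng)
```

For example, `geometric_mechanism(10, 0.5, random.Random(0))` returns a noisy integer count; adjacent noise values differ in probability by a factor of exp(epsilon), which is the property behind the privacy guarantee.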
Mobile edge computing (MEC) is a promising paradigm to meet the quality of service (QoS) requirements of latency-sensitive IoT applications. However, attackers may eavesdrop on the offloading decisions to infer the edge server's (ES's) queue information and users' usage patterns, thereby incurring the pattern privacy (PP) issue. Therefore, we propose an offloading strategy which jointly minimizes the latency, ES's energy consumption, and task dropping rate, while preserving PP. Firstly, we formulate the dynamic computation offloading procedure as a Markov decision process (MDP). Next, we develop a Differential Privacy Deep Q-learning based Offloading (DP-DQO) algorithm to solve this problem while addressing the PP issue by injecting noise into the generated offloading decisions. This is achieved by modifying the deep Q-network (DQN) with a Function-output Gaussian process mechanism. We provide a theoretical privacy guarantee and a utility guarantee (learning error bound) for the DP-DQO algorithm and finally, conduct simulations to evaluate the performance of our proposed algorithm by comparing it with greedy and DQN-based algorithms.
Today's online platforms rely heavily on recommendation systems to serve content to their users; social media is a prime example. In turn, recommendation systems largely depend on artificial intelligence algorithms to decide who gets to see what. While the content social media platforms deliver is as varied as the users who engage with them, it has been shown that platforms can contribute to serious harm to individuals, groups and societies. Studies have suggested that these negative impacts range from worsening an individual's mental health to driving society-wide polarisation capable of putting democracies at risk. To better safeguard people from these harms, the European Union's Digital Services Act (DSA) requires platforms, especially those with large numbers of users, to make their algorithmic systems more transparent and follow due diligence obligations. These requirements constitute an important legislative step towards mitigating the systemic risks posed by online platforms. However, the DSA lacks concrete guidelines to operationalise a viable audit process that would allow auditors to hold these platforms accountable. This void could foster the spread of 'audit-washing', that is, platforms exploiting audits to legitimise their practices and neglect responsibility. To fill this gap, we propose a risk-scenario-based audit process. We explain in detail what audits and assessments of recommender systems according to the DSA should look like. Our approach also considers the evolving nature of platforms and emphasises the observability of their recommender systems' components. The resulting audit facilitates internal (among audits of the same system at different moments in time) and external comparability (among audits of different platforms) while also affording the evaluation of mitigation measures implemented by the platforms themselves.
In this paper, we propose efficient first- and second-order time-stepping schemes for the evolutional Navier-Stokes-Nernst-Planck-Poisson equations. The proposed schemes are constructed using an auxiliary variable reformulation and a careful treatment of the terms coupling the different equations. By introducing a dynamic equation for the auxiliary variable and reformulating the original equations into an equivalent system, we construct first- and second-order semi-implicit linearized schemes for the underlying problem. The main advantages of the proposed method are: (1) the schemes are unconditionally stable in the sense that a discrete energy decays throughout the time stepping; (2) the concentration components of the discrete solution preserve positivity and mass conservation; (3) a careful implementation shows that the proposed schemes can be realized very efficiently, with computational complexity close to that of a semi-implicit scheme. Some numerical examples are presented to demonstrate the accuracy and performance of the proposed method. To the best of our knowledge, this is the first second-order method that satisfies all of the above properties for the Navier-Stokes-Nernst-Planck-Poisson equations.
Federated learning (FL) is increasingly deployed among multiple clients (e.g., mobile devices) to train a shared model over decentralized data. To address privacy concerns, FL systems need to protect the clients' data from being revealed during training, and also control the data leakage through trained models when exposed to untrusted domains. Distributed differential privacy (DP) offers an appealing solution in this regard as it achieves an informed tradeoff between privacy and utility without a trusted server. However, existing distributed DP mechanisms become impractical in the presence of client dropout, resulting in either poor privacy guarantees or degraded training accuracy. In addition, these mechanisms also suffer from severe efficiency issues with long time-to-accuracy training performance. We present Hyades, a distributed differentially private FL framework that is highly efficient and resilient to client dropout. Specifically, we develop a novel 'add-then-remove' scheme where a required noise level can be enforced in each FL training round even if some sampled clients eventually drop out; therefore, the privacy budget is consumed carefully even in the presence of client dropout. To boost performance, Hyades runs as a distributed pipeline architecture by encapsulating the communication and computation operations into stages. It automatically divides the global model aggregation into several chunk-aggregation tasks and pipelines them for optimal speedup. Evaluation through large-scale cloud deployment shows that Hyades can efficiently handle client dropout in various realistic FL scenarios, attaining the optimal privacy-utility tradeoff and accelerating the training by up to 2.1$\times$ compared to existing solutions.
We analyze to what extent final users can infer information about the level of protection of their data when the data obfuscation mechanism is a priori unknown to them (the so-called "black-box" scenario). In particular, we delve into the investigation of two notions of local differential privacy (LDP), namely $\epsilon$-LDP and Rényi LDP. On one hand, we prove that, without any assumption on the underlying distributions, it is not possible to have an algorithm able to infer the level of data protection with provable guarantees; this result also holds for the central versions of the two notions of DP considered. On the other hand, we demonstrate that, under reasonable assumptions (namely, Lipschitzness of the involved densities on a closed interval), such guarantees exist and can be achieved by a simple histogram-based estimator. We validate our results experimentally and we note that, on a particularly well-behaved distribution (namely, the Laplace noise), our method gives even better results than expected, in the sense that in practice the number of samples needed to achieve the desired confidence is smaller than the theoretical bound, and the estimation of $\epsilon$ is more precise than predicted.
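One way a histogram-based estimate of the privacy level could look is sketched below. This is an illustrative assumption rather than the paper's estimator: it bins the outputs observed for two different inputs on a closed interval [low, high] and reports the largest absolute log-ratio of bin frequencies. The function names, the binning rule, and the choice to skip empty bins are all hypothetical.

```python
import math


def histogram(samples, bins, low, high):
    """Normalized bin frequencies on [low, high]; assumes at least one sample falls in range."""
    counts = [0] * bins
    width = (high - low) / bins
    for s in samples:
        if low <= s <= high:
            # Clamp so that s == high lands in the last bin.
            idx = min(int((s - low) / width), bins - 1)
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def estimate_epsilon(samples_a, samples_b, bins, low, high):
    """Largest |log(p_i / q_i)| over bins that are non-empty in both histograms."""
    p = histogram(samples_a, bins, low, high)
    q = histogram(samples_b, bins, low, high)
    ratios = [abs(math.log(pi / qi)) for pi, qi in zip(p, q) if pi > 0 and qi > 0]
    return max(ratios) if ratios else 0.0
```

With outputs `[0.1, 0.1, 0.1, 0.9]` for one input and `[0.1, 0.9, 0.9, 0.9]` for the other, two bins on [0, 1] give frequencies (0.75, 0.25) and (0.25, 0.75), so the estimate is log 3. Skipping empty bins is a simplification; with few samples it can underestimate the true level.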
Differential privacy is typically ensured by perturbation with additive noise that is sampled from a known distribution. Conventionally, independent and identically distributed (i.i.d.) noise samples are added to each coordinate. In this work, we propose to add noise that is independent, but not identically distributed (i.n.i.d.) across the coordinates. In particular, we study the i.n.i.d. Gaussian and Laplace mechanisms and obtain the conditions under which these mechanisms guarantee privacy. The optimal choice of parameters ensuring these conditions is derived theoretically. Theoretical analyses and numerical simulations show that the i.n.i.d. mechanisms achieve higher utility for the given privacy requirements compared to their i.i.d. counterparts.
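The difference between i.i.d. and i.n.i.d. noise can be illustrated with a short sketch of the Gaussian case. It only shows the sampling step, with one standard deviation per coordinate; the conditions on `sigmas` under which privacy holds are derived in the paper and are not reproduced here. The function name and arguments are assumptions made for illustration.

```python
import random


def inid_gaussian_mechanism(values, sigmas, rng=None):
    """Add independent Gaussian noise with a separate standard deviation per coordinate.

    The i.i.d. mechanism is the special case where all entries of `sigmas` are equal.
    Choosing `sigmas` so that a privacy guarantee holds is outside this sketch.
    """
    if len(values) != len(sigmas):
        raise ValueError("values and sigmas must have the same length")
    rng = rng or random.Random()
    return [v + rng.gauss(0.0, s) for v, s in zip(values, sigmas)]
```

For instance, `inid_gaussian_mechanism([1.0, 2.0, 3.0], [0.5, 1.0, 2.0])` perturbs the third coordinate most heavily, which is the extra degree of freedom the i.n.i.d. mechanisms exploit.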
Deformable image registration (DIR), aiming to find spatial correspondence between images, is one of the most critical problems in the domain of medical image analysis. In this paper, we present a novel, generic, and accurate diffeomorphic image registration framework that utilizes neural ordinary differential equations (NODEs). We model each voxel as a moving particle and consider the set of all voxels in a 3D image as a high-dimensional dynamical system whose trajectory determines the targeted deformation field. Our method leverages deep neural networks for their expressive power in modeling dynamical systems, and simultaneously optimizes for a dynamical system between the image pairs and the corresponding transformation. Our formulation allows various constraints to be imposed along the transformation to maintain desired regularities. Our experimental results show that our method outperforms the benchmarks under various metrics. Additionally, we demonstrate the feasibility of extending our framework to register multiple image sets using a unified form of transformation, which could possibly serve a wider range of applications.
We present an efficient quantum algorithm to simulate nonlinear differential equations with polynomial vector fields of arbitrary degree on quantum platforms. Models of physical systems that are governed by ordinary differential equations (ODEs) or partial differential equations (PDEs) can be challenging to solve on classical computers due to high dimensionality, stiffness, nonlinearities, and sensitive dependence on initial conditions. For sparse $n$-dimensional linear ODEs, quantum algorithms have been developed which can produce a quantum state proportional to the solution in poly(log(nx)) time using the quantum linear systems algorithm (QLSA). Recently, this framework was extended to systems of nonlinear ODEs with quadratic polynomial vector fields by applying Carleman linearization, which enables the embedding of the quadratic system into an approximate linear form. A detailed complexity analysis was conducted which showed significant computational advantage under certain conditions. We present an extension of this algorithm to deal with systems of nonlinear ODEs with $k$-th degree polynomial vector fields for arbitrary (finite) values of $k$. The steps involve: 1) mapping the $k$-th degree polynomial ODE to a higher dimensional quadratic polynomial ODE; 2) applying Carleman linearization to transform the quadratic ODE to an infinite-dimensional system of linear ODEs; 3) truncating and discretizing the linear ODE and solving using the forward Euler method and QLSA. Alternatively, one could apply Carleman linearization directly to the $k$-th degree polynomial ODE, resulting in a system of infinite-dimensional linear ODEs, and then apply step 3. This solution route can be computationally more efficient. We present a detailed complexity analysis of the proposed algorithms, prove polynomial scaling of the runtime in $k$, and demonstrate the framework on an example.
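Steps 2 and 3 above can be sketched classically for a scalar quadratic ODE dx/dt = a x + b x^2. Setting y_j = x^j gives the linear chain dy_j/dt = j a y_j + j b y_{j+1}, which is truncated at a finite order and advanced with forward Euler. This sketch replaces the QLSA solve with a plain classical loop, and the function names, parameters, and test problem are assumptions chosen for illustration.

```python
import math


def carleman_euler(a, b, x0, order, t_end, steps):
    """Approximate x(t_end) for dx/dt = a*x + b*x**2 via truncated Carleman linearization."""
    h = t_end / steps
    # y[j - 1] holds the Carleman variable y_j = x**j for j = 1..order.
    y = [x0 ** j for j in range(1, order + 1)]
    for _ in range(steps):
        new_y = []
        for j in range(1, order + 1):
            # Truncation: the variable y_{order + 1} is dropped (treated as zero).
            y_next = y[j] if j < order else 0.0
            dy = j * a * y[j - 1] + j * b * y_next
            new_y.append(y[j - 1] + h * dy)  # forward Euler step
        y = new_y
    return y[0]


def exact_solution(a, b, x0, t):
    """Closed-form solution of the same Bernoulli-type ODE, used for comparison."""
    e = math.exp(a * t)
    return a * x0 * e / (a + b * x0 * (1.0 - e))
```

With a = -1, b = 0.25, and x0 = 0.5 the nonlinearity is weak relative to the linear decay, so `carleman_euler(-1.0, 0.25, 0.5, 6, 1.0, 1000)` stays close to `exact_solution(-1.0, 0.25, 0.5, 1.0)`; stronger nonlinearity generally requires a higher truncation order or may prevent convergence.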
Privacy auditing techniques for differentially private (DP) algorithms are useful for estimating the privacy loss to compare against analytical bounds, or for empirically measuring privacy in settings where known analytical bounds on the DP loss are not tight. However, existing privacy auditing techniques usually make strong assumptions about the adversary (e.g., knowledge of intermediate model iterates or the training data distribution), are tailored to specific tasks and model architectures, and require retraining the model many times (typically on the order of thousands). These shortcomings make deploying such techniques at scale difficult in practice, especially in federated settings where model training can take days or weeks. In this work, we present a novel "one-shot" approach that can systematically address these challenges, allowing efficient auditing or estimation of the privacy loss of a model during the same, single training run used to fit model parameters. Our privacy auditing method for federated learning does not require a priori knowledge about the model architecture or task. We show that our method provides provably correct estimates for privacy loss under the Gaussian mechanism, and we demonstrate its performance on a well-established FL benchmark dataset under several adversarial models.
The conjoining of dynamical systems and deep learning has become a topic of great interest. In particular, neural differential equations (NDEs) demonstrate that neural networks and differential equations are two sides of the same coin. Traditional parameterised differential equations are a special case. Many popular neural network architectures, such as residual networks and recurrent networks, are discretisations. NDEs are suitable for tackling generative problems, dynamical systems, and time series (particularly in physics, finance, ...) and are thus of interest to both modern machine learning and traditional mathematical modelling. NDEs offer high-capacity function approximation, strong priors on model space, the ability to handle irregular data, memory efficiency, and a wealth of available theory on both sides. This doctoral thesis provides an in-depth survey of the field. Topics include: neural ordinary differential equations (e.g. for hybrid neural/mechanistic modelling of physical systems); neural controlled differential equations (e.g. for learning functions of irregular time series); and neural stochastic differential equations (e.g. to produce generative models capable of representing complex stochastic dynamics, or sampling from complex high-dimensional distributions). Further topics include: numerical methods for NDEs (e.g. reversible differential equation solvers, backpropagation through differential equations, Brownian reconstruction); symbolic regression for dynamical systems (e.g. via regularised evolution); and deep implicit models (e.g. deep equilibrium models, differentiable optimisation). We anticipate this thesis will be of interest to anyone interested in the marriage of deep learning with dynamical systems, and hope it will provide a useful reference for the current state of the art.
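The remark that residual networks are discretisations can be illustrated with a small sketch: a residual update x + h f(x) is one explicit Euler step of dx/dt = f(x), so stacking such blocks integrates the ODE. Here the learned layer is replaced by a fixed, hypothetical vector field f(x) = -x so that the result can be compared with the known solution exp(-t); the function names are assumptions for illustration.

```python
import math


def vector_field(x):
    """Stand-in for a learned layer; f(x) = -x is chosen so the ODE has a known solution."""
    return -x


def residual_block(x, h):
    """One residual update, identical to an explicit Euler step of dx/dt = f(x)."""
    return x + h * vector_field(x)


def residual_network(x, depth, t_end=1.0):
    """Stack `depth` residual blocks with step size t_end / depth."""
    h = t_end / depth
    for _ in range(depth):
        x = residual_block(x, h)
    return x
```

Calling `residual_network(1.0, 100)` approximates exp(-1), and increasing `depth` tightens the approximation, which mirrors taking the continuous-depth limit that leads to a neural ODE.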