To address the privacy leakage problem in decentralized composite convex optimization, we propose a novel differentially private decentralized primal--dual algorithm named DP-RECAL, which combines an operator splitting method with a relay communication mechanism. We study the relationship between communication and privacy leakage and, on that basis, define a new measure: local communication involvement (LCI). To the best of our knowledge, DP-RECAL is the first differentially private algorithm to exploit the relay communication mechanism to incur less LCI and thereby reduce the overall privacy budget. In addition, we prove that DP-RECAL converges with uncoordinated network-independent stepsizes and establish its linear convergence rate under metric subregularity. Furthermore, taking the least squares problem as an example, DP-RECAL achieves better privacy performance and communication complexity than the listed differentially private decentralized algorithms. Numerical experiments on real-world datasets verify our analysis and demonstrate that DP-RECAL can defend against deep leakage from gradients (DLG) attacks.
Reconfigurable intelligent surface (RIS) devices have emerged as an effective way to control the propagation channels for enhancing the end users' performance. However, RIS optimization involves configuring the radio frequency (RF) response of a large number of radiating elements, which is challenging in real-world applications due to high computational complexity. In this paper, a model-free cross-entropy (CE) algorithm is proposed to optimize the binary RIS configuration for improving the signal-to-noise ratio (SNR) at the receiver. One key advantage of the proposed method is that it only needs system performance parameters, e.g., the received SNR, without the need for channel models or channel estimation. Both simulations and experiments are conducted to evaluate the performance of the proposed CE algorithm. The results demonstrate that the CE algorithm outperforms benchmark algorithms, and shows stronger channel hardening with increasing numbers of RIS elements.
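The model-free search described in the abstract above can be illustrated with a generic cross-entropy method over binary vectors. This is a minimal sketch, not the authors' implementation: the function name `cross_entropy_binary`, the hyperparameters (`n_samples`, `elite_frac`, `smoothing`, `n_iters`), and the toy objective `toy_snr` are all assumptions made for illustration. The only thing the optimizer needs is a black-box `score` callable, which stands in for a measured received SNR.

```python
import random

def cross_entropy_binary(score, n_elements, n_samples=50, elite_frac=0.2,
                         smoothing=0.7, n_iters=30, seed=0):
    """Generic cross-entropy search over binary configurations.

    `score` is a black-box callable (hypothetical stand-in for a measured SNR);
    no channel model is used anywhere in the search.
    """
    rng = random.Random(seed)
    probs = [0.5] * n_elements              # Bernoulli parameter per element
    n_elite = max(1, int(elite_frac * n_samples))
    best_cfg, best_val = None, float("-inf")
    for _ in range(n_iters):
        # Sample candidate configurations from the current distribution.
        samples = [[1 if rng.random() < p else 0 for p in probs]
                   for _ in range(n_samples)]
        scored = sorted(((score(s), s) for s in samples),
                        key=lambda t: t[0], reverse=True)
        if scored[0][0] > best_val:
            best_val, best_cfg = scored[0]
        # Move the distribution toward the elite (top-scoring) samples.
        elites = [s for _, s in scored[:n_elite]]
        for j in range(n_elements):
            elite_mean = sum(e[j] for e in elites) / n_elite
            probs[j] = smoothing * elite_mean + (1 - smoothing) * probs[j]
    return best_cfg, best_val

# Toy objective (assumption): count of elements matching a hidden pattern.
HIDDEN = [1, 0] * 8

def toy_snr(cfg):
    return sum(a == b for a, b in zip(cfg, HIDDEN))

cfg, val = cross_entropy_binary(toy_snr, n_elements=len(HIDDEN))
print(cfg, val)
```

In a real deployment, `score` would apply the configuration to the hardware and return the measured performance figure, so each call is expensive; the sample count and iteration count then trade measurement time against solution quality.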
We consider a federated data analytics problem in which a server coordinates the collaborative data analysis of multiple users with privacy concerns and limited communication capability. The commonly adopted compression schemes introduce information loss into local data while improving communication efficiency, and it remains an open question whether such discrete-valued mechanisms provide any privacy protection. Because differential privacy has become the gold standard for privacy measures owing to its simple implementation and rigorous theoretical foundation, in this paper we study the privacy guarantees of discrete-valued mechanisms with finite output space through the lens of $f$-differential privacy (DP). By interpreting the privacy leakage as a hypothesis testing problem, we derive a closed-form expression for the tradeoff between type I and type II error rates, based on which we characterize the $f$-DP guarantees of a variety of discrete-valued mechanisms, including binomial mechanisms, sign-based methods, and ternary-based compressors. We further investigate the Byzantine resilience of binomial mechanisms and ternary compressors and characterize the tradeoff among differential privacy, Byzantine resilience, and communication efficiency. Finally, we discuss the application of the proposed method to differentially private stochastic gradient descent in federated learning.
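The hypothesis-testing view in the abstract above can be made concrete for any mechanism with a finite output space. The sketch below evaluates the tradeoff between type I and type II error numerically using the Neyman-Pearson lemma; it is not the closed-form expression derived in the paper. The helper names `binomial_pmf` and `tradeoff`, and the choice of two binomial output distributions as the mechanism outputs on neighboring inputs, are assumptions for illustration.

```python
from math import comb

def binomial_pmf(n, p):
    """Probability mass function of Binomial(n, p) on outcomes 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def tradeoff(P, Q, alpha):
    """Smallest type II error of any test of H0: P vs H1: Q
    whose type I error is at most alpha (Neyman-Pearson, randomized)."""
    # Reject outcomes in order of decreasing likelihood ratio Q/P.
    order = sorted(range(len(P)),
                   key=lambda i: Q[i] / P[i] if P[i] > 0 else float("inf"),
                   reverse=True)
    power, budget = 0.0, alpha
    for i in order:
        if P[i] <= budget:
            # Reject this outcome with probability 1.
            power += Q[i]
            budget -= P[i]
        else:
            # Randomize on the boundary outcome to spend the remaining budget.
            power += Q[i] * budget / P[i]
            break
    return 1.0 - power

# Assumed example: outputs on two neighboring inputs are Binomial(20, 0.5)
# and Binomial(20, 0.6).
P = binomial_pmf(20, 0.5)
Q = binomial_pmf(20, 0.6)
for a in (0.01, 0.05, 0.2, 0.5):
    print(a, tradeoff(P, Q, a))
```

A curve close to the line $1 - \alpha$ means the two output distributions are hard to distinguish, which corresponds to a stronger privacy guarantee; identical distributions give exactly $1 - \alpha$.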
Data collection is indispensable for spatial crowdsourcing services, such as resource allocation, policymaking, and scientific explorations. However, privacy issues make it challenging for users to share their information unless they receive sufficient compensation. Differential Privacy (DP) is a promising mechanism to release helpful information while protecting individuals' privacy. However, most DP mechanisms only consider a fixed compensation for each user's privacy loss. In this paper, we design a task assignment scheme that allows workers to dynamically improve their utility with dynamic distance privacy leakage. Specifically, we propose two solutions to improve the total utility of task assignment results, namely the Private Utility Conflict-Elimination (PUCE) approach and the Private Game Theory (PGT) approach. We prove that PUCE achieves higher utility than state-of-the-art works. We demonstrate the efficiency and effectiveness of our PUCE and PGT approaches on both real and synthetic data sets in comparison with the recent distance-based approach, Private Distance Conflict-Elimination (PDCE). PUCE is consistently slightly better than PDCE. PGT is 50% to 63% faster than PDCE and improves utility by 16% on average when the worker range is large enough.
Minimax optimization over Riemannian manifolds (possibly nonconvex constraints) has been actively applied to solve many problems, such as robust dimensionality reduction and deep neural networks with orthogonal weights (Stiefel manifold). Although many optimization algorithms for minimax problems have been developed in the Euclidean setting, it is difficult to convert them to the Riemannian case, and algorithms for nonconvex minimax problems with nonconvex constraints are even rarer. On the other hand, to address big data challenges, decentralized (serverless) training techniques have recently emerged, since they can reduce communication overhead and avoid the bottleneck problem on the server node. Nonetheless, algorithms for decentralized Riemannian minimax problems have not been studied. In this paper, we study the distributed nonconvex-strongly-concave minimax optimization problem over the Stiefel manifold and propose both deterministic and stochastic minimax methods. The Stiefel manifold is a nonconvex set. The global function is represented as the finite sum of local functions. For the deterministic setting, we propose DRGDA and prove that it achieves a gradient complexity of $O(\epsilon^{-2})$ under mild conditions. For the stochastic setting, we propose DRSGDA and prove that it achieves a gradient complexity of $O(\epsilon^{-4})$. DRGDA and DRSGDA are the first algorithms for distributed minimax optimization with nonconvex constraints that achieve exact convergence. Extensive experimental results on training Deep Neural Networks (DNNs) over the Stiefel manifold demonstrate the efficiency of our algorithms.
Bayesian neural networks (BNNs) allow for uncertainty quantification in prediction, offering an advantage over regular neural networks that has not been explored in the differential privacy (DP) framework. We fill this important gap by leveraging recent developments in Bayesian deep learning and privacy accounting to offer a more precise analysis of the trade-off between privacy and accuracy in BNNs. We propose three DP-BNNs that characterize the weight uncertainty for the same network architecture in distinct ways, namely DP-SGLD (via the noisy gradient method), DP-BBP (via changing the parameters of interest) and DP-MC Dropout (via the model architecture). Interestingly, we show a new equivalence between DP-SGD and DP-SGLD, implying that some non-Bayesian DP training naturally allows for uncertainty quantification. However, hyperparameters such as the learning rate and batch size can have different or even opposite effects in DP-SGD and DP-SGLD. Extensive experiments are conducted to compare DP-BNNs in terms of privacy guarantee, prediction accuracy, uncertainty quantification, calibration, computation speed, and generalizability to network architecture. As a result, we observe a new trade-off between privacy and reliability. When compared to non-DP and non-Bayesian approaches, DP-SGLD is remarkably accurate under strong privacy guarantees, demonstrating the great potential of DP-BNNs in real-world tasks.
Federated Learning (FL) is a nascent decentralized learning framework under which a massive collection of heterogeneous clients collaboratively train a model without revealing their local data. Scarce communication, privacy leakage, and Byzantine attacks are the key bottlenecks of system scalability. In this paper, we focus on communication-efficient distributed (stochastic) gradient descent for non-convex optimization, a driving force of FL. We propose two algorithms, named {\em Adaptive Stochastic Sign SGD (Ada-StoSign)} and {\em $\beta$-Stochastic Sign SGD ($\beta$-StoSign)}, each of which compresses the local gradients into bit vectors. To handle unbounded gradients, Ada-StoSign uses a novel norm tracking function that adaptively adjusts a coarse estimate of the $\ell_{\infty}$ norm of the local gradients, a key parameter used in gradient compression. We show that Ada-StoSign converges in expectation at a rate of $O(\log T/\sqrt{T} + 1/\sqrt{M})$, where $M$ is the number of clients. To the best of our knowledge, when $M$ is sufficiently large, Ada-StoSign outperforms the state-of-the-art sign-based method, whose convergence rate is $O(T^{-1/4})$. Under a bounded gradient assumption, $\beta$-StoSign achieves quantifiable Byzantine resilience and privacy assurances, and works with partial client participation and mini-batch gradients, which could be unbounded. We corroborate and complement our theories with experiments on the MNIST and CIFAR-10 datasets.
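To illustrate what compressing local gradients into bit vectors can look like, the sketch below implements a generic unbiased stochastic sign compressor. It is not Ada-StoSign or $\beta$-StoSign as defined in the paper: the functions `stochastic_sign` and `aggregate`, the clipping step, and the fixed `bound` on the $\ell_{\infty}$ norm are assumptions chosen for illustration. Each coordinate $g$ is mapped to $+1$ with probability $1/2 + g/(2B)$ and to $-1$ otherwise, so that $B$ times the transmitted sign has expectation $g$ whenever $|g| \le B$.

```python
import random

def stochastic_sign(grad, bound, rng):
    """Compress a gradient to a vector of +1/-1 values (one bit per coordinate).

    `bound` is an assumed upper bound B on the l_infinity norm of `grad`;
    coordinates outside [-B, B] are clipped, which introduces bias.
    """
    bits = []
    for g in grad:
        g = max(-bound, min(bound, g))
        p_plus = 0.5 + g / (2.0 * bound)   # probability of sending +1
        bits.append(1 if rng.random() < p_plus else -1)
    return bits

def aggregate(bit_vectors, bound):
    """Server-side estimate: rescaled coordinate-wise mean of client bits."""
    m = len(bit_vectors)
    return [bound * sum(col) / m for col in zip(*bit_vectors)]

rng = random.Random(0)
true_grad = [0.3, -0.7, 0.0]
clients = [stochastic_sign(true_grad, bound=1.0, rng=rng) for _ in range(2000)]
print(aggregate(clients, bound=1.0))  # close to true_grad for many clients
```

The estimate's variance shrinks as the number of clients grows, which matches the intuition behind a convergence term that decays with the number of clients; a loose `bound` inflates the variance, which is why estimating the gradient norm well matters for this family of methods.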
Machine learning (ML) can help fight pandemics like COVID-19 by enabling rapid screening of large volumes of images. To perform data analysis while maintaining patient privacy, we create ML models that satisfy Differential Privacy (DP). Previous works exploring private COVID-19 models are in part based on small datasets, provide weaker or unclear privacy guarantees, and do not investigate practical privacy. We suggest improvements to address these open gaps. We account for inherent class imbalances and evaluate the utility-privacy trade-off more extensively and over stricter privacy budgets. Our evaluation is supported by empirically estimating practical privacy through black-box Membership Inference Attacks (MIAs). The introduced DP should help limit leakage threats posed by MIAs, and our practical analysis is the first to test this hypothesis on the COVID-19 classification task. Our results indicate that needed privacy levels might differ based on the task-dependent practical threat from MIAs. The results further suggest that with increasing DP guarantees, empirical privacy leakage only improves marginally, and DP therefore appears to have a limited impact on practical MIA defense. Our findings identify possibilities for better utility-privacy trade-offs, and we believe that empirical attack-specific privacy estimation can play a vital role in tuning for practical privacy.
Federated Learning (FL) is a novel distributed machine learning approach that leverages data from Internet of Things (IoT) devices while maintaining data privacy. However, current FL algorithms face the challenge of non-independent and identically distributed (non-IID) data, which causes high communication costs and declines in model accuracy. To address the statistical imbalances in FL, we propose a clustered data sharing framework which shares partial data from cluster heads with credible associates through device-to-device (D2D) communication. Moreover, aiming to dilute the data skew on nodes, we formulate the joint clustering and data sharing problem based on a privacy-preserving constrained graph. To tackle the strong coupling of decisions on the graph, we devise a distribution-based adaptive clustering algorithm (DACA) based on three deductive cluster-forming conditions, which ensures the maximum yield of data sharing. The experiments show that the proposed framework facilitates FL on non-IID datasets with better convergence and model accuracy in a communication-limited environment.
Continuous surveillance of a spatial region using distributed robots and sensors is a well-studied application in the area of multi-agent systems. This paper investigates a practically relevant scenario where robotic sensors are introduced asynchronously and inter-robot communication is discrete, event-driven, local and asynchronous. Furthermore, we work with lazy robots; i.e., the robots seek to minimize their area of responsibility by equipartitioning the domain to be covered. We adapt a well-known algorithm which is practicable and known to generally work well for coverage problems. For a specially chosen geometry of the spatial domain, we show that there exists a non-trivial sequence of inter-robot communication events which leads to an instantaneous loss of coverage when the number of robots exceeds a certain threshold. The same sequence of events preserves coverage and, further, leads to an equipartition of the domain when the number of robots is smaller than the threshold. This result demonstrates that coverage guarantees for a given algorithm might be sensitive to the number of robots and, therefore, may not scale in obvious ways. It also suggests that when such algorithms are to be verified and validated prior to field deployment, the number of robots or sensors used in test scenarios should match the number deployed in the field.
Effective multi-robot teams require the ability to move to goals in complex environments in order to address real-world applications such as search and rescue. Multi-robot teams should be able to operate in a completely decentralized manner, with individual robot team members being capable of acting without explicit communication between neighbors. In this paper, we propose a novel game theoretic model that enables decentralized and communication-free navigation to a goal position. Robots each play their own distributed game by estimating the behavior of their local teammates in order to identify behaviors that move them in the direction of the goal, while also avoiding obstacles and maintaining team cohesion without collisions. We prove theoretically that generated actions approach a Nash equilibrium, which also corresponds to an optimal strategy identified for each robot. We show through extensive simulations that our approach enables decentralized and communication-free navigation by a multi-robot system to a goal position, and is able to avoid obstacles and collisions, maintain connectivity, and respond robustly to sensor noise.