
Differential privacy is a strong mathematical notion of privacy. Still, a prominent challenge when using differential privacy in real data collection is understanding and counteracting the accuracy loss that differential privacy imposes. As such, the accuracy/privacy trade-off of differential privacy needs to be balanced on a case-by-case basis. Applications in the literature tend to focus solely on analytical accuracy bounds, exclude the data itself from error prediction, or measure error empirically under arbitrary settings. To fill this gap in the literature, we propose a novel application of factor experiments to create data-aware error predictions. In essence, factor experiments provide a systematic approach to conducting empirical experiments. To demonstrate our methodology in action, we conduct a case study where error depends on arbitrarily complex tree structures. We first construct a tool to simulate poll data. Next, we use our simulated data to fit a least-squares model that predicts error. Last, we show how to validate the model. Consequently, our contribution is a method for constructing data-aware error prediction models.
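
As a rough illustration of the proposed workflow, the sketch below runs a two-level full factorial design over three assumed factors (epsilon, tree depth, number of respondents), measures a simulated error for each factor combination, and fits an ordinary least-squares model as the data-aware error predictor. The factor names, levels, and the simulated error function are hypothetical stand-ins for the paper's poll simulator and DP mechanism.

```python
# Minimal sketch of a factor experiment feeding a least-squares error model.
# Factors, levels, and the simulated error function are illustrative assumptions.
import itertools
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Two-level full factorial design over three assumed factors.
levels = {
    "epsilon": [0.5, 2.0],
    "tree_depth": [2, 5],
    "n_respondents": [100, 1000],
}
design = list(itertools.product(*levels.values()))

def simulated_error(eps, depth, n):
    # Placeholder for measuring the empirical error of a DP release on
    # simulated poll data; here we just mimic a plausible trend plus noise.
    return depth / (eps * np.sqrt(n)) + rng.normal(scale=0.01)

X = np.array(design, dtype=float)
y = np.array([simulated_error(*row) for row in design])

# Ordinary least squares over the factors gives a data-aware error predictor.
model = LinearRegression().fit(X, y)
print(dict(zip(levels, model.coef_)), model.intercept_)
```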

Related Content

Machine learning systems design · System evaluation criteria

The comparison of multivariate population means is a central task of statistical inference. While statistical theory provides a variety of analysis tools, they usually do not protect individuals' privacy. This knowledge can create incentives for participants in a study to conceal their true data (especially for outliers), which might result in a distorted analysis. In this paper we address this problem by developing a hypothesis test for multivariate mean comparisons that guarantees differential privacy to users. The test statistic is based on the popular Hotelling's $T^2$-statistic, which has a natural interpretation in terms of the Mahalanobis distance. To control the type-I error, we present a bootstrap algorithm under differential privacy that provably yields a reliable test decision. In an empirical study we demonstrate the applicability of this approach.
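
For reference, a minimal sketch of the (non-private) two-sample Hotelling's $T^2$ statistic that the test builds on is shown below; the differentially private noise addition and the bootstrap calibration of the critical value described in the abstract are not reproduced here, and the sample sizes and dimension are arbitrary.

```python
# Non-private two-sample Hotelling's T^2 statistic; the DP mechanism and the
# private bootstrap from the abstract are intentionally omitted from this sketch.
import numpy as np

def hotelling_t2(x, y):
    """x, y: samples of shape (n1, d) and (n2, d)."""
    n1, n2 = len(x), len(y)
    diff = x.mean(axis=0) - y.mean(axis=0)
    # Pooled covariance estimate.
    s_pooled = ((n1 - 1) * np.cov(x, rowvar=False) +
                (n2 - 1) * np.cov(y, rowvar=False)) / (n1 + n2 - 2)
    # Squared Mahalanobis distance between the sample means, scaled by sample sizes.
    return (n1 * n2 / (n1 + n2)) * diff @ np.linalg.solve(s_pooled, diff)

rng = np.random.default_rng(1)
print(hotelling_t2(rng.normal(size=(50, 3)), rng.normal(0.3, size=(60, 3))))
```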

Motivated by the growing availability of personal genomics services, we study an information-theoretic privacy problem that arises when sharing genomic data: a user wants to share his or her genome sequence while keeping the genotypes at certain positions hidden, which could otherwise reveal critical health-related information. A straightforward solution of erasing (masking) the chosen genotypes does not ensure privacy, because the correlation between nearby positions can leak the masked genotypes. We introduce an erasure-based privacy mechanism with perfect information-theoretic privacy, whereby the released sequence is statistically independent of the sensitive genotypes. Our mechanism can be interpreted as a locally-optimal greedy algorithm for a given processing order of sequence positions, where utility is measured by the number of positions released without erasure. We show that finding an optimal order is NP-hard in general and provide an upper bound on the optimal utility. For sequences from hidden Markov models, a standard modeling approach in genetics, we propose an efficient algorithmic implementation of our mechanism with complexity polynomial in sequence length. Moreover, we illustrate the robustness of the mechanism by bounding the privacy leakage from erroneous prior distributions. Our work is a step towards more rigorous control of privacy in genomic data sharing.
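
The toy sketch below illustrates the greedy idea in a brute-force way: for a fixed processing order, a position is released un-erased only if the joint distribution of all released positions remains statistically independent of the sensitive genotypes. The chain length, transition matrix, block structure, and sensitive set are illustrative assumptions; the paper's mechanism additionally handles hidden Markov models and erroneous priors, which this sketch does not.

```python
# Brute-force toy version of the greedy release rule over a tiny binary sequence.
# The distribution is two independent Markov blocks {0,1} and {2,3,4}; position 2
# is sensitive, so 0 and 1 can be released but 3 and 4 must be erased.
import itertools
import numpy as np

n, sensitive = 5, {2}
p0 = np.array([0.6, 0.4])          # initial distribution over {0, 1}
T = np.array([[0.8, 0.2],          # first-order Markov transition matrix
              [0.3, 0.7]])

def chain(seq):
    p = p0[seq[0]]
    for a, b in zip(seq, seq[1:]):
        p *= T[a, b]
    return p

def joint(seq):
    return chain(seq[:2]) * chain(seq[2:])   # two independent blocks

seqs = list(itertools.product([0, 1], repeat=n))
probs = np.array([joint(s) for s in seqs])

def independent(released):
    # Check that the joint of the released positions factorizes from the sensitive ones.
    keys_r = [tuple(s[i] for i in released) for s in seqs]
    keys_s = [tuple(s[i] for i in sensitive) for s in seqs]
    for r in set(keys_r):
        for t in set(keys_s):
            p_rt = probs[[kr == r and ks == t for kr, ks in zip(keys_r, keys_s)]].sum()
            p_r = probs[[kr == r for kr in keys_r]].sum()
            p_t = probs[[ks == t for ks in keys_s]].sum()
            if not np.isclose(p_rt, p_r * p_t):
                return False
    return True

released = []
for i in range(n):                               # greedy pass in index order
    if i not in sensitive and independent(released + [i]):
        released.append(i)
print("released without erasure:", released)
```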

Differential privacy (DP) techniques can be applied to the federated learning model to protect data privacy against inference attacks on the communication among the learning agents. DP techniques, however, hinder achieving greater learning performance while ensuring strong data privacy. In this paper we develop a DP inexact alternating direction method of multipliers (ADMM) algorithm that solves a sequence of subproblems whose objectives are perturbed by random noise drawn from a Laplace distribution. We show that our algorithm provides $\bar{\epsilon}$-DP for every iteration, where $\bar{\epsilon}$ is a privacy parameter controlled by the user. Using the MNIST and FEMNIST datasets for image classification, we demonstrate that our algorithm reduces the testing error by up to $22\%$ compared with the existing DP algorithm, while achieving the same level of data privacy. The numerical experiments also show that our algorithm converges faster than the existing algorithm.
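
A schematic sketch of the core privacy ingredient, per-iteration objective perturbation with Laplace noise inside an inexactly solved subproblem, is given below. The toy least-squares loss, noise scale, and step counts are illustrative assumptions; the full inexact ADMM updates, sensitivity calibration, and federated orchestration are omitted.

```python
# Sketch of per-iteration Laplace objective perturbation; not the authors' exact algorithm.
import numpy as np

rng = np.random.default_rng(0)
d, lam_noise, lr, inner_steps = 10, 0.1, 0.05, 5   # lam_noise stands in for the scale implied by epsilon-bar

def local_grad(w, A, b):
    # Gradient of a toy local least-squares loss ||A w - b||^2 / n.
    return 2 * A.T @ (A @ w - b) / len(b)

def perturbed_subproblem_step(w, A, b, rho, w_global, dual):
    # Laplace noise enters the subproblem as a linear term <xi, w>,
    # so its gradient contribution is simply xi.
    xi = rng.laplace(scale=lam_noise, size=d)
    for _ in range(inner_steps):                    # inexact (few-step) solve
        g = local_grad(w, A, b) + xi + rho * (w - w_global + dual)
        w = w - lr * g
    return w

A, b = rng.normal(size=(100, d)), rng.normal(size=100)
w = perturbed_subproblem_step(np.zeros(d), A, b, rho=1.0,
                              w_global=np.zeros(d), dual=np.zeros(d))
print(np.round(w, 3))
```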

Federated learning is used for decentralized training of machine learning models on a large number (millions) of edge mobile devices. It is challenging because mobile devices often have limited communication bandwidth and local computation resources. Therefore, improving the efficiency of federated learning is critical for scalability and usability. In this paper, we propose to leverage partially trainable neural networks, which freeze a portion of the model parameters during the entire training process, to reduce the communication cost with little impact on model performance. Through extensive experiments, we empirically show that Federated learning of Partially Trainable neural networks (FedPT) can result in superior communication-accuracy trade-offs, with up to a $46\times$ reduction in communication cost, at a small accuracy cost. Our approach also enables faster training, a smaller memory footprint, and better utility under strong differential privacy guarantees. The proposed FedPT method can be particularly useful for pushing the limits of overparameterization in on-device learning.
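
A minimal sketch of the "partially trainable" idea follows: freeze a fixed subset of parameters so that only the remainder is optimized locally and exchanged with the server. The layer sizes and the choice of which block to freeze are illustrative assumptions rather than the FedPT recipe.

```python
# Freeze a portion of the model so only the remaining parameters are trained and communicated.
import torch
from torch import nn

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(),
                      nn.Linear(256, 256), nn.ReLU(),
                      nn.Linear(256, 10))

# Freeze the middle block; it keeps its initial weights for all training rounds.
for p in model[2].parameters():
    p.requires_grad = False

trainable = [p for p in model.parameters() if p.requires_grad]
print(sum(p.numel() for p in trainable), "trainable parameters out of",
      sum(p.numel() for p in model.parameters()))

# Only `trainable` needs to be optimized locally and exchanged with the server.
optimizer = torch.optim.SGD(trainable, lr=0.1)
```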

We develop a method, called EnbPI, to construct distribution-free prediction intervals for dynamic time series; it wraps around any bootstrap ensemble estimator to construct sequential prediction intervals. EnbPI is closely related to the conformal prediction (CP) framework but does not require data exchangeability. Theoretically, these intervals attain asymptotically valid marginal coverage for broad classes of regression functions and time series with certain dependencies. The intervals also converge asymptotically in width to the oracle lower bound. Computationally, EnbPI avoids overfitting and requires neither data splitting nor training multiple ensemble estimators; it efficiently aggregates bootstrap estimators that have already been trained. In general, EnbPI is easy to implement, scalable to producing arbitrarily many prediction intervals sequentially, and well suited to a wide range of regression functions. We perform extensive simulation and real-data analyses to demonstrate its effectiveness and applicability beyond predictive inference.
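
The rough sketch below captures the flavor of this approach on a toy AR(1) series: train bootstrap regressors once, use out-of-bag residuals to set the interval width, and slide the residual set forward as new observations arrive. The aggregation and quantile choices follow the abstract loosely and are illustrative assumptions, not the exact algorithm.

```python
# Rough EnbPI-style sketch: bootstrap regressors + out-of-bag residuals + sliding residual window.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
alpha, B, n_train = 0.1, 25, 200

# Toy AR(1) series; predict y[t] from y[t-1].
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.8 * y[t - 1] + rng.normal(scale=0.5)
X, Y = y[:-1].reshape(-1, 1), y[1:]

boots, bags = [], []
for _ in range(B):
    idx = rng.integers(0, n_train, n_train)          # bootstrap resample
    boots.append(LinearRegression().fit(X[idx], Y[idx]))
    bags.append(set(idx.tolist()))

def agg_predict(x, skip_bag_of=None):
    # Average bootstrap predictions, optionally skipping bags containing index `skip_bag_of`.
    preds = [m.predict(x)[0] for m, bag in zip(boots, bags)
             if skip_bag_of is None or skip_bag_of not in bag]
    return np.mean(preds) if preds else np.mean([m.predict(x)[0] for m in boots])

# Out-of-bag residuals on the training part.
residuals = [abs(Y[i] - agg_predict(X[i:i + 1], skip_bag_of=i)) for i in range(n_train)]

# Sequential intervals on the rest, sliding the residual window forward.
for i in range(n_train, len(Y)):
    center = agg_predict(X[i:i + 1])
    width = np.quantile(residuals, 1 - alpha)
    lo, hi = center - width, center + width
    residuals = residuals[1:] + [abs(Y[i] - center)]
print("last interval:", (round(float(lo), 2), round(float(hi), 2)))
```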

Knowledge graph embedding plays an important role in knowledge representation, reasoning, and data mining applications. However, for multiple cross-domain knowledge graphs, state-of-the-art embedding models cannot make full use of the data from different knowledge domains while preserving the privacy of exchanged data. In addition, centralized embedding models may not scale to extensive real-world knowledge graphs. Therefore, we propose a novel decentralized scalable learning framework, \emph{Federated Knowledge Graphs Embedding} (FKGE), where embeddings from different knowledge graphs can be learned in an asynchronous and peer-to-peer manner while remaining privacy-preserving. FKGE exploits adversarial generation between pairs of knowledge graphs to translate identical entities and relations of different domains into nearby embedding spaces. To protect the privacy of the training data, FKGE further implements a privacy-preserving neural network structure to guarantee no leakage of raw data. We conduct extensive experiments to evaluate FKGE on 11 knowledge graphs, demonstrating a significant and consistent improvement in model quality, with up to 17.85\% and 7.90\% gains on the triple classification and link prediction tasks, respectively.
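
The sketch below gives a very rough picture of the adversarial-alignment idea: a generator translates one graph's embeddings of shared entities toward the other graph's space while a discriminator tries to tell translated embeddings from native ones, so only embeddings, never raw triples, cross the boundary. The dimensions, losses, and training schedule are illustrative assumptions, not FKGE's actual architecture.

```python
# Toy adversarial alignment of two embedding spaces; stand-in for the idea, not FKGE itself.
import torch
from torch import nn

d, n_shared = 64, 128
emb_a = torch.randn(n_shared, d)          # stand-ins for aligned entity embeddings
emb_b = torch.randn(n_shared, d)          # from two different knowledge domains

gen = nn.Linear(d, d)                     # translates space A -> space B
disc = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, 1))
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(200):
    fake = gen(emb_a)
    # Discriminator: real embeddings from B vs. translated embeddings from A.
    d_loss = bce(disc(emb_b), torch.ones(n_shared, 1)) + \
             bce(disc(fake.detach()), torch.zeros(n_shared, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: fool the discriminator so the two spaces become indistinguishable.
    g_loss = bce(disc(gen(emb_a)), torch.ones(n_shared, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
print(float(d_loss), float(g_loss))
```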

Death has long been overlooked in evolutionary algorithms. Recent research has shown that death (when applied properly) can benefit the overall fitness of a population and can outperform sub-sections of a population that are "immortal" when allowed to evolve together in an environment [1]. In this paper, we strive to experimentally determine whether death is an adapted trait and whether this adaptation can be used to enhance our implementations of conventional genetic algorithms. Using some of the most widely accepted evolutionary death and aging theories, we observed that senescent death (in various forms) can lower the total run-time of genetic algorithms, increase the optimality of a solution, and decrease the variance in an algorithm's performance. We believe that death-enhanced genetic algorithms can accomplish this through their unique ability to backtrack out of and/or avoid getting trapped in local optima altogether.
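
A toy sketch of adding senescent death to a plain genetic algorithm is shown below: each individual carries an age and is removed once it exceeds a fixed lifespan, and the vacated slots are filled by offspring of tournament-selected parents. The one-max fitness function and all hyperparameters are illustrative assumptions.

```python
# Minimal genetic algorithm with senescent death: individuals die past a lifespan
# and are replaced by offspring, creating turnover that can help escape local optima.
import numpy as np

rng = np.random.default_rng(0)
n_bits, pop_size, max_age, generations = 30, 40, 5, 120

pop = rng.integers(0, 2, size=(pop_size, n_bits))
age = rng.integers(0, max_age, size=pop_size)          # staggered starting ages
fitness = lambda P: P.sum(axis=1)                      # one-max toy objective

def make_child(P, f):
    def tournament():
        i, j = rng.integers(0, pop_size, 2)
        return P[i] if f[i] >= f[j] else P[j]
    p1, p2 = tournament(), tournament()
    cut = rng.integers(1, n_bits)                      # one-point crossover
    c = np.concatenate([p1[:cut], p2[cut:]])
    flip = rng.random(n_bits) < 0.02                   # bit-flip mutation
    return np.where(flip, 1 - c, c)

for _ in range(generations):
    f = fitness(pop)
    age += 1
    dead = np.where(age > max_age)[0]                  # senescent death
    children = [make_child(pop, f) for _ in dead]      # vacancies filled by offspring
    for k, c in zip(dead, children):
        pop[k], age[k] = c, 0
print("best fitness:", int(fitness(pop).max()))
```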

We study the problem of differentially private clustering under input-stability assumptions. Despite the ever-growing volume of works on differential privacy in general and differentially private clustering in particular, only three works (Nissim et al. 2007, Wang et al. 2015, Huang et al. 2018) looked at the problem of privately clustering "nice" k-means instances, all three relying on the sample-and-aggregate framework and all three measuring utility in terms of the Wasserstein distance between the true cluster centers and the centers returned by the private algorithm. In this work we improve upon this line of work on multiple axes. We present a far simpler algorithm for clustering stable inputs (not relying on the sample-and-aggregate framework), and analyze its utility in both the Wasserstein distance and the k-means cost. Moreover, our algorithm has straightforward analogues for "nice" k-median instances and for the local model of differential privacy.
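
For context only, the sketch below shows a standard building block rather than the paper's algorithm: given a fixed (non-private) partition of bounded data, it releases cluster centers by adding Laplace noise to per-cluster sums and counts. The data bounds, number of clusters, and budget split are illustrative assumptions, and a real private algorithm would also need to privatize the partitioning step itself.

```python
# Noisy-center release for a fixed partition of bounded data; illustrative baseline only.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
eps, k, bound = 1.0, 3, 1.0                      # points assumed to lie in [-bound, bound]^d

X = np.clip(np.vstack([rng.normal(c, 0.05, size=(100, 2))
                       for c in [(-0.5, -0.5), (0.0, 0.6), (0.7, -0.2)]]), -bound, bound)
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)

d = X.shape[1]
centers = []
for j in range(k):
    pts = X[labels == j]
    # Rough Laplace calibration (illustrative): split eps between sums and counts;
    # one point changes a cluster sum by at most d*bound in L1 and a count by 1.
    noisy_sum = pts.sum(axis=0) + rng.laplace(scale=2 * d * bound / eps, size=d)
    noisy_cnt = len(pts) + rng.laplace(scale=2 / eps)
    centers.append(noisy_sum / max(noisy_cnt, 1.0))
print(np.round(centers, 2))
```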

As data are increasingly stored in different silos and societies become more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models faces efficiency and privacy challenges. Recently, federated learning (FL) has emerged as an alternative solution and continues to thrive in this new reality. Existing FL protocol designs have been shown to be vulnerable to adversaries within or outside of the system, compromising data privacy and system robustness. Besides training powerful global models, it is of paramount importance to design FL systems that have privacy guarantees and are resistant to different types of adversaries. In this paper, we conduct the first comprehensive survey on this topic. Through a concise introduction to the concept of FL and a unique taxonomy covering: 1) threat models; 2) poisoning attacks and defenses against robustness; 3) inference attacks and defenses against privacy, we provide an accessible review of this important topic. We highlight the intuitions, key techniques, and fundamental assumptions adopted by various attacks and defenses. Finally, we discuss promising future research directions towards robust and privacy-preserving federated learning.

Training machine learning models on sensitive user data has raised increasing privacy concerns in many areas. Federated learning is a popular approach for privacy protection that collects local gradient information instead of raw data. One way to achieve a strict privacy guarantee is to apply local differential privacy to federated learning. However, previous works do not give a practical solution, due to three issues. First, the noisy data remain close to their original values with high probability, increasing the risk of information exposure. Second, a large variance is introduced into the estimated average, causing poor accuracy. Last, the privacy budget explodes due to the high dimensionality of weights in deep learning models. In this paper, we propose a novel local differential privacy mechanism for federated learning that addresses these issues. It makes the perturbed data more distinct from the original values and introduces lower variance. Moreover, the proposed mechanism bypasses the curse of dimensionality by splitting and shuffling model updates. A series of empirical evaluations on three commonly used datasets (MNIST, Fashion-MNIST, and CIFAR-10) demonstrates that our solution achieves superior deep learning performance while also providing a strong privacy guarantee.
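
The sketch below illustrates the split-and-shuffle idea: each client perturbs its update parameter by parameter (here with clipped Laplace noise standing in for the paper's mechanism), the per-parameter reports carry no client identifier and are shuffled so they cannot be linked back to a single client, and the server averages each parameter over all reports. All sizes and the noise mechanism are illustrative assumptions.

```python
# Split-and-shuffle sketch: per-parameter LDP reports, anonymized and shuffled before aggregation.
import random
import numpy as np

rng = np.random.default_rng(0)
n_clients, dim, eps, clip = 20, 8, 1.0, 0.5

updates = rng.normal(scale=0.3, size=(n_clients, dim))        # local model updates

reports = []
for u in updates:
    for j, w in enumerate(u):
        w = float(np.clip(w, -clip, clip))
        noisy = w + rng.laplace(scale=2 * clip / eps)          # per-parameter LDP report
        reports.append((j, noisy))                             # no client identifier attached

random.shuffle(reports)                                        # the shuffler breaks linkability

agg, cnt = np.zeros(dim), np.zeros(dim)
for j, v in reports:
    agg[j] += v
    cnt[j] += 1
print("estimated average update:", np.round(agg / cnt, 3))
print("true average update:     ", np.round(updates.mean(axis=0), 3))
```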
