亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We present an approach to quantify and compare the privacy-accuracy trade-off for differentially private Variational Autoencoders. Our work complements previous work in two aspects. First, we evaluate the the strong reconstruction MI attack against Variational Autoencoders under differential privacy. Second, we address the data scientist's challenge of setting privacy parameter epsilon, which steers the differential privacy strength and thus also the privacy-accuracy trade-off. In our experimental study we consider image and time series data, and three local and central differential privacy mechanisms. We find that the privacy-accuracy trade-offs strongly depend on the dataset and model architecture. We do rarely observe favorable privacy-accuracy trade-off for Variational Autoencoders, and identify a case where LDP outperforms CDP.

相關內容

自動編碼器是一種人工神經網絡,用于以無監督的方式學習有效的數據編碼。自動編碼器的目的是通過訓練網絡忽略信號“噪聲”來學習一組數據的表示(編碼),通常用于降維。與簡化方面一起,學習了重構方面,在此,自動編碼器嘗試從簡化編碼中生成盡可能接近其原始輸入的表示形式,從而得到其名稱。基本模型存在幾種變體,其目的是迫使學習的輸入表示形式具有有用的屬性。自動編碼器可有效地解決許多應用問題,從面部識別到獲取單詞的語義。

Differentially private stochastic gradient descent (DP-SGD) is the workhorse algorithm for recent advances in private deep learning. It provides a single privacy guarantee to all datapoints in the dataset. We propose an efficient algorithm to compute per-instance privacy guarantees for individual examples when running DP-SGD. We use our algorithm to investigate per-instance privacy losses across a number of datasets. We find that most examples enjoy stronger privacy guarantees than the worst-case bounds. We further discover that the loss and the privacy loss on an example are well-correlated. This implies groups that are underserved in terms of model utility are simultaneously underserved in terms of privacy loss. For example, on CIFAR-10, the average $\epsilon$ of the class with the highest loss (Cat) is 32% higher than that of the class with the lowest loss (Ship). We also run membership inference attacks to show this reflects disparate empirical privacy risks.

When neural network model and data are outsourced to cloud server for inference, it is desired to preserve the confidentiality of model and data as the involved parties (i.e., cloud server, model providing client and data providing client) may not trust mutually. Solutions were proposed based on multi-party computation, trusted execution environment (TEE) and leveled or fully homomorphic encryption (LHE/FHE), but their limitations hamper practical application. We propose a new framework based on synergistic integration of LHE and TEE, which enables collaboration among mutually-untrusted three parties, while minimizing the involvement of (relatively) resource-constrained TEE and allowing the full utilization of the untrusted but more resource-rich part of server. We also propose a generic and efficient LHE-based inference scheme as an important performance-determining component of the framework. We implemented/evaluated the proposed system on a moderate platform and show that, our proposed scheme is more applicable/scalable to various settings, and has better performance, compared to the state-of-the-art LHE-based solutions.

Shortly after it was first introduced in 2006, differential privacy became the flagship data privacy definition. Since then, numerous variants and extensions were proposed to adapt it to different scenarios and attacker models. In this work, we propose a systematic taxonomy of these variants and extensions. We list all data privacy definitions based on differential privacy, and partition them into seven categories, depending on which aspect of the original definition is modified. These categories act like dimensions: variants from the same category cannot be combined, but variants from different categories can be combined to form new definitions. We also establish a partial ordering of relative strength between these notions by summarizing existing results. Furthermore, we list which of these definitions satisfy some desirable properties, like composition, post-processing, and convexity by either providing a novel proof or collecting existing ones.

Recent papers have shown that large pre-trained language models (LLMs) such as BERT, GPT-2 can be fine-tuned on private data to achieve performance comparable to non-private models for many downstream Natural Language Processing (NLP) tasks while simultaneously guaranteeing differential privacy. The inference cost of these models -- which consist of hundreds of millions of parameters -- however, can be prohibitively large. Hence, often in practice, LLMs are compressed before they are deployed in specific applications. In this paper, we initiate the study of differentially private model compression and propose frameworks for achieving 50% sparsity levels while maintaining nearly full performance. We demonstrate these ideas on standard GLUE benchmarks using BERT models, setting benchmarks for future research on this topic.

Differentially private SGD (DPSGD) has recently shown promise in deep learning. However, compared to non-private SGD, the DPSGD algorithm places computational overheads that can undo the benefit of batching in GPUs. Micro-batching is a common method to alleviate this and is fully supported in the TensorFlow Privacy library (TFDP). However, it degrades accuracy. We propose NanoBatch Privacy, a lightweight add-on to TFDP to be used on Graphcore IPUs by leveraging batch size of 1 (without microbatching) and gradient accumulation. This allows us to achieve large total batch sizes with minimal impacts to throughput. Second, we illustrate using Cifar-10 how larger batch sizes are not necessarily optimal from a privacy versus utility perspective. On ImageNet, we achieve more than 15x speedup over TFDP versus 8x A100s and significant speedups even across libraries such as Opacus. We also provide two extensions: 1) DPSGD for pipelined models and 2) per-layer clipping that is 15x faster than the Opacus implementation on 8x A100s. Finally as an application case study, we apply NanoBatch training for use on private Covid-19 chest CT prediction.

Physical Unclonable Functions (PUFs) are promising security primitives for resource-constrained network nodes. The XOR Arbiter PUF (XOR PUF or XPUF) is an intensively studied PUF invented to improve the security of the Arbiter PUF, probably the most lightweight delay-based PUF. Recently, highly powerful machine learning attack methods were discovered and were able to easily break large-sized XPUFs, which were highly secure against earlier machine learning attack methods. Component-differentially-challenged XPUFs (CDC-XPUFs) are XPUFs with different component PUFs receiving different challenges. Studies showed they were much more secure against machine learning attacks than the conventional XPUFs, whose component PUFs receive the same challenge. But these studies were all based on earlier machine learning attack methods, and hence it is not clear if CDC-XPUFs can remain secure under the recently discovered powerful attack methods. In this paper, the two current most powerful two machine learning methods for attacking XPUFs are adapted by fine-tuning the parameters of the two methods for CDC-XPUFs. Attack experiments using both simulated PUF data and silicon data generated from PUFs implemented on field-programmable gate array (FPGA) were carried out, and the experimental results showed that some previously secure CDC-XPUFs of certain circuit parameter values are no longer secure under the adapted new attack methods, while many more CDC-XPUFs of other circuit parameter values remain secure. Thus, our experimental attack study has re-defined the boundary between the secure region and the insecure region of the PUF circuit parameter space, providing PUF manufacturers and IoT security application developers with valuable information in choosing PUFs with secure parameter values.

As data are increasingly being stored in different silos and societies becoming more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models is facing efficiency and privacy challenges. Recently, federated learning (FL) has emerged as an alternative solution and continue to thrive in this new reality. Existing FL protocol design has been shown to be vulnerable to adversaries within or outside of the system, compromising data privacy and system robustness. Besides training powerful global models, it is of paramount importance to design FL systems that have privacy guarantees and are resistant to different types of adversaries. In this paper, we conduct the first comprehensive survey on this topic. Through a concise introduction to the concept of FL, and a unique taxonomy covering: 1) threat models; 2) poisoning attacks and defenses against robustness; 3) inference attacks and defenses against privacy, we provide an accessible review of this important topic. We highlight the intuitions, key techniques as well as fundamental assumptions adopted by various attacks and defenses. Finally, we discuss promising future research directions towards robust and privacy-preserving federated learning.

Generative Adversarial Networks (GANs) have recently achieved impressive results for many real-world applications, and many GAN variants have emerged with improvements in sample quality and training stability. However, they have not been well visualized or understood. How does a GAN represent our visual world internally? What causes the artifacts in GAN results? How do architectural choices affect GAN learning? Answering such questions could enable us to develop new insights and better models. In this work, we present an analytic framework to visualize and understand GANs at the unit-, object-, and scene-level. We first identify a group of interpretable units that are closely related to object concepts using a segmentation-based network dissection method. Then, we quantify the causal effect of interpretable units by measuring the ability of interventions to control objects in the output. We examine the contextual relationship between these units and their surroundings by inserting the discovered object concepts into new images. We show several practical applications enabled by our framework, from comparing internal representations across different layers, models, and datasets, to improving GANs by locating and removing artifact-causing units, to interactively manipulating objects in a scene. We provide open source interpretation tools to help researchers and practitioners better understand their GAN models.

Inferring missing links in knowledge graphs (KG) has attracted a lot of attention from the research community. In this paper, we tackle a practical query answering task involving predicting the relation of a given entity pair. We frame this prediction problem as an inference problem in a probabilistic graphical model and aim at resolving it from a variational inference perspective. In order to model the relation between the query entity pair, we assume that there exists an underlying latent variable (paths connecting two nodes) in the KG, which carries the equivalent semantics of their relations. However, due to the intractability of connections in large KGs, we propose to use variation inference to maximize the evidence lower bound. More specifically, our framework (\textsc{Diva}) is composed of three modules, i.e. a posterior approximator, a prior (path finder), and a likelihood (path reasoner). By using variational inference, we are able to incorporate them closely into a unified architecture and jointly optimize them to perform KG reasoning. With active interactions among these sub-modules, \textsc{Diva} is better at handling noise and coping with more complex reasoning scenarios. In order to evaluate our method, we conduct the experiment of the link prediction task on multiple datasets and achieve state-of-the-art performances on both datasets.

We introduce an effective model to overcome the problem of mode collapse when training Generative Adversarial Networks (GAN). Firstly, we propose a new generator objective that finds it better to tackle mode collapse. And, we apply an independent Autoencoders (AE) to constrain the generator and consider its reconstructed samples as "real" samples to slow down the convergence of discriminator that enables to reduce the gradient vanishing problem and stabilize the model. Secondly, from mappings between latent and data spaces provided by AE, we further regularize AE by the relative distance between the latent and data samples to explicitly prevent the generator falling into mode collapse setting. This idea comes when we find a new way to visualize the mode collapse on MNIST dataset. To the best of our knowledge, our method is the first to propose and apply successfully the relative distance of latent and data samples for stabilizing GAN. Thirdly, our proposed model, namely Generative Adversarial Autoencoder Networks (GAAN), is stable and has suffered from neither gradient vanishing nor mode collapse issues, as empirically demonstrated on synthetic, MNIST, MNIST-1K, CelebA and CIFAR-10 datasets. Experimental results show that our method can approximate well multi-modal distribution and achieve better results than state-of-the-art methods on these benchmark datasets. Our model implementation is published here: //github.com/tntrung/gaan

北京阿比特科技有限公司