日本三级网站在线播放_日韩欧美国产AⅤ另类_高清无码人成电影_国产剧情在线看免费观看_高清视频免费一区二区三区_永久在线中文字幕_精品国产AⅤ一区二区三区V免费

Existing work on differentially private linear regression typically assumes that end users can precisely set data bounds or algorithmic hyperparameters. End users often struggle to meet these requirements without directly examining the data (and violating privacy). Recent work has attempted to develop solutions that shift these burdens from users to algorithms, but they struggle to provide utility as the feature dimension grows. This work extends these algorithms to higher-dimensional problems by introducing a differentially private feature selection method based on Kendall rank correlation. We prove a utility guarantee for the setting where features are normally distributed and conduct experiments across 25 datasets. We find that adding this private feature selection step before regression significantly broadens the applicability of ``plug-and-play'' private linear regression algorithms at little additional cost to privacy, computation, or decision-making by the end user.

知識薈萃

精品入門和進階教程、論文和代碼整理等

查看相關VIP內容、論文、資訊等

MoDELS · 示例 · Machine Learning · 機器學習模型 · Learning ·

2023 年 7 月 21 日

Epsilon*: Privacy Metric for Machine Learning Models

Diana M. Negoescu,Humberto Gonzalez,Saad Eddin Al Orjany,Jilei Yang,Yuliia Lut,Rahul Tandra,Xiaowen Zhang,Xinyi Zheng,Zach Douglas,Vidita Nolkha,Parvez Ahammad,Gennady Samorodnitsky

We introduce Epsilon*, a new privacy metric for measuring the privacy risk of a single model instance prior to, during, or after deployment of privacy mitigation strategies. The metric does not require access to the training data sampling or model training algorithm. Epsilon* is a function of true positive and false positive rates in a hypothesis test used by an adversary in a membership inference attack. We distinguish between quantifying the privacy loss of a trained model instance and quantifying the privacy loss of the training mechanism which produces this model instance. Existing approaches in the privacy auditing literature provide lower bounds for the latter, while our metric provides a lower bound for the former by relying on an (${\epsilon}$,${\delta}$)-type of quantification of the privacy of the trained model instance. We establish a relationship between these lower bounds and show how to implement Epsilon* to avoid numerical and noise amplification instability. We further show in experiments on benchmark public data sets that Epsilon* is sensitive to privacy risk mitigation by training with differential privacy (DP), where the value of Epsilon* is reduced by up to 800% compared to the Epsilon* values of non-DP trained baseline models. This metric allows privacy auditors to be independent of model owners, and enables all decision-makers to visualize the privacy-utility landscape to make informed decisions regarding the trade-offs between model privacy and utility.

Learning · 聯邦學習 · 均值 · tuning · 可約的 ·

2023 年 7 月 20 日

Private Federated Learning with Autotuned Compression

Enayat Ullah,Christopher A. Choquette-Choo,Peter Kairouz,Sewoong Oh

from arxiv, Accepted to ICML 2023

We propose new techniques for reducing communication in private federated learning without the need for setting or tuning compression rates. Our on-the-fly methods automatically adjust the compression rate based on the error induced during training, while maintaining provable privacy guarantees through the use of secure aggregation and differential privacy. Our techniques are provably instance-optimal for mean estimation, meaning that they can adapt to the ``hardness of the problem" with minimal interactivity. We demonstrate the effectiveness of our approach on real-world datasets by achieving favorable compression rates without the need for tuning.

MoDELS · 噪聲 · 數據集 · Learning · 最大平均偏差 ·

2023 年 7 月 20 日

Pre-trained Perceptual Features Improve Differentially Private Image Generation

Fredrik Harder,Milad Jalali Asadabadi,Danica J. Sutherland,Mijung Park

Training even moderately-sized generative models with differentially-private stochastic gradient descent (DP-SGD) is difficult: the required level of noise for reasonable levels of privacy is simply too large. We advocate instead building off a good, relevant representation on an informative public dataset, then learning to model the private data with that representation. In particular, we minimize the maximum mean discrepancy (MMD) between private target data and a generator's distribution, using a kernel based on perceptual features learned from a public dataset. With the MMD, we can simply privatize the data-dependent term once and for all, rather than introducing noise at each step of optimization as in DP-SGD. Our algorithm allows us to generate CIFAR10-level images with $\epsilon \approx 2$ which capture distinctive features in the distribution, far surpassing the current state of the art, which mostly focuses on datasets such as MNIST and FashionMNIST at a large $\epsilon \approx 10$. Our work introduces simple yet powerful foundations for reducing the gap between private and non-private deep generative models. Our code is available at \url{//github.com/ParkLabML/DP-MEPF}.

線性的 · 成比例 · 最優化 · 優化器 · 圖片分類 ·

2023 年 7 月 19 日

The importance of feature preprocessing for differentially private linear optimization

Ziteng Sun,Ananda Theertha Suresh,Aditya Krishna Menon

Training machine learning models with differential privacy (DP) has received increasing interest in recent years. One of the most popular algorithms for training differentially private models is differentially private stochastic gradient descent (DPSGD) and its variants, where at each step gradients are clipped and combined with some noise. Given the increasing usage of DPSGD, we ask the question: is DPSGD alone sufficient to find a good minimizer for every dataset under privacy constraints? As a first step towards answering this question, we show that even for the simple case of linear classification, unlike non-private optimization, (private) feature preprocessing is vital for differentially private optimization. In detail, we first show theoretically that there exists an example where without feature preprocessing, DPSGD incurs a privacy error proportional to the maximum norm of features over all samples. We then propose an algorithm called DPSGD-F, which combines DPSGD with feature preprocessing and prove that for classification tasks, it incurs a privacy error proportional to the diameter of the features $\max_{x, x' \in D} \|x - x'\|_2$. We then demonstrate the practicality of our algorithm on image classification benchmarks.

MoDELS · state-of-the-art · Performer · 可理解性 · 變換 ·

2023 年 7 月 19 日

DP-TBART: A Transformer-based Autoregressive Model for Differentially Private Tabular Data Generation

Rodrigo Castellon,Achintya Gopal,Brian Bloniarz,David Rosenberg

The generation of synthetic tabular data that preserves differential privacy is a problem of growing importance. While traditional marginal-based methods have achieved impressive results, recent work has shown that deep learning-based approaches tend to lag behind. In this work, we present Differentially-Private TaBular AutoRegressive Transformer (DP-TBART), a transformer-based autoregressive model that maintains differential privacy and achieves performance competitive with marginal-based methods on a wide variety of datasets, capable of even outperforming state-of-the-art methods in certain settings. We also provide a theoretical framework for understanding the limitations of marginal-based approaches and where deep learning-based approaches stand to contribute most. These results suggest that deep learning-based techniques should be considered as a viable alternative to marginal-based methods in the generation of differentially private synthetic tabular data.

估計/估計量 · 變換 · 3D · SimPLe · MoDELS ·

2023 年 7 月 19 日

IST-Net: Prior-free Category-level Pose Estimation with Implicit Space Transformation

Jianhui Liu,Yukang Chen,Xiaoqing Ye,Xiaojuan Qi

from arxiv, Accepted by ICCV2023

Category-level 6D pose estimation aims to predict the poses and sizes of unseen objects from a specific category. Thanks to prior deformation, which explicitly adapts a category-specific 3D prior (i.e., a 3D template) to a given object instance, prior-based methods attained great success and have become a major research stream. However, obtaining category-specific priors requires collecting a large amount of 3D models, which is labor-consuming and often not accessible in practice. This motivates us to investigate whether priors are necessary to make prior-based methods effective. Our empirical study shows that the 3D prior itself is not the credit to the high performance. The keypoint actually is the explicit deformation process, which aligns camera and world coordinates supervised by world-space 3D models (also called canonical space). Inspired by these observations, we introduce a simple prior-free implicit space transformation network, namely IST-Net, to transform camera-space features to world-space counterparts and build correspondence between them in an implicit manner without relying on 3D priors. Besides, we design camera- and world-space enhancers to enrich the features with pose-sensitive information and geometrical constraints, respectively. Albeit simple, IST-Net achieves state-of-the-art performance based-on prior-free design, with top inference speed on the REAL275 benchmark. Our code and models are available at //github.com/CVMI-Lab/IST-Net.

向量化 · Learning · Machine Learning · 情景 · 服務器 ·

2023 年 7 月 19 日

Efficient Secure Aggregation for Privacy-Preserving Federated Machine Learning

Rouzbeh Behnia,Mohammadreza Ebrahimi,Arman Riasi,Sherman Chow,Balaji Padmanabhan,Thang Hoang

from arxiv, This paper was submitted to ACM AISec'23

Secure aggregation protocols ensure the privacy of users' data in the federated learning settings by preventing the disclosure of users' local gradients. Despite their merits, existing aggregation protocols often incur high communication and computation overheads on the participants and might not be optimized to handle the large update vectors for machine learning models efficiently. This paper presents e-SeaFL, an efficient, verifiable secure aggregation protocol taking one communication round in aggregation. e-SeaFL allows the aggregation server to generate proof of honest aggregation for the participants. Our core idea is to employ a set of assisting nodes to help the aggregation server, under similar trust assumptions existing works placed upon the participating users. For verifiability, e-SeaFL uses authenticated homomorphic vector commitments. Our experiments show that the user enjoys five orders of magnitude higher efficiency than the state of the art (PPML 2022) for a gradient vector of a high dimension up to $100,000$.

Agent · 相互獨立的 · CASE · 泛函 · Facebook AI Research ·

2023 年 7 月 18 日

On the Existence of Envy-Free Allocations Beyond Additive Valuations

Gerdus Benadè,Daniel Halpern,Alexandros Psomas,Paritosh Verma

We study the problem of fairly allocating $m$ indivisible items among $n$ agents. Envy-free allocations, in which each agent prefers her bundle to the bundle of every other agent, need not exist in the worst case. However, when agents have additive preferences and the value $v_{i,j}$ of agent $i$ for item $j$ is drawn independently from a distribution $D_i$, envy-free allocations exist with high probability when $m \in \Omega( n \log n / \log \log n )$. In this paper, we study the existence of envy-free allocations under stochastic valuations far beyond the additive setting. We introduce a new stochastic model in which each agent's valuation is sampled by first fixing a worst-case function, and then drawing a uniformly random renaming of the items, independently for each agent. This strictly generalizes known settings; for example, $v_{i,j} \sim D_i$ may be seen as picking a random (instead of a worst-case) additive function before renaming. We prove that random renaming is sufficient to ensure that envy-free allocations exist with high probability in very general settings. When valuations are non-negative and ``order-consistent,'' a valuation class that generalizes additive, budget-additive, unit-demand, and single-minded agents, SD-envy-free allocations (a stronger notion of fairness than envy-freeness) exist for $m \in \omega(n^2)$ when $n$ divides $m$, and SD-EFX allocations exist for all $m \in \omega(n^2)$. The dependence on $n$ is tight, that is, for $m \in O(n^2)$ envy-free allocations don't exist with constant probability. For the case of arbitrary valuations (allowing non-monotone, negative, or mixed-manna valuations) and $n=2$ agents, we prove envy-free allocations exist with probability $1 - \Theta(1/m)$ (and this is tight).

網絡表示學習 · Networking · 表示學習 · 學成 · 無監督 ·

2020 年 3 月 11 日

A Comparative Study for Unsupervised Network Representation Learning

Megha Khosla,Vinay Setty,Avishek Anand

from arxiv, Accepted for publication in IEEE TKDE

There has been appreciable progress in unsupervised network representation learning (UNRL) approaches over graphs recently with flexible random-walk approaches, new optimization objectives and deep architectures. However, there is no common ground for systematic comparison of embeddings to understand their behavior for different graphs and tasks. In this paper we theoretically group different approaches under a unifying framework and empirically investigate the effectiveness of different network representation methods. In particular, we argue that most of the UNRL approaches either explicitly or implicit model and exploit context information of a node. Consequently, we propose a framework that casts a variety of approaches -- random walk based, matrix factorization and deep learning based -- into a unified context-based optimization function. We systematically group the methods based on their similarities and differences. We study the differences among these methods in detail which we later use to explain their performance differences (on downstream tasks). We conduct a large-scale empirical study considering 9 popular and recent UNRL techniques and 11 real-world datasets with varying structural properties and two common tasks -- node classification and link prediction. We find that there is no single method that is a clear winner and that the choice of a suitable method is dictated by certain properties of the embedding methods, task and structural properties of the underlying graph. In addition we also report the common pitfalls in evaluation of UNRL methods and come up with suggestions for experimental design and interpretation of results.

數據增強 · 泛化理論 · 矩 · 規范化的 · surge ·

2020 年 2 月 25 日

On Feature Normalization and Data Augmentation

Boyi Li,Felix Wu,Ser-Nam Lim,Serge Belongie,Kilian Q. Weinberger

Modern neural network training relies heavily on data augmentation for improved generalization. After the initial success of label-preserving augmentations, there has been a recent surge of interest in label-perturbing approaches, which combine features and labels across training samples to smooth the learned decision surface. In this paper, we propose a new augmentation method that leverages the first and second moments extracted and re-injected by feature normalization. We replace the moments of the learned features of one training image by those of another, and also interpolate the target labels. As our approach is fast, operates entirely in feature space, and mixes different signals than prior methods, one can effectively combine it with existing augmentation methods. We demonstrate its efficacy across benchmark data sets in computer vision, speech, and natural language processing, where it consistently improves the generalization performance of highly competitive baseline networks.