Homomorphic encryption (HE), which allows computation on encrypted data, is an enabling technology for confidential cloud computing. One notable example is privacy-preserving Prediction-as-a-Service (PaaS), where machine-learning predictions are computed on encrypted data. However, developing HE-based solutions for encrypted PaaS is a tedious task that requires a careful design, one that depends heavily on the deployment scenario and on leveraging the characteristics of modern HE schemes. Prior works on privacy-preserving PaaS focus solely on protecting the confidentiality of the client data uploaded to a remote model provider, e.g., a cloud offering a prediction API, and assume (or take advantage of the fact) that the model is held in plaintext. Furthermore, they aim either to minimize the latency of the service by processing one sample at a time, or to maximize the number of samples processed per second while handling a fixed (large) batch of samples. In this work, we present slytHErin, an agile framework that enables privacy-preserving PaaS beyond the application scenarios considered in prior works. Thanks to its hybrid design leveraging HE and its multiparty variant (MHE), slytHErin enables novel PaaS scenarios by encrypting the data, the model, or both. Moreover, slytHErin features a flexible input-data packing approach that allows processing a batch of an arbitrary number of samples, as well as several model- and setting-agnostic computation optimizations. slytHErin is implemented in Go and allows end users to perform encrypted PaaS on custom deep learning models comprising fully-connected, convolutional, and pooling layers, in a few lines of code and without having to worry about the cumbersome implementation and optimization concerns inherent to HE.
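The abstract does not spell out the packing layout, so the following is only a minimal sketch of one plausible way to pack an arbitrary-size batch of flattened samples into fixed-size ciphertext slot vectors; the function name, the row-major layout, and the zero-padding rule are illustrative assumptions, not slytHErin's actual scheme.

```python
import numpy as np

def pack_batch(samples, slot_count):
    """Pack an arbitrary number of flattened samples into fixed-size
    ciphertext slot vectors, zero-padding the last block (hypothetical layout)."""
    samples = np.asarray(samples, dtype=np.float64)
    n, d = samples.shape
    per_ct = slot_count // d              # samples that fit in one ciphertext
    blocks = []
    for start in range(0, n, per_ct):
        flat = samples[start:start + per_ct].reshape(-1)
        flat = np.pad(flat, (0, slot_count - flat.size))  # zero-pad unused slots
        blocks.append(flat)               # each block would be encrypted separately
    return blocks

# Example: 7 samples of dimension 4 packed into 16-slot ciphertexts -> 2 blocks
blocks = pack_batch(np.random.randn(7, 4), slot_count=16)
print(len(blocks), blocks[0].shape)
```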
Pini and Vantini (2017) introduced the interval-wise testing procedure, which performs local inference for functional data defined on an interval domain and outputs an adjusted p-value function that controls the type I error. We extend this idea to a general setting in which the domain is a Riemannian manifold. This requires new methodology, such as how to define adjustment sets on product manifolds and how to approximate the test statistic when the domain has non-zero curvature. We propose to use permutation tests for inference and apply the procedure in three settings: a simulation on a "chameleon-shaped" manifold and two climate-change applications in which the manifolds are a complex subset of $S^2$ and $S^2 \times S^1$, respectively. We note the trade-off between type I and type II errors: enlarging the adjustment set reduces the type I error but also yields smaller areas of significance. However, some areas remain significant even at maximal adjustment.
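The manifold construction is the paper's contribution and is not reproduced here; the sketch below only illustrates the underlying interval-wise idea in the original 1-D setting, assuming the adjusted p-value at a grid point is the maximum permutation p-value over all intervals containing that point. The test statistic, grid, and permutation counts are toy choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def interval_pvalue(x, y, idx, n_perm=199):
    """Permutation p-value for a mean difference between two groups of
    functional data, restricted to the grid points in `idx` (one interval)."""
    stat = lambda a, b: np.abs(a[:, idx].mean() - b[:, idx].mean())
    obs = stat(x, y)
    pooled = np.vstack([x, y])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        count += stat(pooled[perm[:len(x)]], pooled[perm[len(x):]]) >= obs
    return (count + 1) / (n_perm + 1)

def adjusted_pvalues(x, y, n_perm=199):
    """Interval-wise adjustment on a 1-D grid: the adjusted p-value at each
    point is the maximum interval p-value over all intervals containing it."""
    m = x.shape[1]
    adj = np.zeros(m)
    for a in range(m):
        for b in range(a, m):
            p = interval_pvalue(x, y, np.arange(a, b + 1), n_perm)
            adj[a:b + 1] = np.maximum(adj[a:b + 1], p)
    return adj

# Toy example: two groups of curves on a 20-point grid, shifted on one half
grid = np.linspace(0, 1, 20)
x = rng.normal(size=(10, 20))
y = rng.normal(size=(10, 20)) + (grid > 0.5)
print(adjusted_pvalues(x, y).round(2))
```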
We develop an algorithmic framework that finds an optimal solution by enumerating a set of feasible solutions whose number is bounded by a specially derived Variable Parameter (VP) with favorable asymptotic behavior. We build a VP algorithm for a strongly $\mathsf{NP}$-hard single-machine scheduling problem. The target VP $\nu$ is the number of jobs with certain special properties, the so-called emerging jobs. In phase 1, a partial solution containing the $n-\nu$ non-emerging jobs is constructed in low-degree polynomial time. In phase 2, fewer than $\nu!$ permutations of the $\nu$ emerging jobs are considered, each of which is incorporated into the partial schedule of phase 1. According to an earlier experimental study, in practice $\nu/n$ varied from about $1/4$ for small problem instances to $1/10$ for the largest tested instances. We illustrate how the proposed method can be used to build a polynomial-time approximation scheme (PTAS) with worst-case time complexity $O(\kappa!\kappa k n \log n)$, where $\kappa$, $\kappa<\nu< n$, is a VP and the corresponding approximation factor is $1+1/k$, with $k\kappa<k$. This is better than the time complexity of previously known approximation schemes. Using an intuitive probabilistic model, we give more realistic bounds on the running time of the VP algorithm and the PTAS, which are far below the worst-case bounds $\nu!$ and $\kappa!$.
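For concreteness, here is a minimal sketch of the generic two-phase enumeration pattern described above. The callbacks `is_emerging`, `build_partial`, and `insert_and_score` are hypothetical placeholders for the problem-specific rules (identifying emerging jobs, building the phase-1 partial schedule, and incorporating an ordering of emerging jobs); the paper's dominance rules that keep the number of examined permutations below $\nu!$ are not modeled.

```python
from itertools import permutations

def vp_schedule(jobs, is_emerging, build_partial, insert_and_score):
    """Generic two-phase Variable-Parameter enumeration (sketch).

    Phase 1: build a partial schedule from the non-emerging jobs in
    polynomial time.  Phase 2: try at most nu! orderings of the nu
    emerging jobs, incorporating each ordering into the partial schedule
    and keeping the best complete schedule found."""
    emerging = [j for j in jobs if is_emerging(j)]        # the nu emerging jobs
    partial = build_partial([j for j in jobs if not is_emerging(j)])
    best_schedule, best_cost = None, float("inf")
    for order in permutations(emerging):                  # fewer in practice:
        schedule, cost = insert_and_score(partial, order) # dominated orderings pruned
        if cost < best_cost:
            best_schedule, best_cost = schedule, cost
    return best_schedule, best_cost
```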
Homomorphic Encryption (HE) is a cryptographic tool that allows computation to be performed under encryption, and it is used by many privacy-preserving machine learning solutions, for example to perform secure classification. Modern deep learning models achieve good performance, for example on image-processing benchmarks, in part by including many skip connections. These skip connections, however, turn out to be very costly when model inference is executed under HE. In this paper, we show that by replacing (mid-term) skip connections with (short-term) Dirac parameterization and (long-term) shared-source skip connections, we can reduce the burden of skip connections in HE-based solutions, achieving a 1.3x computing-power improvement for the same accuracy.
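To illustrate the core algebraic trick behind Dirac parameterization (here in a simplified single-channel, untrained form, not the paper's exact multi-channel formulation): the identity shortcut is folded into the convolution kernel, so conv(x, delta + W) equals x + conv(x, W) and no separate skip branch has to be evaluated under HE.

```python
import numpy as np
from scipy.signal import convolve2d

def dirac_delta(k):
    """k x k kernel that acts as the identity under 'same' convolution."""
    d = np.zeros((k, k))
    d[k // 2, k // 2] = 1.0
    return d

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8))        # single-channel feature map (toy example)
w = 0.1 * rng.normal(size=(3, 3))  # learned residual kernel

# Explicit skip connection: y = x + conv(x, w)
y_skip = x + convolve2d(x, w, mode="same")

# Dirac parameterization: fold the identity into the kernel, y = conv(x, delta + w)
y_dirac = convolve2d(x, dirac_delta(3) + w, mode="same")

print(np.allclose(y_skip, y_dirac))  # True: one convolution, no separate skip branch
```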
In recent years, topic modeling has emerged as a powerful technique for organizing and summarizing large collections of documents and for searching for particular patterns in them. However, privacy concerns may arise when cross-analyzing data from different sources. Federated topic modeling solves this issue by allowing multiple parties to jointly train a topic model without sharing their data. While several federated approximations of classical topic models exist, no research has been conducted on their application to neural topic models. To fill this gap, we propose and analyze a federated implementation of state-of-the-art neural topic models, showing its benefits when the topics are diverse across the nodes' documents and a joint model is needed. In practice, our approach is equivalent to centralized model training, but it preserves the privacy of the nodes. The advantages of this federated scenario are illustrated by means of experiments on both synthetic and real data.
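The claimed equivalence to centralized training suggests synchronous gradient aggregation; the sketch below shows that pattern generically, with a toy quadratic loss standing in for a neural topic model. The round structure, weighting rule, and learning rate are assumptions for illustration, not the paper's protocol.

```python
import numpy as np

def federated_step(theta, node_batches, grad_fn, lr=0.1):
    """One synchronous round: each node computes a gradient on its own
    documents, the server averages the gradients (weighted by batch size)
    and updates the shared model -- numerically matching a centralized
    step on the pooled batch, while the documents never leave the nodes."""
    sizes = np.array([len(b) for b in node_batches], dtype=float)
    grads = [grad_fn(theta, b) for b in node_batches]   # computed locally
    avg = sum(s * g for s, g in zip(sizes, grads)) / sizes.sum()
    return theta - lr * avg

# Toy check with the loss 0.5 * ||x - theta||^2 per document
grad_fn = lambda theta, batch: np.mean([theta - x for x in batch], axis=0)
theta = np.zeros(3)
nodes = [list(np.random.randn(5, 3) + 1.0), list(np.random.randn(8, 3) - 1.0)]
theta = federated_step(theta, nodes, grad_fn)
print(theta)
```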
To efficiently exploit the massive amounts of raw data that are increasingly generated in mobile edge networks, federated learning (FL) has emerged as a promising distributed learning technique. By collaboratively training a shared learning model on edge devices, FL replaces raw data transmission and storage with the exchange of locally computed parameters/gradients, which helps address latency and privacy issues. However, the number of resource blocks required by traditional orthogonal transmission strategies for FL scales linearly with the number of participating devices, which conflicts with the scarcity of communication resources. To tackle this issue, over-the-air computation (AirComp) has recently emerged; it leverages the inherent superposition property of wireless channels to perform one-shot model aggregation. However, the aggregation accuracy of AirComp suffers from unfavorable wireless propagation environments. In this paper, we consider the use of intelligent reflecting surfaces (IRSs) to mitigate this problem and improve FL performance with AirComp. Specifically, we propose a performance-oriented design scheme that directly minimizes the optimality gap of the loss function in order to accelerate the convergence of AirComp-based FL. We first analyze the convergence behavior of the FL procedure in the presence of channel fading and noise. Based on the obtained optimality gap, which characterizes the impact of channel fading and noise in different communication rounds on the ultimate performance of FL, we propose both online and offline approaches to tackle the resulting design problem. Simulation results demonstrate that such a performance-oriented design strategy achieves higher test accuracy than the conventional isolated mean-square-error (MSE) minimization approach in FL.
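The following toy sketch illustrates the AirComp aggregation mechanism itself: devices transmit pre-scaled local updates simultaneously, the superposed received signal approximates the sum of updates after receive scaling, and the IRS phase shifts reshape the effective channels. The channel model, the weakest-device alignment rule, and the (unoptimized) phase shifts are illustrative assumptions, not the paper's performance-oriented design.

```python
import numpy as np

rng = np.random.default_rng(1)
K, d, N = 4, 6, 16                      # devices, model dimension, IRS elements

w_local = rng.normal(size=(K, d))       # local model updates
h_direct = (rng.normal(size=K) + 1j * rng.normal(size=K)) / np.sqrt(2)
G = (rng.normal(size=(K, N)) + 1j * rng.normal(size=(K, N))) / np.sqrt(2)
h_irs = (rng.normal(size=N) + 1j * rng.normal(size=N)) / np.sqrt(2)

theta = np.exp(1j * rng.uniform(0, 2 * np.pi, N))   # IRS phase shifts (not optimized)
h_eff = h_direct + G @ (theta * h_irs)              # effective device channels

# Channel-inversion transmit scaling aligned to the weakest device (toy rule)
eta = np.min(np.abs(h_eff)) ** 2
b = np.sqrt(eta) / h_eff                            # per-device pre-scaling

noise = 0.01 * (rng.normal(size=d) + 1j * rng.normal(size=d))
y = (b[:, None] * h_eff[:, None] * w_local).sum(axis=0) + noise  # superposition
w_agg = np.real(y) / (K * np.sqrt(eta))             # one-shot estimate of the average

print(np.linalg.norm(w_agg - w_local.mean(axis=0)))  # small aggregation error
```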
In secure machine learning inference, most schemes assume that the server is semi-honest (i.e., it honestly follows the protocol but attempts to infer additional information). In the real world, however, the server may be malicious, e.g., it may use a low-quality model or deviate from the protocol. Although a few studies have considered a malicious server that deviates from the protocol, they ignore the verification of model accuracy (where a malicious server uses a low-quality model) while preserving the privacy of both the server's model and the client's inputs. To address these issues, we propose \textit{Fusion}, in which the client mixes public samples (whose query results are known) with its own samples to be queried, and uses them together as the inputs of a multi-party computation that jointly performs the secure inference. Since a server that uses a low-quality model or deviates from the protocol can only produce results that the client can easily identify as incorrect, \textit{Fusion} forces the server to behave honestly, thereby addressing all of the aforementioned issues without resorting to expensive cryptographic techniques. Our evaluation indicates that \textit{Fusion} is 48.06$\times$ faster and uses 30.90$\times$ less communication than the existing maliciously secure inference protocol (which currently does not support verification of model accuracy). In addition, to show scalability, we conduct ImageNet-scale inference on the practical ResNet50 model; it costs 8.678 minutes and 10.117 GiB of communication in a WAN setting, which is 1.18$\times$ faster with 2.64$\times$ less communication than the semi-honest protocol.
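A minimal client-side sketch of the mix-and-check idea described above: interleave public samples with known results among the real queries and accept the returned batch only if the server's answers on the planted samples are (mostly) correct. The function names, the acceptance threshold, and the tagging scheme are illustrative, and the actual secure-inference protocol is abstracted away.

```python
import random

def make_query_batch(own_samples, public_samples):
    """Shuffle the client's real samples together with public 'check' samples
    (whose correct results are known) before sending them for secure inference."""
    tagged = [("own", x, None) for x in own_samples] + \
             [("check", x, y) for x, y in public_samples]
    random.shuffle(tagged)
    return tagged

def verify_and_extract(tagged, results, min_check_accuracy=0.95):
    """Reject the whole batch if the server's answers on the planted public
    samples fall below the expected accuracy; otherwise return the answers
    for the client's own samples."""
    checks = [(r == y) for (kind, _, y), r in zip(tagged, results) if kind == "check"]
    if sum(checks) / len(checks) < min_check_accuracy:
        raise RuntimeError("server output failed the check-sample test")
    return [r for (kind, _, _), r in zip(tagged, results) if kind == "own"]
```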
Our goal is to develop an efficient contact detection algorithm for large-scale GPU-based simulation of non-convex objects. Current GPU-based simulators such as IsaacGym and Brax must trade off speed against fidelity, generality, or both when simulating non-convex objects. Their main issue lies in contact detection (CD): existing CD algorithms, such as Gilbert-Johnson-Keerthi (GJK), must trade off computational speed against accuracy, and they become expensive as the number of collisions among non-convex objects increases. We propose a data-driven approach to CD whose accuracy depends only on the quality and quantity of the offline dataset rather than on online computation time. Unlike GJK, our method has an inherently uniform computational flow, which facilitates efficient GPU usage through advanced compilers such as XLA (Accelerated Linear Algebra). Further, we offer a data-efficient solution by learning the collision patterns of local object-shape crops rather than global object shapes, which are harder to learn. We demonstrate that our approach improves the efficiency of existing CD methods by a factor of 5-10 for non-convex objects with comparable accuracy. Building on previous work on contact resolution for neural-network-based contact detectors, we integrate our CD algorithm into the open-source GPU-based simulator Brax and show that we can improve efficiency over IsaacGym and generality over standard Brax. We highly recommend the videos of our simulator included in the supplementary materials.
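To make the "uniform computational flow" point concrete, here is a toy, branch-free forward pass that scores a batch of local-crop pairs with a small (untrained, randomly initialized) network: every query follows identical dense operations of fixed shape, which is what lets a compiler such as XLA fuse and batch the work efficiently. The crop representation, network size, and parameterization are placeholders, not the paper's architecture.

```python
import numpy as np

def batched_collision_scores(crop_pairs, params):
    """Branch-free, fixed-shape forward pass of a small MLP that scores
    whether each pair of local shape crops is in collision.  All queries
    share the same computational flow, so the batch reduces to a few
    dense matrix multiplies."""
    w1, b1, w2, b2 = params
    x = crop_pairs.reshape(crop_pairs.shape[0], -1)   # flatten each crop pair
    h = np.maximum(x @ w1 + b1, 0.0)                  # ReLU hidden layer
    return 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))       # collision probability

# Toy shapes: 1024 queries, two 8x8x8 occupancy crops per query (untrained weights)
rng = np.random.default_rng(0)
crops = rng.random((1024, 2, 8, 8, 8))
params = (rng.normal(size=(2 * 8 ** 3, 64)) * 0.05, np.zeros(64),
          rng.normal(size=(64, 1)) * 0.05, np.zeros(1))
print(batched_collision_scores(crops, params).shape)
```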
Causality can be described in terms of a structural causal model (SCM) that carries information on the variables of interest and their mechanistic relations. For most processes of interest, the underlying SCM is only partially observable; causal inference therefore tries to leverage any exposed information. Graph neural networks (GNNs), as universal approximators on structured input, are a viable candidate for causal learning, suggesting a tighter integration with SCMs. To this end, we present a theoretical analysis from first principles that establishes a novel connection between GNNs and SCMs while providing an extended view on general neural-causal models. We then establish a new model class for GNN-based causal inference that is necessary and sufficient for causal effect identification. Our empirical illustrations on simulations and standard benchmarks validate our theoretical proofs.
As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.
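For concreteness, the sketch below shows uniform affine quantization, the basic building block that most of the surveyed methods refine: a real value $r$ maps to an integer $q = \mathrm{round}(r/S) + Z$ with scale $S$ and zero point $Z$. The min/max calibration rule and 4-bit setting are illustrative choices.

```python
import numpy as np

def quantize_uniform(x, num_bits=4):
    """Uniform affine quantization of a tensor to `num_bits` integers,
    returning the codes plus the (scale, zero_point) needed to dequantize."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)          # S
    zero_point = int(round(qmin - x.min() / scale))      # Z
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int32)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

x = np.random.randn(1000).astype(np.float32)
q, s, z = quantize_uniform(x, num_bits=4)
print("max abs error:", np.abs(x - dequantize(q, s, z)).max())
```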
Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference. We study the impact of model size in this setting, focusing on Transformer models for NLP tasks that are limited by compute: self-supervised pretraining and high-resource machine translation. We first show that even though smaller Transformer models execute faster per iteration, wider and deeper models converge in significantly fewer steps. Moreover, this acceleration in convergence typically outpaces the additional computational overhead of using larger models. Therefore, the most compute-efficient training strategy is to counterintuitively train extremely large models but stop after a small number of iterations. This leads to an apparent trade-off between the training efficiency of large Transformer models and the inference efficiency of small Transformer models. However, we show that large models are more robust to compression techniques such as quantization and pruning than small models. Consequently, one can get the best of both worlds: heavily compressed, large models achieve higher accuracy than lightly compressed, small models.
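As a small illustration of the kind of post-training compression the abstract refers to, the sketch below applies magnitude pruning to a weight matrix at a chosen sparsity (quantization would proceed as in the previous sketch); the pruning criterion and the 90% sparsity level are illustrative, not the paper's exact recipe.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude entries so that roughly `sparsity`
    fraction of the weights are removed (a simple post-training compression step)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    threshold = np.partition(flat, k)[k]          # k-th smallest magnitude
    return np.where(np.abs(weights) >= threshold, weights, 0.0)

w = np.random.randn(512, 512)
w_pruned = magnitude_prune(w, sparsity=0.9)
print("fraction of weights kept:", np.count_nonzero(w_pruned) / w.size)
```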