Erasure coding (EC) provides data redundancy for large-scale systems. XOR-based EC is an easy-to-implement and easy-to-optimize form of EC. This paper addresses a significant performance gap between the state-of-the-art XOR-based EC approach (4.9 GB/s coding throughput) and Intel's high-performance EC library, which is based on a different approach (6.7 GB/s). We propose a novel approach based on our observation that XOR-based EC in effect generates programs in a domain-specific language for XORing byte arrays. We formalize such programs as straight-line programs (SLPs), a notion from compiler construction, and optimize the SLPs using various techniques. Our optimization flow is three-fold: 1) reducing operations using grammar compression algorithms; 2) reducing memory accesses using deforestation, a functional program optimization method; and 3) reducing cache misses using the (red-blue) pebble game from program analysis. We provide an experimental library that outperforms Intel's library, achieving 8.92 GB/s throughput.
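As a rough illustration of the SLP view (not the paper's library; all names below are made up for the sketch), the snippet represents an XOR-based encoding as a list of XOR instructions over byte arrays and applies a trivial common-subexpression-elimination pass in the spirit of "reducing operations":

```python
# Minimal sketch: an XOR-based erasure code expressed as a straight-line
# program (SLP) over byte arrays, plus a tiny operation-reduction pass.
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# An SLP is a list of instructions (dest, src1, src2); inputs are "d0", "d1", ...
def run_slp(slp, inputs):
    env = dict(inputs)
    for dest, s1, s2 in slp:
        env[dest] = xor_bytes(env[s1], env[s2])
    return env

def eliminate_common_subexpressions(slp):
    """Replace repeated XORs of the same operand pair with one temporary."""
    seen, out, rename = {}, [], {}
    for dest, s1, s2 in slp:
        key = frozenset((rename.get(s1, s1), rename.get(s2, s2)))
        if key in seen:
            rename[dest] = seen[key]          # reuse the earlier result
        else:
            out.append((dest, rename.get(s1, s1), rename.get(s2, s2)))
            seen[key] = dest
    return out

# Example: two parities that share the subexpression d0 ^ d1.
slp = [("t0", "d0", "d1"), ("p0", "t0", "d2"),
       ("t1", "d0", "d1"), ("p1", "t1", "d3")]
data = {f"d{i}": bytes([i] * 8) for i in range(4)}
optimized = eliminate_common_subexpressions(slp)
assert run_slp(optimized, data)["p0"] == run_slp(slp, data)["p0"]
print(len(slp), "->", len(optimized), "XOR instructions")
```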
We consider the problem of efficiently evaluating a secret polynomial at a given public point when the polynomial is stored on an untrusted server. The server performs the evaluation and returns a certificate, and the client can efficiently check that the evaluation is correct using some pre-computed keys. Our protocols support two important features: the polynomial itself can be encrypted on the server, and it can be updated dynamically by changing individual coefficients cheaply, without redoing the entire setup. As an important application, we show how these new techniques can be used to instantiate a Dynamic Proof of Retrievability (DPoR) for arbitrary outsourced data storage that achieves low server storage size and audit complexity. Our methods rely only on linearly homomorphic encryption and pairings, and preliminary timing results indicate reasonable performance for polynomials with millions of coefficients, as well as efficient DPoR for databases of, for instance, 1 TB in size.
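The toy snippet below contains no cryptography; it only illustrates the arithmetic behind cheap single-coefficient updates, which the protocol exploits (in its encrypted, verifiable setting) to avoid redoing the setup:

```python
# If p(x) = sum_i c_i x^i and only c_j changes to c_j', then
# p'(x) = p(x) + (c_j' - c_j) * x^j, so a stored evaluation (or its
# encrypted/committed analogue) can be patched without touching the
# remaining coefficients.
coeffs = [3, 1, 4, 1, 5, 9, 2, 6]        # p(x) with 8 coefficients
x = 7

def evaluate(cs, x):
    # Horner's rule: O(len(cs)) multiplications and additions.
    acc = 0
    for c in reversed(cs):
        acc = acc * x + c
    return acc

old_value = evaluate(coeffs, x)

j, new_cj = 5, 11                        # update a single coefficient
delta = (new_cj - coeffs[j]) * x ** j    # only x**j and the coefficient difference are needed
coeffs[j] = new_cj

assert evaluate(coeffs, x) == old_value + delta
```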
Deep learning algorithms are widely used in fields such as computer vision and natural language processing, but they are vulnerable to adversarial attacks because the large number of nonlinear functions and parameters inside them makes them hard to interpret. In this paper, we propose a neural network adversarial attack method based on an improved genetic algorithm. The improved genetic algorithm refines the mutation and crossover steps of the original genetic optimization algorithm, which greatly improves iteration efficiency and shortens running time. The method does not need the internal structure or parameters of the neural network model; it can obtain high-confidence adversarial samples in a short time using only the network's classification and confidence outputs. The experimental results show that the method is widely applicable across models and highly efficient, and it provides a new perspective on adversarial attacks.
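A hedged sketch of the general scheme follows: a black-box genetic attack that queries only class probabilities (here `predict_proba`, a stand-in for any classifier) and evolves bounded perturbations. It illustrates the idea, not the paper's specific improved mutation and crossover operators.

```python
import numpy as np

def genetic_attack(predict_proba, image, true_label, pop_size=20,
                   generations=100, eps=0.05, mut_rate=0.1, rng=None):
    rng = rng or np.random.default_rng(0)
    # Initial population: random perturbations bounded by eps.
    pop = rng.uniform(-eps, eps, size=(pop_size,) + image.shape)

    def fitness(delta):
        probs = predict_proba(np.clip(image + delta, 0.0, 1.0))
        return -probs[true_label]          # lower true-class confidence is better

    for _ in range(generations):
        scores = np.array([fitness(d) for d in pop])
        order = np.argsort(scores)[::-1]   # best candidates first
        parents = pop[order[: pop_size // 2]]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            mask = rng.random(image.shape) < 0.5         # uniform crossover
            child = np.where(mask, a, b)
            mutate = rng.random(image.shape) < mut_rate  # random mutation
            child = np.where(mutate, rng.uniform(-eps, eps, image.shape), child)
            children.append(child)
        pop = np.concatenate([parents, np.array(children)])
    best = pop[np.argmax([fitness(d) for d in pop])]
    return np.clip(image + best, 0.0, 1.0)
```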
We present new software, HYPPO, that enables automatic tuning of hyperparameters of various deep learning (DL) models. Unlike other hyperparameter optimization (HPO) methods, HYPPO uses adaptive surrogate models and directly accounts for uncertainty in model predictions to find accurate and reliable models that make robust predictions. Using asynchronous nested parallelism, we are able to significantly alleviate the computational burden of training complex architectures and quantifying the uncertainty. HYPPO is implemented in Python and can be used with both the TensorFlow and PyTorch libraries. We demonstrate various software features on time-series prediction and image classification problems as well as a scientific application in computed tomography image reconstruction. Finally, we show that (1) we can reduce by an order of magnitude the number of evaluations necessary to find the optimal region in the hyperparameter space and (2) we can reduce by two orders of magnitude the time needed for such an HPO process to complete.
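The sketch below is not HYPPO's API; it only illustrates the generic idea of surrogate-based HPO with an uncertainty-aware acquisition (a lower confidence bound), where the surrogate's predictive uncertainty guides which hyperparameters to evaluate next:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def surrogate_hpo(objective, bounds, n_init=5, n_iter=20, kappa=2.0, seed=0):
    rng = np.random.default_rng(seed)
    dim = len(bounds)
    lo, hi = np.array(bounds).T
    X = rng.uniform(lo, hi, size=(n_init, dim))        # initial random designs
    y = np.array([objective(x) for x in X])

    for _ in range(n_iter):
        gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
        cand = rng.uniform(lo, hi, size=(256, dim))     # candidate pool
        mu, sigma = gp.predict(cand, return_std=True)
        x_next = cand[np.argmin(mu - kappa * sigma)]    # favor low mean, high uncertainty
        X = np.vstack([X, x_next])
        y = np.append(y, objective(x_next))
    return X[np.argmin(y)], y.min()

# Example with a toy "validation loss" in place of a real training run.
best_x, best_y = surrogate_hpo(lambda h: (h[0] - 0.3) ** 2 + (h[1] - 0.7) ** 2,
                               bounds=[(0, 1), (0, 1)])
```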
Programmable data plane technology enables the systematic reconfiguration of the low-level processing steps applied to network packets and is a key driver in realizing the next generation of network services and applications. This survey presents recent trends and issues in the design and implementation of programmable network devices, focusing on prominent architectures, abstractions, algorithms, and applications proposed, debated, and realized in recent years. We elaborate on the trends that led to the emergence of this technology, highlight the most important pointers from the literature, cast different taxonomies for the field, and identify avenues for future research.
This document describes an attempt to develop a compiler-based approach for computations with symmetric tensors. Given a computation and the symmetries of its input tensors, we derive formulas for random access under a storage scheme that eliminates redundancies; construct intermediate representations to describe the loop structure; and translate this information, using the taco tensor algebra compiler, into code. While we achieve a framework for reasoning about a fairly general class of symmetric computations, the resulting code is not performant when the symmetries are misaligned.
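As a hedged illustration of "random access under a storage scheme that eliminates redundancies", the snippet handles only the simplest case, a symmetric matrix stored as its packed upper triangle; the framework described above treats general symmetric tensors, which this sketch does not attempt.

```python
def packed_index(i: int, j: int, n: int) -> int:
    """Offset of A[i, j] in the packed upper-triangular storage of an
    n x n symmetric matrix (A[i, j] == A[j, i]; only i <= j is stored)."""
    if i > j:
        i, j = j, i                      # exploit the symmetry
    # Entries before row i: n + (n-1) + ... + (n-i+1) = i*n - i*(i-1)//2
    return i * n - i * (i - 1) // 2 + (j - i)

n = 4
packed = [None] * (n * (n + 1) // 2)
for i in range(n):
    for j in range(i, n):
        packed[packed_index(i, j, n)] = (i, j)

assert packed_index(2, 1, n) == packed_index(1, 2, n)   # symmetric access
assert all(entry is not None for entry in packed)       # no wasted slots
```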
Inverse rendering seeks to estimate scene characteristics from a set of data images. The dominant approach is based on differentiable rendering using Monte Carlo sampling. Such algorithms usually rely on a forward model and use an iterative gradient method that requires sampling millions of light paths per iteration. This paper presents an efficient framework that speeds up existing inverse rendering algorithms. This is achieved by tailoring the iterative process of inverse rendering specifically to a GPU architecture. To this end, we introduce two interleaved steps: Path Sorting and Path Recycling. Path Sorting allows the GPU to process light paths of the same length together. Path Recycling allows the algorithm to reuse light paths from previous iterations to better exploit the information they encode. Together, these steps significantly speed up gradient optimization. In this paper, we give the theoretical background for Path Recycling. We demonstrate its efficiency for volumetric scattering tomography and reflectometry (surface reflections).
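The following schematic (CPU-side NumPy, not the paper's GPU kernels) illustrates only the Path Sorting step: paths are grouped by their number of scattering events so that each batch contains equal-length paths and threads do the same amount of work; the path lengths here are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
num_paths = 10_000
path_lengths = rng.geometric(p=0.3, size=num_paths)      # stand-in path lengths

order = np.argsort(path_lengths, kind="stable")           # Path Sorting
sorted_lengths = path_lengths[order]

# Cut the sorted paths into batches with a single length per batch.
batches, start = [], 0
while start < num_paths:
    length = sorted_lengths[start]
    end = start + np.searchsorted(sorted_lengths[start:], length, side="right")
    batches.append(order[start:end])                       # indices of equal-length paths
    start = end

assert all(len(set(path_lengths[b])) == 1 for b in batches)
```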
Normalization is known to help the optimization of deep neural networks. Curiously, different architectures require specialized normalization methods. In this paper, we study which normalization is effective for Graph Neural Networks (GNNs). First, we adapt and evaluate existing methods from other domains to GNNs and find that faster convergence is achieved with InstanceNorm than with BatchNorm or LayerNorm. We explain this by showing that InstanceNorm serves as a preconditioner for GNNs, whereas this preconditioning effect is weaker for BatchNorm due to the heavy batch noise in graph datasets. Second, we show that the shift operation in InstanceNorm degrades the expressiveness of GNNs on highly regular graphs. We address this issue by proposing GraphNorm with a learnable shift. Empirically, GNNs with GraphNorm converge faster than GNNs using other normalization methods. GraphNorm also improves the generalization of GNNs, achieving better performance on graph classification benchmarks.
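A hedged PyTorch sketch of the idea described above: InstanceNorm-style per-graph normalization with a learnable shift parameter (`alpha` below) that controls how much of the mean is subtracted. Initialization details and how the module is wired into a specific GNN layer follow the paper, not this sketch.

```python
import torch
import torch.nn as nn

class GraphNorm(nn.Module):
    def __init__(self, num_features: int, eps: float = 1e-5):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(num_features))   # scale (gamma)
        self.bias = nn.Parameter(torch.zeros(num_features))    # shift (beta)
        self.alpha = nn.Parameter(torch.ones(num_features))    # learnable mean shift
        self.eps = eps

    def forward(self, x: torch.Tensor, batch: torch.Tensor) -> torch.Tensor:
        # x: [num_nodes, num_features]; batch[i] = graph id of node i.
        num_graphs = int(batch.max()) + 1
        counts = torch.bincount(batch, minlength=num_graphs).clamp(min=1)
        counts = counts.unsqueeze(-1).to(x.dtype)

        mean = torch.zeros(num_graphs, x.size(1), dtype=x.dtype, device=x.device)
        mean.index_add_(0, batch, x)
        mean = mean / counts

        shifted = x - self.alpha * mean[batch]                  # learnable shift
        var = torch.zeros(num_graphs, x.size(1), dtype=x.dtype, device=x.device)
        var.index_add_(0, batch, shifted.pow(2))
        std = (var / counts + self.eps).sqrt()

        return self.weight * shifted / std[batch] + self.bias

# Example: 5 nodes split across 2 graphs.
x = torch.randn(5, 8)
batch = torch.tensor([0, 0, 0, 1, 1])
out = GraphNorm(8)(x, batch)
```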
Quantum hardware and quantum-inspired algorithms are becoming increasingly popular for combinatorial optimization. However, these algorithms may require careful hyperparameter tuning for each problem instance. We use a reinforcement learning agent in conjunction with a quantum-inspired algorithm to solve the Ising energy minimization problem, which is equivalent to the Maximum Cut problem. The agent controls the algorithm by tuning one of its parameters with the goal of improving recently seen solutions. We propose a new Rescaled Ranked Reward (R3) method that enables a stable single-player version of self-play training and helps the agent escape local optima. The training on any problem instance can be accelerated by applying transfer learning from an agent trained on randomly generated problems. Our approach allows sampling high-quality solutions to the Ising problem with high probability and outperforms both baseline heuristics and a black-box hyperparameter optimization approach.
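A rough sketch of a ranked-reward signal computed against recently seen solutions, in the spirit of the R3 idea above: the agent is rewarded for beating a percentile of its own recent Ising energies rather than for the raw energy. The exact rescaling used by the paper's R3 method is not reproduced here.

```python
from collections import deque
import numpy as np

class RankedReward:
    def __init__(self, percentile: float = 75.0, window: int = 100):
        self.recent = deque(maxlen=window)   # recent Ising energies (lower is better)
        self.percentile = percentile

    def __call__(self, energy: float) -> float:
        if not self.recent:
            self.recent.append(energy)
            return 0.0
        # Threshold such that beating it means being among the best
        # `percentile` percent of recently seen energies.
        threshold = np.percentile(self.recent, 100.0 - self.percentile)
        self.recent.append(energy)
        if energy < threshold:
            return 1.0
        if energy > threshold:
            return -1.0
        return 0.0                           # tie with the threshold
```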
Deep neural network models used for medical image segmentation are large because they are trained on high-resolution three-dimensional (3D) images. Graphics processing units (GPUs) are widely used to accelerate the training. However, GPU memory is not large enough to train such models. A popular way to tackle this problem is the patch-based method, which divides a large image into small patches and trains the models on these small patches. However, this method degrades segmentation quality if a target object spans multiple patches. In this paper, we propose a novel approach for 3D medical image segmentation that utilizes data swapping, which swaps intermediate data out of GPU memory into CPU memory to enlarge the effective GPU memory size, enabling training on high-resolution 3D medical images without patching. We carefully tuned the parameters of the data-swapping method to obtain the best training performance for 3D U-Net, a widely used deep neural network model for medical image segmentation. We applied our tuning to train 3D U-Net with full-size images of 192 x 192 x 192 voxels in a brain tumor dataset. As a result, communication overhead, the most important issue, was reduced by 17.1%. Compared with the patch-based method using patches of 128 x 128 x 128 voxels, our training on full-size images improved the mean Dice score by 4.48% and 5.32% for detecting the whole-tumor and tumor-core sub-regions, respectively. The total training time was reduced from 164 hours to 47 hours, a 3.53-fold speedup.
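To show the data-swapping concept (not the paper's tuned implementation, framework, or parameters), the snippet uses a generic PyTorch mechanism, torch.autograd.graph.save_on_cpu, which offloads activations saved for the backward pass to CPU memory and brings them back when gradients are computed; on a GPU this trades memory for transfer overhead, which is exactly what the paper's tuning targets.

```python
import torch
import torch.nn as nn

model = nn.Sequential(                       # small stand-in for a 3D U-Net
    nn.Conv3d(1, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv3d(8, 1, kernel_size=3, padding=1),
)

x = torch.randn(1, 1, 32, 32, 32, requires_grad=True)   # small stand-in volume

with torch.autograd.graph.save_on_cpu(pin_memory=False):
    loss = model(x).pow(2).mean()            # saved activations are kept in CPU memory
loss.backward()                              # they are moved back for gradient computation
```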
Adversarial attacks on image classification systems present challenges to convolutional networks and opportunities for understanding them. This study suggests that adversarial perturbations on images lead to noise in the features constructed by these networks. Motivated by this observation, we develop new network architectures that increase adversarial robustness by performing feature denoising. Specifically, our networks contain blocks that denoise the features using non-local means or other filters; the entire networks are trained end-to-end. When combined with adversarial training, our feature-denoising networks substantially improve the state of the art in adversarial robustness in both white-box and black-box attack settings. On ImageNet, under 10-iteration PGD white-box attacks where prior art achieves 27.9% accuracy, our method achieves 55.7%; even under extreme 2000-iteration PGD white-box attacks, our method retains 42.6% accuracy. A network based on our method was ranked first in the Competition on Adversarial Attacks and Defenses (CAAD) 2018, achieving 50.6% classification accuracy on a secret, ImageNet-like test dataset against 48 unknown attackers and surpassing the runner-up approach by ~10%. Code and models will be made publicly available.
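A hedged sketch of a feature-denoising block built around non-local means: each spatial feature is replaced by a weighted average of all features (dot-product affinity), followed by a 1x1 convolution and a residual connection so the block can be dropped into an existing network. The exact affinity functions and block placement follow the paper, not this sketch.

```python
import torch
import torch.nn as nn

class NonLocalDenoise(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        flat = x.view(n, c, h * w)                         # [N, C, HW]
        affinity = torch.bmm(flat.transpose(1, 2), flat)   # [N, HW, HW] dot products
        weights = torch.softmax(affinity, dim=-1)          # normalize per query location
        denoised = torch.bmm(flat, weights.transpose(1, 2)).view(n, c, h, w)
        return x + self.proj(denoised)                     # residual connection

# Example: denoise an 8-channel feature map.
features = torch.randn(2, 8, 16, 16)
out = NonLocalDenoise(8)(features)
```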