
Quantum Neural Network (QNN) is a promising application towards quantum advantage on near-term quantum hardware. However, due to large quantum noise (errors), the performance of QNN models degrades severely on real quantum devices. For example, the accuracy gap between noise-free simulation and noisy results on IBMQ-Yorktown for MNIST-4 classification is over 60%. Existing noise mitigation methods are general ones that do not leverage the unique characteristics of QNNs and are only applicable to inference; on the other hand, existing QNN work does not consider the noise effect. To this end, we present RoQNN, a QNN-specific framework that performs noise-aware optimizations in both the training and inference stages to improve robustness. We analytically deduce and experimentally observe that the effect of quantum noise on QNN measurement outcomes is a linear map of the noise-free outcomes with a scaling and a shift factor. Motivated by that, we propose post-measurement normalization to mitigate the feature distribution differences between noise-free and noisy scenarios. Furthermore, to improve robustness against noise, we propose noise injection into the training process by inserting quantum error gates into the QNN according to realistic noise models of quantum hardware. Finally, post-measurement quantization is introduced to quantize the measurement outcomes to discrete values, achieving a denoising effect. Extensive experiments on 8 classification tasks using 6 quantum devices demonstrate that RoQNN improves accuracy by up to 43%, and achieves over 94% 2-class, 80% 4-class, and 34% 10-class MNIST classification accuracy measured on real quantum computers. We also open-source our PyTorch library for construction and noise-aware training of QNNs at //github.com/mit-han-lab/pytorch-quantum .
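
As a concrete illustration of the scaling-and-shift observation, a minimal PyTorch-style sketch of post-measurement normalization is given below. The tensor shapes, the normalization axis, the noise factors, and the function name are illustrative assumptions and are not taken from the pytorch-quantum API.

```python
import torch

def post_measurement_normalize(meas: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Normalize measurement outcomes across the batch for each measured qubit.

    If noise acts approximately as outcome_noisy = a * outcome_clean + b,
    then standardizing each measured feature (zero mean, unit variance over
    the batch) removes the scaling a and shift b, so the feature distribution
    seen by downstream processing matches the noise-free case.
    """
    mean = meas.mean(dim=0, keepdim=True)   # per-qubit mean over the batch
    std = meas.std(dim=0, keepdim=True)     # per-qubit std over the batch
    return (meas - mean) / (std + eps)

# Toy usage: simulate the hypothetical scaling-and-shift effect of noise.
clean = torch.randn(32, 4)                  # batch of 32 samples, 4 measured qubits
noisy = 0.6 * clean - 0.1                   # illustrative scaling and shift from noise
print(torch.allclose(post_measurement_normalize(noisy),
                     post_measurement_normalize(clean), atol=1e-5))  # True
```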

Related Content

Deep neural networks unlocked a vast range of new applications by solving tasks many of which were previously deemed reserved to higher human intelligence. One of the developments enabling this success was a boost in computing power provided by special-purpose hardware, such as graphic or tensor processing units. However, these do not leverage fundamental features of neural networks such as parallelism and analog state variables. Instead, they emulate neural networks with binary computing, which results in unsustainable energy consumption and comparatively low speed. Fully parallel and analog hardware promises to overcome these challenges, yet the impact of analog neuron noise and its propagation, i.e. accumulation, threatens to render such approaches ineffective. Here, we determine for the first time the propagation of noise in deep neural networks comprising noisy nonlinear neurons in trained fully connected layers. We study additive and multiplicative as well as correlated and uncorrelated noise, and develop analytical methods that predict the noise level in any layer of symmetric deep neural networks or of deep neural networks trained with backpropagation. We find that noise accumulation is generally bounded, and adding additional network layers does not worsen the signal-to-noise ratio beyond a limit. Most importantly, noise accumulation can be suppressed entirely when the neuron activation functions have a slope smaller than unity. We thereby develop a framework for noise in fully connected deep neural networks implemented in analog systems, and identify criteria that allow engineers to design noise-resilient novel neural network hardware.
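
The bounded-accumulation claim can be illustrated with a small NumPy simulation. The layer widths, noise level, and activation functions below are illustrative assumptions rather than values from the paper; the noise model is simply additive, uncorrelated noise injected at every neuron.

```python
import numpy as np

rng = np.random.default_rng(0)

def noise_accumulation(weights, act, noise_std=0.05, n_samples=256):
    """Track the std of the accumulated error (noisy minus noise-free path)
    when additive, uncorrelated noise is injected at every noisy neuron."""
    n = weights[0].shape[1]
    clean = rng.standard_normal((n, n_samples))
    noisy = clean.copy()
    err_std = []
    for W in weights:
        clean = act(W @ clean)
        noisy = act(W @ noisy) + noise_std * rng.standard_normal((W.shape[0], n_samples))
        err_std.append(np.std(noisy - clean))
    return err_std

n, depth = 64, 20
# Random weights scaled so each linear map roughly preserves signal magnitude.
weights = [rng.standard_normal((n, n)) / np.sqrt(n) for _ in range(depth)]

bounded   = noise_accumulation(weights, lambda z: 0.5 * np.tanh(z))  # activation slope <= 0.5
unbounded = noise_accumulation(weights, lambda z: z)                 # slope exactly 1

# With slope < 1 the error std plateaus; with slope = 1 it keeps growing with depth.
print("error std, slope<1:", [round(v, 3) for v in bounded[::5]])
print("error std, slope=1:", [round(v, 3) for v in unbounded[::5]])
```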

The presence of mislabeled observations in data is a notoriously challenging problem in statistics and machine learning, associated with poor generalization properties for both traditional classifiers and, perhaps even more so, flexible classifiers such as neural networks. Here we propose a novel double regularization of the neural network training loss that combines a penalty on the complexity of the classification model with an optimal reweighting of training observations. The combined penalties result in improved generalization properties and strong robustness against overfitting across different settings of mislabeled training data, as well as against variation in initial parameter values during training. We provide a theoretical justification for the proposed method, derived for the simple case of logistic regression. We demonstrate the double regularization model, here denoted DRFit, for neural network classification of (i) MNIST and (ii) CIFAR-10, in both cases with simulated mislabeling. We also show that DRFit identifies mislabeled data points with very good precision. This provides strong support for DRFit as a practical off-the-shelf classifier, since, without any sacrifice in performance, we obtain a classifier that simultaneously reduces overfitting to mislabeled data and gives an accurate measure of the trustworthiness of the labels.
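
A minimal sketch of one plausible instantiation of such a doubly regularized loss is shown below, with trainable per-example weights and an L2 complexity penalty. The specific penalties and hyperparameters are illustrative assumptions, not the exact DRFit formulation.

```python
import torch
import torch.nn.functional as F

def double_regularized_loss(model, logits, targets, sample_logits,
                            lam_complexity=1e-4, lam_weights=1e-2):
    """Reweight per-example losses by trainable sample weights, with
    (i) an L2 penalty on model parameters (complexity) and
    (ii) a penalty discouraging the sample weights from collapsing to zero."""
    per_example = F.cross_entropy(logits, targets, reduction="none")
    w = torch.sigmoid(sample_logits)                   # trainable weight in (0, 1) per example
    data_term = (w * per_example).mean()
    complexity = sum((p ** 2).sum() for p in model.parameters())
    # Weights stay near 1 unless an example is consistently hard to fit,
    # which is how mislabeled points end up down-weighted.
    weight_term = ((1.0 - w) ** 2).mean()
    return data_term + lam_complexity * complexity + lam_weights * weight_term

# Toy usage on random data with a linear model (sizes are illustrative).
model = torch.nn.Linear(10, 3)
x, y = torch.randn(64, 10), torch.randint(0, 3, (64,))
sample_logits = torch.zeros(64, requires_grad=True)    # one trainable weight per training point
loss = double_regularized_loss(model, model(x), y, sample_logits)
loss.backward()
```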

Graph neural networks (GNNs) have recently been shown to perform well on a variety of network-based tasks such as decentralized control and resource allocation, providing computationally efficient methods for problems that have traditionally been challenging in that regard. However, like many neural-network-based systems, GNNs are susceptible to shifts and perturbations of their inputs, which can include both node attributes and graph structure. In order to make them more useful for real-world applications, it is important to ensure their robustness post-deployment. Motivated by controlling the Lipschitz constant of GNN filters with respect to the node attributes, we propose to constrain the frequency response of the GNN's filter banks. We extend this formulation to the dynamic graph setting using a continuous frequency response constraint, and solve a relaxed variant of the problem via the scenario approach. This allows the same computationally efficient algorithm to be used on sampled constraints, which provides PAC-style guarantees on the stability of the GNN using results from scenario optimization. We also highlight an important connection between this setup and GNN stability to graph perturbations, and provide experimental results demonstrating the efficacy and broad applicability of our approach.
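
A minimal sketch of the sampled-constraint idea for a polynomial graph filter is given below. The polynomial parametrization, bound, and sampling range are illustrative assumptions, and the penalty-based relaxation here merely stands in for the scenario-optimization formulation, which carries PAC-style guarantees that a simple penalty does not.

```python
import torch

def sampled_frequency_penalty(h, lam_samples, bound=1.0):
    """Enforce |sum_k h_k * lam**k| <= bound only at sampled graph frequencies,
    via a hinge penalty that can be added to the training loss."""
    powers = torch.stack([lam_samples ** k for k in range(h.numel())], dim=1)  # (S, K)
    response = powers @ h                          # filter frequency response at the samples
    return torch.clamp(response.abs() - bound, min=0.0).pow(2).mean()

# Toy usage: K = 4 polynomial filter taps, frequencies sampled from an assumed spectral range.
h = torch.randn(4, requires_grad=True)
lam_samples = torch.rand(100) * 2.0                # e.g. graph eigenvalues in [0, 2]
penalty = sampled_frequency_penalty(h, lam_samples)
penalty.backward()
```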

Adversarial training is a popular method to robustify models against adversarial attacks. However, it exhibits much more severe overfitting than training on clean inputs. In this work, we investigate this phenomenon from the perspective of training instances, i.e., training input-target pairs. Based on a quantitative metric measuring instances' difficulty, we analyze the model's behavior on training instances of different difficulty levels. This lets us show that the decay in generalization performance of adversarial training is a result of the model's attempt to fit hard adversarial instances. We theoretically verify our observations for both linear and general nonlinear models, proving that models trained on hard instances have worse generalization performance than ones trained on easy instances. Furthermore, we prove that the difference in the generalization gap between models trained by instances of different difficulty levels increases with the size of the adversarial budget. Finally, we conduct case studies on methods mitigating adversarial overfitting in several scenarios. Our analysis shows that methods successfully mitigating adversarial overfitting all avoid fitting hard adversarial instances, while ones fitting hard adversarial instances do not achieve true robustness.

Error management in the quantum Internet requires stateful and stochastic processing across multiple nodes, which places a significant burden on the network. In the history of the current Internet, the end-to-end principle was devised for error management, simplifying the work inside the network and contributing significantly to the Internet's scalability. In this paper, we propose to bring the end-to-end principle into the error management of the quantum Internet in order to improve its communication resource utilization efficiency. Simulation results show that error management based on the end-to-end principle and locality can be more resource-efficient than other settings. In addition, when end-to-end error management is used and the error probability of qubits in the end nodes is sufficiently low, the network side can tolerate a higher error probability than the end nodes without causing problems, which reduces the load on the network. Our proposal contributes to improving the communication capacity and scalability of the quantum Internet, as well as the interoperability of quantum Autonomous Systems. In addition, existing studies on routing and other aspects of the quantum Internet may exclude error management from their scope due to its complexity; the results of this study lend validity to the assumptions made in such studies.

In many numerical simulations, stochastic gradient descent (SGD) type optimization methods perform very effectively in the training of deep neural networks (DNNs), but to this day it remains an open research problem to provide a mathematical convergence analysis that rigorously explains the success of SGD-type optimization methods in the training of DNNs. In this work we study SGD-type optimization methods in the training of fully-connected feedforward DNNs with rectified linear unit (ReLU) activation. We first establish general regularity properties for the risk functions and their generalized gradient functions appearing in the training of such DNNs and, thereafter, we investigate the plain vanilla SGD optimization method in the training of such DNNs under the assumption that the target function under consideration is a constant function. Specifically, under the assumptions that the learning rates (the step sizes of the SGD optimization method) are sufficiently small but not $L^1$-summable and that the target function is a constant function, we prove that the expectation of the risk of the considered SGD process converges to zero as the number of SGD steps increases to infinity.
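
Schematically, with $\Theta_n$ denoting the SGD iterate after $n$ steps, $\mathcal{R}$ the risk, and $\gamma_n$ the learning rates (notation assumed here, not taken from the paper), the result can be paraphrased as
$$ \gamma_n \in (0, c] \ \text{for a sufficiently small } c > 0 \quad \text{and} \quad \sum_{n=1}^{\infty} \gamma_n = \infty \quad \Longrightarrow \quad \lim_{n \to \infty} \mathbb{E}\big[\mathcal{R}(\Theta_n)\big] = 0 , $$
under the assumption that the target function is constant; the condition $\sum_n \gamma_n = \infty$ is the "not $L^1$-summable" requirement on the step sizes.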

We study the impact of deep neural networks on text classification. Our focus is on training deep neural networks with proper weight initialization and greedy layer-wise pretraining. Results are compared with 1-layer neural networks and Support Vector Machines. We work with a dataset of labeled messages from the Twitter microblogging service and aim to predict weather conditions. A feature extraction procedure specific to the task is proposed, which applies dimensionality reduction using Latent Semantic Analysis. Our results show that neural networks outperform Support Vector Machines with Gaussian kernels, and that introducing additional hidden layers with nonlinearities yields performance gains. The impact of using Nesterov's Accelerated Gradient in backpropagation is also studied. We conclude that deep neural networks are a reasonable approach for text classification and propose further ideas to improve performance.
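
A minimal sketch of such an LSA-based pipeline, using scikit-learn, is shown below. The toy messages, the number of SVD components, and the classifier configuration are illustrative assumptions and not the setup used in the paper.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline

# Hypothetical data: short messages with weather-related labels.
texts = ["cold and windy tonight", "sunny all afternoon", "heavy rain expected", "clear skies"]
labels = ["cold", "sunny", "rain", "sunny"]

# TF-IDF followed by truncated SVD is the usual way to realize Latent Semantic Analysis,
# reducing the sparse term space to a small dense feature space for the neural network.
pipeline = make_pipeline(
    TfidfVectorizer(),
    TruncatedSVD(n_components=2),          # illustrative; real data needs far more components
    MLPClassifier(hidden_layer_sizes=(32, 32), solver="sgd", nesterovs_momentum=True,
                  max_iter=1000, random_state=0),
)
pipeline.fit(texts, labels)
print(pipeline.predict(["rain and wind tomorrow"]))
```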

Compared with visible-spectrum object tracking, thermal infrared (TIR) object tracking can track an arbitrary target in total darkness since it is not influenced by illumination variations. However, several unwanted attributes constrain the potential of TIR tracking, such as the absence of visual color patterns and low resolution. Recently, the structured output support vector machine (SOSVM) and the discriminative correlation filter (DCF) have each been successfully applied to visible object tracking. Motivated by these, in this paper we propose a large margin structured convolution operator (LMSCO) to achieve efficient TIR object tracking. To improve tracking performance, we employ spatial regularization and implicit interpolation to obtain continuous deep feature maps of the TIR targets, including deep appearance features and deep motion features. Finally, a collaborative optimization strategy is exploited to update the operators. Our approach not only inherits the strong discriminative capability of SOSVM but also achieves accurate and robust tracking with higher-dimensional features and denser samples. To the best of our knowledge, we are the first to combine the advantages of DCF and SOSVM for TIR object tracking. Comprehensive evaluations on two thermal infrared tracking benchmarks, i.e. VOT-TIR2015 and VOT-TIR2016, demonstrate that our LMSCO tracker achieves impressive results and outperforms most state-of-the-art trackers in terms of accuracy and robustness at a sufficient frame rate.

Quantum machine learning is expected to be one of the first potential general-purpose applications of near-term quantum devices. A major recent breakthrough in classical machine learning is the notion of generative adversarial training, where the gradients of a discriminator model are used to train a separate generative model. In this work and a companion paper, we extend adversarial training to the quantum domain and show how to construct generative adversarial networks using quantum circuits. Furthermore, we also show how to compute gradients -- a key element in generative adversarial network training -- using another quantum circuit. We give an example of a simple practical circuit ansatz to parametrize quantum machine learning models and perform a simple numerical experiment to demonstrate that quantum generative adversarial networks can be trained successfully.
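
One standard way to obtain gradients of a parametrized quantum circuit from additional circuit evaluations is the parameter-shift rule; the single-qubit toy example below illustrates the idea numerically and is not the specific gradient-circuit construction of the paper.

```python
import numpy as np

def expectation(theta):
    """<psi(theta)| Z |psi(theta)> for |psi(theta)> = RY(theta)|0>, i.e. cos(theta)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    state = np.array([c, s])                 # RY(theta) applied to |0>
    z = np.array([[1, 0], [0, -1]])
    return state @ z @ state

def parameter_shift_grad(theta, shift=np.pi / 2):
    """Exact gradient obtained from two extra circuit evaluations at shifted parameters."""
    return 0.5 * (expectation(theta + shift) - expectation(theta - shift))

theta = 0.7
print(parameter_shift_grad(theta), -np.sin(theta))   # the two values agree
```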

The recent popularity of deep neural networks (DNNs) has generated a lot of research interest in performing DNN-related computation efficiently. However, the primary focus is usually very narrow and limited to (i) inference, i.e. how to efficiently execute already trained models, and (ii) image classification networks as the primary benchmark for evaluation. Our primary goal in this work is to break this myopic view by (i) proposing a new benchmark for DNN training, called TBD (short for Training Benchmark for DNNs), that uses a representative set of DNN models covering a wide range of machine learning applications: image classification, machine translation, speech recognition, object detection, adversarial networks, and reinforcement learning; and (ii) performing an extensive performance analysis of training these different applications on three major deep learning frameworks (TensorFlow, MXNet, CNTK) across different hardware configurations (single-GPU, multi-GPU, and multi-machine). TBD currently covers six major application domains and eight different state-of-the-art models. We present a new toolchain for performance analysis of these models that combines the targeted use of existing performance analysis tools, careful selection of new and existing metrics and methodologies to analyze the results, and utilization of domain-specific characteristics of DNN training. We also build a new set of tools for memory profiling in all three major frameworks -- much-needed tools that can finally shed some light on precisely how much memory is consumed by different data structures (weights, activations, gradients, workspace) in DNN training. Using our tools and methodologies, we make several important observations and recommendations on where future research and optimization of DNN training should be focused.
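
As a rough illustration of what per-structure memory accounting looks like, the PyTorch sketch below estimates the memory held by weights, gradients, and forward activations; the model, input size, and bookkeeping are illustrative assumptions and unrelated to the TBD toolchain itself.

```python
import torch
import torch.nn as nn

def profile_memory(model, example_input):
    """Rough training-memory breakdown: parameters, gradients, and forward activations.
    Activations are measured with forward hooks; optimizer state and workspace are ignored."""
    param_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
    grad_bytes = param_bytes                      # one gradient tensor per parameter
    activation_bytes = 0

    def hook(_module, _inputs, output):
        nonlocal activation_bytes
        if isinstance(output, torch.Tensor):
            activation_bytes += output.numel() * output.element_size()

    # Hook only leaf modules to avoid counting the same activation twice.
    handles = [m.register_forward_hook(hook) for m in model.modules() if not list(m.children())]
    model(example_input)
    for h in handles:
        h.remove()
    return {"weights_MB": param_bytes / 2**20,
            "gradients_MB": grad_bytes / 2**20,
            "activations_MB": activation_bytes / 2**20}

# Toy usage on a small fully connected model and a batch of 64 inputs.
net = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10))
print(profile_memory(net, torch.randn(64, 1024)))
```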
