
As spiking-based deep learning inference applications become more common in embedded systems, these systems increasingly integrate neuromorphic accelerators such as $\mu$Brain to improve energy efficiency. We propose a $\mu$Brain-based scalable many-core neuromorphic hardware design to accelerate the computations of spiking deep convolutional neural networks (SDCNNs). To increase energy efficiency, cores are designed to be heterogeneous in their neuron and synapse capacity (big cores have higher capacity than little ones), and they are interconnected using a parallel segmented bus interconnect, which leads to lower latency and energy compared to a traditional mesh-based Network-on-Chip (NoC). We propose a system software framework called SentryOS to map SDCNN inference applications onto the proposed design. SentryOS consists of a compiler and a run-time manager. The compiler compiles an SDCNN application into sub-networks by exploiting the internal architecture of the big and little $\mu$Brain cores. The run-time manager schedules these sub-networks onto cores and pipelines their execution to improve throughput. We evaluate the proposed big-little many-core neuromorphic design and the system software framework with five commonly used SDCNN inference applications and show that the proposed solution reduces energy (between 37% and 98%), reduces latency (between 9% and 25%), and increases application throughput (between 20% and 36%). We also show that SentryOS can be easily extended to other spiking neuromorphic accelerators.
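To make the mapping step concrete, below is a minimal sketch of capacity-aware placement of sub-networks onto a heterogeneous (big/little) core array. The greedy first-fit heuristic, the core capacities, and the sub-network sizes are hypothetical illustrations, not the actual SentryOS compiler or scheduler.

```python
# Hypothetical sketch of capacity-aware sub-network placement on a
# heterogeneous (big/little) core array; not the actual SentryOS scheduler.
from dataclasses import dataclass, field

@dataclass
class Core:
    name: str
    neuron_capacity: int
    used: int = 0
    subnets: list = field(default_factory=list)

def place_subnetworks(subnets, cores):
    """Greedy first-fit: try little cores first to save energy,
    fall back to big cores when a sub-network does not fit."""
    ordered = sorted(cores, key=lambda c: c.neuron_capacity)  # little first
    placement = {}
    for name, neurons in subnets:
        for core in ordered:
            if core.used + neurons <= core.neuron_capacity:
                core.used += neurons
                core.subnets.append(name)
                placement[name] = core.name
                break
        else:
            raise ValueError(f"{name} ({neurons} neurons) fits on no core")
    return placement

cores = [Core("big0", 2048), Core("little0", 512), Core("little1", 512)]
subnets = [("conv1", 400), ("conv2", 480), ("fc", 1500)]
print(place_subnetworks(subnets, cores))
```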

Related Content

Since the beginning of information processing by electronic components, the nervous system has served as a metaphor for the organization of computational primitives. Brain-inspired computing today encompasses a class of approaches ranging from using novel nano-devices for computation to research into large-scale neuromorphic architectures, such as TrueNorth, SpiNNaker, BrainScaleS, Tianjic, and Loihi. While implementation details differ, spiking neural networks -- sometimes referred to as the third generation of neural networks -- are the common abstraction used to model computation with such systems. Here we describe the second generation of the BrainScaleS neuromorphic architecture, emphasizing applications enabled by this architecture. It combines a custom analog accelerator core supporting the accelerated physical emulation of bio-inspired spiking neural network primitives with a tightly coupled digital processor and a digital event-routing network.

Objective. Insecure Direct Object Reference (IDOR), also known as Broken Object Level Authorization (BOLA), is one of the most critical types of access control vulnerabilities in modern applications. By exploiting it, an attacker can bypass authorization checks, leading to information leakage or account takeover. Our main research goal is to help application security architects optimize the security design and testing process by providing an algorithm and tool that automatically analyzes system API specifications and generates a list of possible vulnerabilities and attack vectors, ready to be used as security non-functional requirements. Method. We conducted a multivocal review of research and conference papers, bug bounty program reports, and other grey literature to outline patterns of attacks against IDOR vulnerabilities. The attacks are collected into groups, which are then analyzed for common attributes and for the features that define each group; a group is characterized by its endpoint properties and attack techniques. A mapping between group features and existing OpenAPI specifications is then performed to implement a tool for automatic discovery of potentially vulnerable endpoints. Results and practical relevance. In this work, we provide a systematization of IDOR/BOLA attack techniques based on a literature review and analysis of real cases, and derive IDOR/BOLA attack groups. We propose an approach to describing IDOR/BOLA attacks in terms of OpenAPI specification properties, and we develop an algorithm for detecting potential IDOR/BOLA vulnerabilities by processing OpenAPI specifications. We implemented the algorithm in Python and evaluated it. The results show that the algorithm is robust and can be used in practice to detect potential IDOR/BOLA vulnerabilities.
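As an illustration of spec-driven IDOR/BOLA triage, the sketch below flags OpenAPI endpoints whose path template exposes an object identifier and that declare no security requirement. The identifier heuristic, the file name `openapi.yaml`, and the overall flow are assumptions for illustration, not the paper's exact algorithm.

```python
# Illustrative sketch: flag OpenAPI endpoints whose path exposes an object
# identifier and that declare no security requirement. Mirrors the general
# idea of spec-driven IDOR/BOLA triage, not the paper's exact algorithm.
import re
import yaml  # assumes PyYAML is installed

ID_PARAM = re.compile(r"\{[^}]*(id|Id|ID|uuid|key)[^}]*\}")

def find_candidate_endpoints(spec_path):
    with open(spec_path) as f:
        spec = yaml.safe_load(f)
    findings = []
    for path, ops in spec.get("paths", {}).items():
        if not ID_PARAM.search(path):
            continue  # no object reference in the URL template
        for method, op in ops.items():
            if method.lower() not in {"get", "put", "patch", "delete", "post"}:
                continue
            # operation-level security overrides the global one if present
            security = op.get("security", spec.get("security", []))
            findings.append({
                "endpoint": f"{method.upper()} {path}",
                "has_security": bool(security),
            })
    return findings

if __name__ == "__main__":
    for finding in find_candidate_endpoints("openapi.yaml"):  # hypothetical spec file
        print(finding)
```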

Training deep learning (DL) models that do not fit into the memory of a single GPU is a vexing process, forcing users to procure multiple GPUs to adopt model-parallel execution. Unfortunately, sequential dependencies in neural architectures often block efficient multi-device training, leading to suboptimal performance. We present 'model spilling', a technique aimed at models such as Transformers and CNNs that moves groups of layers, or shards, between DRAM and GPU memory, thus enabling arbitrarily large models to be trained even on just one GPU. We then present a set of novel techniques leveraging spilling to raise efficiency for multi-model training workloads such as model selection: a new hybrid of task- and model-parallelism, a new shard scheduling heuristic, and 'double buffering' to hide latency. We prototype our ideas in a system we call HYDRA to support seamless single-model and multi-model training of large DL models. Experiments with real benchmark workloads show that HYDRA is over 7x faster than regular model parallelism and over 50% faster than state-of-the-art industrial tools for pipeline parallelism.
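The core idea of moving layer shards between host DRAM and GPU memory can be sketched in a few lines of PyTorch, as below. This is only an illustration of spilling under the assumption of a simple sequential model; it omits HYDRA's double buffering, scheduling, and multi-model support.

```python
# Minimal sketch of layer-shard "spilling" between CPU DRAM and GPU memory,
# assuming PyTorch; illustrates the idea only, not HYDRA's implementation
# (no double buffering or multi-model scheduling here).
import torch
import torch.nn as nn

class SpilledSequential(nn.Module):
    def __init__(self, shards, device="cuda"):
        super().__init__()
        # each shard is an nn.Sequential kept in host memory until needed
        self.shards = nn.ModuleList(shards)
        self.device = device

    def forward(self, x):
        x = x.to(self.device)
        for shard in self.shards:
            shard.to(self.device)        # page the shard in
            x = shard(x)
            shard.to("cpu")              # page it back out to free GPU memory
        return x

shards = [nn.Sequential(nn.Linear(1024, 1024), nn.ReLU()) for _ in range(4)]
device = "cuda" if torch.cuda.is_available() else "cpu"
model = SpilledSequential(shards, device=device)
out = model(torch.randn(8, 1024))
```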

There are significant benefits to serving deep learning models from relational databases. First, features extracted from databases do not need to be transferred to any decoupled deep learning system for inference, so the system management overhead can be significantly reduced. Second, in a relational database, data management along the storage hierarchy is fully integrated with query processing, so model serving can continue even when the working set size exceeds the available memory. Applying model deduplication can greatly reduce the storage space, memory footprint, cache misses, and inference latency. However, existing data deduplication techniques are not applicable to deep learning model serving in relational databases: they consider neither the impact on model inference accuracy nor the mismatch between tensor blocks and database pages. This work proposes synergistic storage optimization techniques for duplication detection, page packing, and caching to enhance database systems for model serving. We implemented the proposed approach in netsDB, an object-oriented relational database. Evaluation results show that the proposed techniques significantly improve storage efficiency and reduce model inference latency, and that serving models from relational databases outperforms existing deep learning frameworks when the working set size exceeds the available memory.
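The deduplication idea can be illustrated with a small sketch that splits weight matrices into fixed-size blocks, hashes them, and stores each distinct block once. This is exact deduplication only; the paper additionally reasons about inference accuracy and database page packing, which this sketch does not attempt. The block size and tensor names are arbitrary choices for illustration.

```python
# Illustrative sketch of exact tensor-block deduplication: split each weight
# matrix into fixed-size blocks and keep one copy of identical blocks.
import hashlib
import numpy as np

BLOCK = 64  # block edge length (arbitrary choice for illustration)

def dedup_blocks(tensors):
    store, index = {}, []  # hash -> block data, list of (tensor, row, col, hash)
    for t_id, t in tensors.items():
        for i in range(0, t.shape[0], BLOCK):
            for j in range(0, t.shape[1], BLOCK):
                block = np.ascontiguousarray(t[i:i+BLOCK, j:j+BLOCK])
                h = hashlib.sha1(block.tobytes()).hexdigest()
                store.setdefault(h, block)
                index.append((t_id, i, j, h))
    return store, index

shared = np.ones((64, 64), dtype=np.float32)
w1 = np.block([[shared, np.zeros((64, 64), np.float32)]])
w2 = np.block([[shared, np.random.rand(64, 64).astype(np.float32)]])
store, index = dedup_blocks({"w1": w1, "w2": w2})
print(len(index), "block references,", len(store), "unique blocks stored")
```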

Computation offloading is indispensable for mobile edge computing (MEC). It uses edge resources to enable intensive computations and save energy for resource-constrained devices. Existing works generally impose strong assumptions on radio channels and network queue sizes. However, practical MEC systems are subject to various uncertainties that render these assumptions impractical. In this paper, we investigate the energy-efficient computation offloading problem by relaxing these common assumptions and considering the intrinsic uncertainties in the network. Specifically, we minimize the worst-case expected energy consumption of a local device when executing a time-critical application modeled as a directed acyclic graph. We employ extreme value theory to bound the occurrence probability of uncertain events. To solve the formulated problem, we develop an $\epsilon$-bounded approximation algorithm based on column generation. The proposed algorithm can efficiently identify a feasible solution whose cost is within a factor of $(1+\epsilon)$ of the optimal one. We implement the proposed scheme on an Android smartphone and conduct extensive experiments using a real-world application. The experimental results corroborate that considering the intrinsic uncertainties during computation offloading leads to lower energy consumption for the client device. The proposed computation offloading scheme also significantly outperforms other schemes in terms of energy saving.
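The $(1+\epsilon)$ guarantee in such schemes typically follows from comparing the best feasible solution against the lower bound maintained by column generation; a sketch of that standard termination argument (our reading of how such bounds are usually obtained, not necessarily the paper's exact proof) is given below.

```latex
% Standard termination argument for an \epsilon-bounded column-generation scheme.
% LB: lower bound from the restricted master LP relaxation; UB: best feasible
% solution found so far; OPT: the true optimum. Since LB <= OPT <= UB, stopping
% once UB <= (1+\epsilon) LB guarantees a (1+\epsilon)-approximate solution:
\[
  \mathrm{LB} \le \mathrm{OPT} \le \mathrm{UB}
  \quad\Longrightarrow\quad
  \mathrm{UB} \le (1+\epsilon)\,\mathrm{LB} \le (1+\epsilon)\,\mathrm{OPT}.
\]
```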

Training deep neural networks (DNNs) with meaningful differential privacy (DP) guarantees severely degrades model utility. In this paper, we demonstrate that the architecture of a DNN has a significant impact on model utility in the context of private deep learning, an effect largely unexplored in previous studies. To fill this gap, we propose the first framework that employs neural architecture search to automate model design for private deep learning, dubbed DPNAS. To integrate private learning with architecture search, we carefully design a novel search space and propose a DP-aware method for training candidate models. We empirically verify the effectiveness of the proposed framework. The searched model DPNASNet achieves state-of-the-art privacy/utility trade-offs, e.g., for a privacy budget of $(\epsilon, \delta)=(3, 1\times10^{-5})$, our model obtains test accuracy of $98.57\%$ on MNIST, $88.09\%$ on FashionMNIST, and $68.33\%$ on CIFAR-10. Furthermore, by studying the generated architectures, we provide several intriguing findings on designing private-learning-friendly DNNs, which can shed new light on model design for deep learning with differential privacy.
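For context, DP-aware training of candidate models typically builds on the standard DP-SGD recipe of per-example gradient clipping plus Gaussian noise; the NumPy sketch below shows one such update for clarity. The function name, hyperparameters, and plain-array gradients are illustrative assumptions, not the paper's training code.

```python
# Sketch of one DP-SGD update (per-example gradient clipping + Gaussian noise),
# the standard recipe that DP-aware candidate training builds on; written with
# plain NumPy gradients for clarity, not the paper's actual training code.
import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1, clip_norm=1.0, noise_mult=1.1,
                rng=np.random.default_rng(0)):
    clipped = []
    for g in per_example_grads:                      # g: gradient of one example
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    # Gaussian noise scaled to the clipping norm and the (averaged) batch size
    noise = rng.normal(0.0, noise_mult * clip_norm / len(per_example_grads),
                       size=mean_grad.shape)
    return params - lr * (mean_grad + noise)

params = np.zeros(4)
grads = [np.array([3.0, 0, 0, 0]), np.array([0, 0.5, 0, 0])]
print(dp_sgd_step(params, grads))
```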

The vast amount of data generated by networks of sensors, wearables, and Internet of Things (IoT) devices underscores the need for advanced modeling techniques that leverage the spatio-temporal structure of decentralized data, owing to the need for edge computation and licensing (data access) issues. While federated learning (FL) has emerged as a framework for model training without requiring direct data sharing and exchange, effectively modeling the complex spatio-temporal dependencies to improve forecasting capabilities remains an open problem. On the other hand, state-of-the-art spatio-temporal forecasting models assume unfettered access to the data, neglecting constraints on data sharing. To bridge this gap, we propose a federated spatio-temporal model -- Cross-Node Federated Graph Neural Network (CNFGNN) -- which explicitly encodes the underlying graph structure using a graph neural network (GNN)-based architecture under the constraint of cross-node federated learning, which requires that data in a network of nodes be generated locally on each node and remain decentralized. CNFGNN operates by disentangling the temporal dynamics modeling on devices and the spatial dynamics on the server, utilizing alternating optimization to reduce the communication cost and facilitate computation on edge devices. Experiments on the traffic flow forecasting task show that CNFGNN achieves the best forecasting performance in both transductive and inductive learning settings with no extra computation cost on edge devices, while incurring modest communication cost.
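To convey the device/server split, here is a highly simplified sketch of one alternating round in the spirit of cross-node federated spatio-temporal modeling: each node encodes its own series locally, the server mixes node embeddings over the graph, and only embeddings (never raw data) cross the network. The encoder stand-in, the toy graph, and the mixing step are assumptions for illustration and are not the CNFGNN architecture.

```python
# Simplified sketch: local temporal encoding on each node, graph-based mixing
# of embeddings on the server; only embeddings leave the devices.
import numpy as np

def local_encode(series, dim=8):
    # stand-in for an on-device temporal encoder (e.g., a recurrent model);
    # here just summary statistics projected to a fixed dimension
    feats = np.array([series.mean(), series.std(), series[-1]])
    proj = np.random.default_rng(42).normal(size=(3, dim))
    return feats @ proj

def server_graph_mix(embeddings, adj):
    # one normalized graph-convolution-style propagation step on the server
    deg = adj.sum(axis=1, keepdims=True) + 1e-12
    return (adj / deg) @ embeddings

series = [np.sin(np.linspace(0, 6, 50) + k) for k in range(3)]   # 3 nodes
H = np.stack([local_encode(s) for s in series])                  # local step
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)   # toy road graph
H_mixed = server_graph_mix(H, adj)                               # server step
print(H_mixed.shape)
```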

Deep learning is applied to energy markets to predict extreme loads observed in energy grids. Forecasting energy loads and prices is challenging due to sharp peaks and troughs that arise from supply and demand fluctuations under intraday system constraints. We propose deep spatio-temporal models and extreme value theory (EVT) to capture these effects and, in particular, the tail behavior of load spikes. Deep LSTM architectures with ReLU and $\tanh$ activation functions can model trends and temporal dependencies, while EVT captures highly volatile load spikes above a pre-specified threshold. To illustrate our methodology, we use hourly price and demand data from 4719 nodes of the PJM interconnection, and we construct a deep predictor. We show that DL-EVT outperforms traditional Fourier time series methods, both in- and out-of-sample, by capturing the observed nonlinearities in prices. Finally, we conclude with directions for future research.
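The EVT component can be illustrated with a standard peaks-over-threshold fit: model load exceedances above a chosen threshold with a generalized Pareto distribution. The synthetic load series, the quantile-based threshold, and the query level below are illustrative assumptions, not the paper's data or calibration.

```python
# Sketch of the EVT step: fit a generalized Pareto distribution to load
# exceedances above a threshold (peaks-over-threshold) and estimate a tail
# probability. Data and threshold choice are synthetic, for illustration only.
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(1)
load = rng.gamma(shape=2.0, scale=50.0, size=10_000)   # synthetic hourly load

threshold = np.quantile(load, 0.95)
excesses = load[load > threshold] - threshold

# fit the GPD to the excesses; floc=0 pins the location at the threshold
shape, loc, scale = genpareto.fit(excesses, floc=0)

# probability that the load exceeds some extreme level x:
# P(X > x) = P(X > u) * P(X - u > x - u | X > u)
x = threshold + 200.0
p_exceed = (excesses.size / load.size) * genpareto.sf(x - threshold, shape, loc, scale)
print(f"P(load > {x:.0f}) ~= {p_exceed:.4f}")
```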

Lane marking detection is an important element of road scene analysis for Advanced Driver Assistance Systems (ADAS). Limited by onboard computing power, it remains a challenge to reduce system complexity while maintaining high accuracy. In this paper, we propose a Lane Marking Detector (LMD) using a deep convolutional neural network to extract robust lane marking features. To improve performance while targeting lower complexity, dilated convolution is adopted, and a shallower and thinner structure is designed to decrease the computational cost. Moreover, we design post-processing algorithms that construct 3rd-order polynomial models to fit the curved lanes. Our system shows promising results on the captured road scenes.
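The polynomial post-processing step can be sketched in a few lines: fit a 3rd-order polynomial $x = f(y)$ to detected lane-marking points (fitting $x$ as a function of $y$ copes with near-vertical lanes in image coordinates). The synthetic points and the choice to fit $x(y)$ are illustrative assumptions, not the paper's exact post-processing.

```python
# Sketch of 3rd-order polynomial lane fitting on synthetic lane-marking points.
import numpy as np

ys = np.linspace(300, 700, 40)                        # image rows (pixels)
xs = 0.0005 * (ys - 500) ** 2 + 0.3 * ys + 20         # synthetic curved lane
xs += np.random.default_rng(0).normal(0, 2, ys.size)  # detection noise

coeffs = np.polyfit(ys, xs, deg=3)                    # [c3, c2, c1, c0]
lane = np.poly1d(coeffs)                              # x = f(y) model

y_query = 400.0
print(f"lane x-position at row {y_query:.0f}: {lane(y_query):.1f} px")
```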

Probabilistic topic models are popular unsupervised learning methods, including probabilistic latent semantic indexing (pLSI) and latent Dirichlet allocation (LDA). Currently, their training is implemented on general-purpose computers (GPCs), which are flexible in programming but energy-consuming. Towards low-energy implementations, this paper investigates their training on an emerging hardware technology called neuromorphic multi-chip systems (NMSs). NMSs are very effective for a family of algorithms called spiking neural networks (SNNs). We present three SNNs to train topic models. The first SNN is a batch algorithm that combines the conventional collapsed Gibbs sampling (CGS) algorithm with an inference SNN to train LDA. The other two SNNs are online algorithms targeting both energy- and storage-limited environments. The two online algorithms are equivalent to training LDA by maximum-a-posteriori estimation and by maximizing the semi-collapsed likelihood, respectively. They use novel, tailored ordinary differential equations for stochastic optimization. We simulate the new algorithms and show that they are comparable to the GPC algorithms while being suitable for NMS implementation. We also propose an extension to train pLSI and a method to prune the network to obey the limited fan-in of some NMSs.
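For reference, the conventional (non-spiking) collapsed Gibbs sampling baseline that the first SNN builds on can be sketched as below; the toy corpus, dense count arrays, and hyperparameters are illustrative assumptions, and the SNN-based inference itself is not shown.

```python
# Reference sketch of one conventional collapsed-Gibbs-sampling sweep for LDA
# (the non-spiking baseline); counts are dense NumPy arrays, corpus is a toy
# bag-of-words list of word ids per document.
import numpy as np

def cgs_sweep(docs, z, ndk, nkw, nk, alpha=0.1, beta=0.01,
              rng=np.random.default_rng(0)):
    K, V = nkw.shape
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1      # remove current assignment
            p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
            k = rng.choice(K, p=p / p.sum())                 # resample topic
            z[d][i] = k
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1       # restore counts

docs = [[0, 1, 1, 2], [2, 3, 3, 0]]                          # toy corpus
K, V, D = 2, 4, len(docs)
rng = np.random.default_rng(0)
z = [[int(rng.integers(K)) for _ in doc] for doc in docs]    # initial topics
ndk = np.zeros((D, K)); nkw = np.zeros((K, V)); nk = np.zeros(K)
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        ndk[d, z[d][i]] += 1; nkw[z[d][i], w] += 1; nk[z[d][i]] += 1
for _ in range(10):
    cgs_sweep(docs, z, ndk, nkw, nk)
print(z)
```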
