
The deployment of inference services at the network edge, called edge inference, offloads computation-intensive inference tasks from mobile devices to edge servers, thereby enhancing the devices' capabilities and battery life. In a multiuser system, the joint allocation of communication-and-computation ($\text{C}^\text{2}$) resources (i.e., scheduling and bandwidth allocation) is made challenging by the adoption of two efficient inference techniques, batching and early exiting, and is further complicated by the heterogeneity of users' requirements on accuracy and latency. Batching groups multiple tasks into one batch for parallel processing, reducing time-consuming memory access and thereby boosting throughput (i.e., completed tasks per second). Early exiting, on the other hand, allows a task to exit a deep neural network without traversing the whole network, supporting a tradeoff between accuracy and latency. In this work, we study optimal $\text{C}^\text{2}$ resource allocation with batching and early exiting, which is an NP-complete integer programming problem. We design a set of efficient algorithms under the criterion of maximum throughput to tackle this challenge. Experimental results demonstrate that both the optimal and sub-optimal $\text{C}^\text{2}$ resource allocation algorithms can leverage integrated batching and early exiting to double the inference throughput compared with conventional schemes.
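As a rough, hypothetical illustration of how batching and early exiting interact (the affine latency model, exit accuracies, and greedy search below are our own assumptions, not the paper's algorithm), the following sketch picks the deepest accuracy-compliant exit and the largest batch that meets a latency deadline, and reports the resulting throughput in completed tasks per second.

```python
# Hypothetical illustration of batching + early exiting for edge inference.
# The latency model and numbers are assumptions, not taken from the paper.

# Early exits: (exit_index, accuracy, per-batch latency in ms = setup + per-task cost)
EXITS = [
    (1, 0.80, lambda b: 2.0 + 0.3 * b),   # shallow exit: fast, less accurate
    (2, 0.88, lambda b: 3.0 + 0.6 * b),
    (3, 0.93, lambda b: 4.0 + 1.0 * b),   # full network: slow, most accurate
]

def best_config(num_tasks, min_accuracy, deadline_ms):
    """Pick the exit and largest batch size that satisfy the accuracy and
    latency requirements and maximize throughput; return (exit, batch, throughput)."""
    best = None
    for exit_id, acc, latency in EXITS:
        if acc < min_accuracy:
            continue
        for batch in range(num_tasks, 0, -1):
            t = latency(batch)
            if t <= deadline_ms:
                thr = batch / (t / 1000.0)       # completed tasks per second
                if best is None or thr > best[2]:
                    best = (exit_id, batch, thr)
                break  # larger batches already failed the deadline
    return best

print(best_config(num_tasks=16, min_accuracy=0.85, deadline_ms=12.0))
```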

Related content

We propose ProSpeCT, a generic formal processor model providing provably secure speculation for the constant-time policy. For constant-time programs under a non-speculative semantics, ProSpeCT guarantees that speculative and out-of-order execution cause no microarchitectural leaks. This guarantee is achieved by tracking secrets in the processor pipeline and ensuring that they do not influence the microarchitectural state during speculative execution. Our formalization covers a broad class of speculation mechanisms, generalizing prior work. As a result, our security proof covers all known Spectre attacks, including load value injection (LVI) attacks. In addition to the formal model, we provide a prototype hardware implementation of ProSpeCT on a RISC-V processor and show evidence of its low impact on hardware cost, performance, and required software changes. In particular, the experimental evaluation confirms our expectation that for a compliant constant-time binary, enabling ProSpeCT incurs no performance overhead.
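The toy sketch below conveys the core idea in software terms: secret values are tracked through the pipeline and prevented from influencing microarchitecturally visible behaviour (here, memory addresses) while execution is speculative. It is only an illustrative model with invented names, not the ProSpeCT hardware design or its formal semantics.

```python
# Toy model of secret tracking under speculation (illustrative only; the
# real ProSpeCT design is a hardware/formal model, not this Python code).

class Value:
    def __init__(self, data, secret=False):
        self.data = data
        self.secret = secret

def taint_or(*vals):
    """The result of an operation is secret if any operand is secret."""
    return any(v.secret for v in vals)

class Pipeline:
    def __init__(self):
        self.cache_accesses = []   # microarchitecturally visible state

    def add(self, a, b):
        return Value(a.data + b.data, taint_or(a, b))

    def load(self, addr, speculative):
        # During speculative execution a secret-dependent address must not
        # reach the cache; the access is stalled until speculation resolves.
        if speculative and addr.secret:
            raise RuntimeError("blocked: secret-dependent load under speculation")
        self.cache_accesses.append(addr.data)
        return Value(0)

p = Pipeline()
key = Value(0x41, secret=True)
idx = p.add(key, Value(8))            # secrecy propagates to the result
try:
    p.load(idx, speculative=True)     # would leak via the cache -> blocked
except RuntimeError as e:
    print(e)
p.load(Value(8), speculative=True)    # public address: allowed
```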

As the size of large language models continues to scale, so do the computational resources required to run them. Spiking neural networks (SNNs) have emerged as an energy-efficient approach to deep learning that leverages sparse and event-driven activations to reduce the computational overhead associated with model inference. While they have become competitive with non-spiking models on many computer vision tasks, SNNs have also proven more challenging to train. As a result, their performance lags behind modern deep learning, and we have yet to see the effectiveness of SNNs in language generation. In this paper, we successfully implement 'SpikeGPT', a generative language model with pure binary, event-driven spiking activation units. We train three variants of the proposed model, with 45M, 125M, and 260M parameters. To the best of our knowledge, the largest of these is 4x larger than any functional backprop-trained SNN to date. We achieve this by modifying the transformer block, replacing multi-head self-attention with a mechanism whose computational complexity grows linearly rather than quadratically with sequence length. Input tokens are instead streamed sequentially into our attention mechanism (as with typical SNNs). Our preliminary experiments show that SpikeGPT remains competitive with non-spiking models on the tested benchmarks, while consuming 5x less energy when processed on neuromorphic hardware that can leverage sparse, event-driven activations. Our code implementation is available at //github.com/ridgerchu/SpikeGPT.
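For readers unfamiliar with spiking activations, the minimal sketch below shows a generic leaky integrate-and-fire unit emitting binary, event-driven spikes while tokens are streamed in one at a time, so the per-token cost is constant and the total cost is linear in sequence length. The parameters are illustrative assumptions; this is not the SpikeGPT architecture itself.

```python
import numpy as np

# Generic leaky integrate-and-fire (LIF) activation: binary spikes, streamed
# token by token. Parameters are illustrative, not SpikeGPT's actual values.
def lif_stream(inputs, beta=0.9, threshold=1.0):
    """inputs: (T, d) sequence of pre-activations, processed sequentially."""
    T, d = inputs.shape
    membrane = np.zeros(d)
    spikes = np.zeros((T, d), dtype=np.int8)
    for t in range(T):                      # one token per step -> O(T) total
        membrane = beta * membrane + inputs[t]
        fired = membrane >= threshold
        spikes[t] = fired                   # binary, event-driven output
        membrane = np.where(fired, membrane - threshold, membrane)  # soft reset
    return spikes

x = np.random.randn(16, 8) * 0.5            # 16 "tokens", 8 hidden units
s = lif_stream(x)
print("spike rate:", s.mean())              # sparsity is what drives energy savings
```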

This work proposes a novel framework to dynamically and effectively manage and allocate different types of resources for Metaverse applications, which are forecast to demand massive amounts of resources of types never seen before. Specifically, by studying the functions of Metaverse applications, we first propose an effective solution that divides applications into groups, called MetaInstances, within which common functions can be shared among applications to enhance resource usage efficiency. Then, to capture the real-time, dynamic, and uncertain characteristics of the request arrival and application departure processes, we develop a semi-Markov decision process-based framework and propose an intelligent algorithm that gradually learns the optimal admission policy, maximizing revenue and resource usage efficiency for the Metaverse service provider while enhancing the Quality-of-Service for Metaverse users. Extensive simulation results show that our proposed approach can achieve up to 120% greater revenue for the Metaverse service provider and up to 178.9% higher acceptance probability for Metaverse application requests than other baselines.
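As a simplified illustration of learning-based admission control (a toy tabular Q-learning formulation under assumed capacity, arrival, and revenue parameters, not the paper's semi-Markov decision process), the sketch below learns whether to admit or reject an arriving request based on current resource occupancy.

```python
import random

# Simplified Q-learning admission controller (illustrative assumptions only;
# the paper uses a semi-Markov decision process, not this toy formulation).
CAPACITY = 10          # abstract resource units
ACTIONS = [0, 1]       # 0 = reject, 1 = admit
Q = {}                 # state (units in use) -> [value(reject), value(admit)]

def values(state):
    return Q.setdefault(state, [0.0, 0.0])

def step(state, demand, revenue, eps=0.1, alpha=0.1, gamma=0.95):
    v = values(state)
    a = random.choice(ACTIONS) if random.random() < eps else \
        max(ACTIONS, key=lambda x: v[x])
    if a == 1 and state + demand <= CAPACITY:
        reward, next_state = revenue, state + demand
    else:
        reward, next_state = 0.0, state
    target = reward + gamma * max(values(next_state))
    v[a] += alpha * (target - v[a])
    return next_state

state = 0
for _ in range(10000):                       # toy arrival/departure loop
    state = step(state, demand=random.randint(1, 3),
                 revenue=random.uniform(0.5, 2.0))
    if random.random() < 0.3 and state > 0:  # a random departure frees resources
        state -= 1
print({s: [round(x, 2) for x in v] for s, v in sorted(Q.items())})
```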

Cellular-connected unmanned aerial vehicles (UAVs) have attracted a surge of research interest in both academia and industry. To support aerial user equipment (UEs) in existing cellular networks, one promising approach is to assign a portion of the system bandwidth exclusively to the UAV-UEs. This is especially favorable for use cases in which a large number of UAV-UEs are deployed, e.g., for package delivery close to a warehouse. Although the nearly line-of-sight (LoS) channels can result in higher received powers, UAVs can in turn cause severe interference to each other in the same frequency band. In this contribution, we focus on the uplink communications of massive numbers of cellular-connected UAVs. Different power allocation algorithms are proposed, based on the successive convex approximation (SCA) principle, to either maximize the minimum spectrum efficiency (SE) or maximize the overall SE in the presence of severe interference. One of the challenges is that a UAV can affect a large area, meaning that many more UAV-UEs must be considered in the optimization problem, which is essentially different from the case of terrestrial UEs. The necessity of single-carrier uplink transmission further complicates the problem. Nevertheless, we find that the large coherence bandwidths and coherence times of the propagation channels can be leveraged. The performance of the proposed algorithms is evaluated via extensive simulations in both the full-buffer transmission mode and a bursty-traffic mode. Results show that the proposed algorithms can effectively enhance the uplink SEs. This work can be considered a first attempt to deal with the interference among massive numbers of cellular-connected UAV-UEs through optimized power allocation.
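For context, a standard SCA step for spectrum-efficiency objectives (shown below in generic notation; this is an illustration of the principle, not the paper's exact problem formulation) writes each UAV-UE's SE as a difference of concave functions and linearizes the second term around the current power vector $\mathbf{p}^{(t)}$, yielding a convex subproblem per iteration.

```latex
% Generic SCA surrogate for spectrum efficiency (illustrative notation):
% p_k = transmit power of UAV-UE k, g_{jk} = channel gain from UE j to the
% serving cell of UE k, \sigma^2 = noise power.
\begin{align}
\mathrm{SE}_k(\mathbf{p})
  &= \log_2\!\Big(p_k g_{kk} + \sum_{j\neq k} p_j g_{jk} + \sigma^2\Big)
   - \log_2\!\Big(\sum_{j\neq k} p_j g_{jk} + \sigma^2\Big) \\
  &\ge \log_2\!\Big(p_k g_{kk} + \sum_{j\neq k} p_j g_{jk} + \sigma^2\Big)
   - \log_2\!\Big(\sum_{j\neq k} p_j^{(t)} g_{jk} + \sigma^2\Big)
   - \sum_{j\neq k}\frac{g_{jk}\,\big(p_j - p_j^{(t)}\big)}
       {\ln 2\,\big(\sum_{i\neq k} p_i^{(t)} g_{ik} + \sigma^2\big)} .
\end{align}
```

Maximizing $\min_k \mathrm{SE}_k$ (or the sum SE) over this concave lower bound, subject to per-UE power constraints, gives a convex problem that is solved repeatedly until convergence.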

The DARPA Subterranean Challenge was designed for competitors to develop and deploy teams of autonomous robots to explore difficult unknown underground environments. Categorised into human-made tunnels, underground urban infrastructure, and natural caves, each of these subdomains presented many challenging elements for robot perception, locomotion, navigation, and autonomy. These included degraded wireless communication, poor visibility due to smoke, narrow passages and doorways, clutter, uneven ground, slippery and loose terrain, stairs, ledges, overhangs, dripping water, and dynamic obstacles that move to block paths, among others. In the Final Event of this challenge, held in September 2021, the course consisted of all three subdomains. The task was for the robot team to perform a scavenger hunt for a number of pre-defined artefacts within a limited time frame. Only one human supervisor was allowed to communicate with the robots once they were in the course. Points were scored when accurate detections and their locations were communicated back to the scoring server. A total of 8 teams competed in the finals, held at the Mega Cavern in Louisville, KY, USA. This article describes the systems deployed by Team CSIRO Data61, which tied for the top score and won second place at the event.

The field of lightweight cryptography has been gaining popularity as traditional cryptographic techniques are challenging to implement in resource-limited environments. This research paper presents an approach to using the ESP32 microcontroller as a hardware platform for implementing a lightweight cryptographic algorithm. Our approach employs KATAN32, the smallest block cipher of the KATAN family, with an 80-bit key and 32-bit blocks. The algorithm requires little computational power and operates on a key held as an array of 80 unsigned 64-bit integers for encrypting and decrypting data. During encryption, a data array is passed into the encryption function together with the key, which is then used to fill a buffer with the encrypted array. Similarly, the decryption function uses a buffer to recover the original data into an array of 32 unsigned 64-bit integers. This study also investigates the optimal implementation of cryptographic block ciphers, benchmarking performance against various metrics, including memory requirements (RAM), throughput, power consumption, and security. Our implementation demonstrates that data can be securely transmitted end-to-end with good throughput and low power consumption.
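The 80- and 32-word buffers described above are consistent with a bit-sliced data layout, in which one 64-bit machine word holds a single key or state bit for up to 64 blocks processed in parallel. The helpers below are our own sketch of that packing (not code from the study, and they do not implement the KATAN32 round function).

```python
# Illustrative bit-slicing helpers (our own sketch, not code from the study).
# Bit-sliced KATAN implementations keep one 64-bit word per state/key bit,
# so 64 independent 32-bit blocks fit in 32 unsigned 64-bit integers.

def pack_bitsliced(blocks):
    """blocks: up to 64 32-bit integers -> list of 32 64-bit words."""
    words = [0] * 32
    for lane, block in enumerate(blocks):
        for bit in range(32):
            if (block >> bit) & 1:
                words[bit] |= 1 << lane
    return words

def unpack_bitsliced(words, n_blocks):
    """Inverse of pack_bitsliced."""
    blocks = [0] * n_blocks
    for bit, word in enumerate(words):
        for lane in range(n_blocks):
            if (word >> lane) & 1:
                blocks[lane] |= 1 << bit
    return blocks

data = [0xDEADBEEF, 0x01234567, 0xFFFFFFFF]
assert unpack_bitsliced(pack_bitsliced(data), len(data)) == data
```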

The deliberate introduction of controlled intersymbol interference (ISI) in Tukey signalling enables the recovery of signal amplitude and (in part) signal phase under direct detection, giving rise to significant data rate improvements compared to intensity modulation with direct detection (IMDD). The use of an integrate-and-dump detector makes precise waveform shaping unnecessary, thereby equipping the scheme with a high degree of robustness to nonlinear signal distortions introduced by practical modulators. Signal sequences drawn from star quadrature amplitude modulation (SQAM) formats admit an efficient trellis description that facilitates codebook design and low-complexity near-maximum-likelihood sequence detection in the presence of both shot noise and thermal noise. Under the practical (though suboptimal) allocation of a 50% duty cycle between ISI-free and ISI-present signalling segments, at a symbol rate of 50 Gbaud and a launch power of -10 dBm, the Tukey scheme has a maximum theoretically achievable throughput of 200 Gb/s with an (8,4)-SQAM constellation, while an IMDD scheme achieves about 145 Gb/s using PAM-8. Note that the two constellations have the same number of magnitude levels; the difference in throughput results from exploiting phase information through the use of a complex-valued signal constellation.

Intelligent reflecting surfaces (IRSs) are envisioned as a low-cost solution to achieve high spectral and energy efficiency in future communication systems due to their ability to customize wireless propagation environments. Although resource allocation design for IRS-assisted multiuser wireless communication systems has been extensively investigated in the literature, the optimal design and performance of such systems are still not well understood. To fill this gap, in this paper, we study optimal resource allocation for IRS-assisted multiuser multiple-input single-output (MISO) systems. In particular, we jointly optimize the beamforming at the base station (BS) and the discrete IRS phase shifts to minimize the total transmit power. For attaining the globally optimal solution of the formulated non-convex combinatorial optimization problem, we develop a resource allocation algorithm with guaranteed convergence based on Schur's complement and generalized Benders decomposition. Our numerical results reveal that the proposed algorithm can significantly reduce the BS transmit power compared to the state-of-the-art suboptimal alternating optimization-based approach, especially for moderate-to-large numbers of IRS elements.

Internet-of-Things (IoT) devices are often used to transmit physical sensor data over digital wireless channels. Traditional Physical Layer Security (PLS)-based cryptographic approaches rely on accurate channel estimation and information exchange for key generation, which irrevocably ties key quality to digital channel estimation quality. Recently, we proposed a new concept called Graph Layer Security (GLS), in which digital keys are derived from physical sensor readings. The sensor readings of legitimate users are correlated through a common background infrastructure environment (e.g., a shared water distribution network or electric grid). The challenge for GLS has been how to achieve distributed key generation. This paper presents a Federated multi-agent Deep reinforcement learning-assisted Distributed Key generation scheme (FD2K), which fully exploits the common features of the physical dynamics to establish secret keys between legitimate users. We present, for the first time, initial experimental results of GLS with federated learning, achieving considerable security performance in terms of key agreement rate (KAR) and key randomness.

In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation. The number of interactions is limited by resource constraints on computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communication and synchronisation, and is itself difficult to scale. In this paper we present four algorithms to address these problems. In combination, these algorithms enable each agent to improve its task allocation strategy through reinforcement learning, while adjusting how much it explores the system according to how close to optimal it believes its current strategy is, given its past experience; a sketch of this idea follows below. We focus on distributed agent systems in which the agents' behaviours are constrained by resource usage limits, restricting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, which then carry out those tasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task allocation problem to within 6.7% of the theoretical optimum for the system configurations considered. It provides 5x better performance recovery than approaches without knowledge retention when system connectivity is impacted, and is tested on systems of up to 100 agents with less than a 9% impact on the algorithms' performance.
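One generic way to realise "explore more when the current strategy seems poor" (a sketch under our own assumptions, not the paper's four algorithms) is to scale an epsilon-greedy exploration rate by how far recent rewards fall short of the best outcome the agent has observed:

```python
import random

# Generic confidence-scaled epsilon-greedy allocator (illustrative sketch only).
class AllocatorAgent:
    def __init__(self, peers, eps_min=0.05, eps_max=0.5, alpha=0.2):
        self.value = {p: 0.0 for p in peers}   # learned value of delegating to each peer
        self.best_seen = 1e-9
        self.recent = 0.0
        self.eps_min, self.eps_max, self.alpha = eps_min, eps_max, alpha

    def epsilon(self):
        # Explore more when recent rewards lag behind the best ever observed.
        shortfall = 1.0 - self.recent / self.best_seen
        return self.eps_min + (self.eps_max - self.eps_min) * max(0.0, shortfall)

    def choose_peer(self):
        if random.random() < self.epsilon():
            return random.choice(list(self.value))
        return max(self.value, key=self.value.get)

    def update(self, peer, reward):
        self.value[peer] += self.alpha * (reward - self.value[peer])
        self.recent += self.alpha * (reward - self.recent)
        self.best_seen = max(self.best_seen, reward)

agent = AllocatorAgent(peers=["a", "b", "c"])
for _ in range(200):                       # peers have differing capabilities
    peer = agent.choose_peer()
    agent.update(peer, reward={"a": 0.9, "b": 0.5, "c": 0.2}[peer])
print(agent.value, round(agent.epsilon(), 3))
```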
