
Simultaneous transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs) have been considered a promising auxiliary technology for enhancing the performance of wireless networks, since users located on different sides of the surface can be served simultaneously by the transmitted and reflected signals. In this paper, the energy efficiency (EE) maximization problem for a non-orthogonal multiple access (NOMA)-assisted STAR-RIS downlink network is investigated. Due to the fractional form of the EE, the maximization problem is challenging to solve with traditional convex optimization methods. In this work, a deep deterministic policy gradient (DDPG)-based algorithm is proposed to maximize the EE by jointly optimizing the transmission beamforming vectors at the base station and the coefficient matrices at the STAR-RIS. Simulation results demonstrate that the proposed algorithm effectively maximizes the system EE under time-varying channels.
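
To make the optimization target concrete, the following minimal numpy sketch evaluates a toy EE objective (sum rate divided by transmit plus circuit power) for one transmit-side and one reflect-side user, and shows how a DDPG actor's real-valued action could be mapped to a beamforming vector and energy-splitting STAR-RIS coefficients. The channel shapes, power model, and coefficient parameterization are illustrative assumptions, not the paper's exact system model.

```python
# Toy EE objective for a STAR-RIS downlink and a DDPG-style action mapping.
# All channels, the power model, and the coefficient split are assumptions.
import numpy as np

rng = np.random.default_rng(0)
M, N = 4, 16          # BS antennas, STAR-RIS elements

# Assumed quasi-static channels: BS -> RIS, and RIS -> user on each side.
G = (rng.normal(size=(N, M)) + 1j * rng.normal(size=(N, M))) / np.sqrt(2)
h_t = (rng.normal(size=N) + 1j * rng.normal(size=N)) / np.sqrt(2)   # transmit side
h_r = (rng.normal(size=N) + 1j * rng.normal(size=N)) / np.sqrt(2)   # reflect side

def energy_efficiency(w, theta_t, theta_r, sigma2=1e-3, p_circuit=1.0):
    """EE = achievable sum rate / total consumed power (toy units)."""
    g_t = h_t.conj() @ np.diag(theta_t) @ G       # cascaded channel, transmit side
    g_r = h_r.conj() @ np.diag(theta_r) @ G       # cascaded channel, reflect side
    snr_t = np.abs(g_t @ w) ** 2 / sigma2
    snr_r = np.abs(g_r @ w) ** 2 / sigma2
    sum_rate = np.log2(1 + snr_t) + np.log2(1 + snr_r)
    total_power = np.linalg.norm(w) ** 2 + p_circuit
    return sum_rate / total_power

# A DDPG actor would output a real-valued action that is mapped to the
# beamformer w and the energy-splitting STAR coefficients (|t|^2 + |r|^2 = 1).
P_max = 1.0
action = rng.uniform(-1, 1, size=2 * M + 2 * N)
w = action[:M] + 1j * action[M:2 * M]
w *= np.sqrt(P_max) / np.linalg.norm(w)                 # enforce the power budget
split = 0.5 * (1 + np.tanh(action[2 * M:2 * M + N]))    # energy split in [0, 1]
phase = np.pi * action[2 * M + N:]
theta_t = np.sqrt(split) * np.exp(1j * phase)
theta_r = np.sqrt(1 - split) * np.exp(1j * phase)       # same phase on both sides (an assumption)

print("EE (toy units):", energy_efficiency(w, theta_t, theta_r))
```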

Related Content

Networking: IFIP International Conferences on Networking. Explanation: international conference on networking. Publisher: IFIP.

Deep Neural Network-based systems are now the state-of-the-art in many robotics tasks, but their application in safety-critical domains remains dangerous without formal guarantees on network robustness. Small perturbations to sensor inputs (from noise or adversarial examples) are often enough to change network-based decisions, which was recently shown to cause an autonomous vehicle to swerve into another lane. In light of these dangers, numerous algorithms have been developed as defensive mechanisms against these adversarial inputs, some of which provide formal robustness guarantees or certificates. This work leverages research on certified adversarial robustness to develop an online certifiably robust defense for deep reinforcement learning algorithms. The proposed defense computes guaranteed lower bounds on state-action values during execution to identify and choose a robust action under a worst-case deviation in input space due to possible adversaries or noise. Moreover, the resulting policy comes with a certificate of solution quality, even though the true state and optimal action are unknown to the certifier due to the perturbations. The approach is demonstrated on a Deep Q-Network policy and is shown to increase robustness to noise and adversaries in pedestrian collision avoidance scenarios and a classic control task. This work extends one of our prior works with new performance guarantees, extensions to other RL algorithms, expanded results aggregated across more scenarios, an extension into scenarios with adversarial behavior, comparisons with a more computationally expensive method, and visualizations that provide intuition about the robustness algorithm.
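
As a hedged illustration of the certified action-selection step, the sketch below uses a single linear Q-layer, for which the worst-case lower bound under an l-infinity perturbation of radius eps is exact: Q(s,a) - eps * ||W_a||_1. The paper's defense applies analogous guaranteed bounds to deep Q-networks; the weights and state here are placeholders.

```python
# Certifiably robust action selection: pick the action with the largest
# guaranteed lower bound on its Q-value under any bounded input perturbation.
import numpy as np

rng = np.random.default_rng(1)
state_dim, n_actions, eps = 8, 4, 0.1

W = rng.normal(size=(n_actions, state_dim))   # assumed (toy) linear Q-function weights
b = rng.normal(size=n_actions)
s_observed = rng.normal(size=state_dim)       # possibly corrupted observation

q_nominal = W @ s_observed + b
# For a linear Q-function, min over ||d||_inf <= eps of Q(s + d, a)
# equals Q(s, a) - eps * ||W_a||_1 (exact certified lower bound).
q_lower = q_nominal - eps * np.abs(W).sum(axis=1)

greedy_action = int(np.argmax(q_nominal))     # standard (non-robust) choice
robust_action = int(np.argmax(q_lower))       # certified worst-case choice
print(greedy_action, robust_action, q_lower)
```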

In this paper, we propose a novel wireless architecture, mounted on a high-altitude aerial platform, that is enabled by a reconfigurable intelligent surface (RIS). By installing the RIS on the aerial platform, rich line-of-sight and full-area coverage can be achieved, thereby overcoming the limitations of conventional terrestrial RIS. We consider a scenario where a sudden increase in traffic in an urban area triggers authorities to rapidly deploy unmanned aerial vehicle base stations (UAV-BSs) to serve the ground users. In this scenario, since the direct backhaul link from the ground source can be blocked by obstacles in the urban area, we propose reflecting the backhaul signal using the aerial-RIS so that it successfully reaches the UAV-BSs. We jointly optimize the placement and array-partition strategies of the aerial-RIS and the phases of the RIS elements, which increases the energy efficiency of every UAV-BS. We show that the complexity of our algorithm is bounded by a quadratic order, implying high computational efficiency. We verify the performance of the proposed algorithm via extensive numerical evaluations and show that our method achieves outstanding energy efficiency compared to benchmark schemes.
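
A small sketch of the phase-configuration idea under simplifying assumptions (single-antenna source and UAV-BS, perfectly known cascaded channels): co-phasing each RIS element with its cascaded channel maximizes the reflected backhaul gain. The placement and array-partition steps of the paper's joint algorithm are not modeled here.

```python
# Per-element phase alignment for a reflected backhaul link (toy example).
import numpy as np

rng = np.random.default_rng(2)
N = 64                                                             # aerial-RIS elements serving one UAV-BS
g = (rng.normal(size=N) + 1j * rng.normal(size=N)) / np.sqrt(2)    # source -> RIS channel
h = (rng.normal(size=N) + 1j * rng.normal(size=N)) / np.sqrt(2)    # RIS -> UAV-BS channel

phases = -np.angle(g * h)                          # cancel the cascaded channel phase per element
theta = np.exp(1j * phases)
aligned_gain = np.abs(np.sum(g * theta * h)) ** 2
random_gain = np.abs(np.sum(g * np.exp(1j * rng.uniform(0, 2 * np.pi, N)) * h)) ** 2
print(f"aligned gain: {aligned_gain:.1f}, random-phase gain: {random_gain:.1f}")
```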

The recent growth of emergent network applications (e.g., satellite networks, vehicular networks) is increasing the complexity of managing modern communication networks. As a result, the community has proposed the Digital Twin Network (DTN) as a key enabler of efficient network management. Network operators can leverage the DTN to perform different optimization tasks (e.g., Traffic Engineering, Network Planning). Deep Reinforcement Learning (DRL) has shown high performance when applied to solve network optimization problems. In the context of the DTN, DRL can be leveraged to solve optimization problems without directly impacting the real-world network behavior. However, DRL scales poorly with the problem size and complexity. In this paper, we explore the use of Evolutionary Strategies (ES) to train DRL agents for solving a routing optimization problem. The experimental results show that ES achieved training-time speed-ups of 128 and 6 for the NSFNET and GEANT2 topologies, respectively.
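
The following is a minimal evolution-strategies loop in the spirit of training a DRL agent without backpropagation: perturb the flat policy parameters with Gaussian noise, evaluate each perturbation, and update along the reward-weighted noise. The episode-return function is a stand-in, not the DTN routing environment.

```python
# Minimal evolution strategies (ES) parameter search with a placeholder objective.
import numpy as np

rng = np.random.default_rng(3)
theta = rng.normal(size=32)          # flat policy parameters
sigma, lr, pop = 0.1, 0.02, 50       # noise scale, step size, population size

def episode_return(params):
    # Placeholder objective; a real agent would roll out routing decisions
    # in the digital-twin environment and return the episode reward.
    return -np.sum((params - 1.0) ** 2)

for it in range(200):
    noise = rng.normal(size=(pop, theta.size))
    returns = np.array([episode_return(theta + sigma * n) for n in noise])
    advantages = (returns - returns.mean()) / (returns.std() + 1e-8)
    theta += lr / (pop * sigma) * noise.T @ advantages   # reward-weighted update

print("final return:", episode_return(theta))
```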

UAV-based wireless systems, such as wireless relays and remote sensing, have attracted great attention from academia and industry. To realize them, a high-performance aerial wireless communication system, which bridges UAVs and ground stations, is one of the key enablers. However, there are still issues hindering its development, such as severe co-channel interference among UAVs and the limited payload and battery life of UAVs. To address these challenges, we propose an aerial communication system that enables system-level full-duplex communication among multiple UAVs with lower hardware complexity than ideal full-duplex communication systems. In the proposed system, each channel is re-assigned to the uplink and downlink of a pair of UAVs, and each UAV employs a pair of separate channels for its uplink and downlink. The co-channel interference between UAVs that reuse the same channel is eliminated by exploiting the UAVs' maneuverability and the high-gain directional antennas equipped on the UAVs and ground stations, so that dedicated cancellers are not necessary in the proposed system. The system design and performance analysis are given, and the simulation results agree well with the design.
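
The channel re-use pattern can be sketched as follows; the specific pairing rule (channel i carries the uplink of UAV i and the downlink of UAV i+1) is an assumption for illustration, with the residual co-channel interference left to the spatial separation described above.

```python
# Toy channel re-assignment: each channel serves the uplink of one UAV and the
# downlink of another, so every UAV uses two separate channels (assumed pairing rule).
K = 4                                   # number of UAVs = number of channels
assignment = {
    ch: {"uplink": f"UAV{ch}", "downlink": f"UAV{(ch + 1) % K}"}
    for ch in range(K)
}
for ch, users in assignment.items():
    print(f"channel {ch}: UL of {users['uplink']}, DL of {users['downlink']}")
```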

This paper addresses the problem of learning an equilibrium efficiently in general-sum Markov games through decentralized multi-agent reinforcement learning. Given the fundamental difficulty of calculating a Nash equilibrium (NE), we instead aim at finding a coarse correlated equilibrium (CCE), a solution concept that generalizes NE by allowing possible correlations among the agents' strategies. We propose an algorithm in which each agent independently runs optimistic V-learning (a variant of Q-learning) to efficiently explore the unknown environment, while using a stabilized online mirror descent (OMD) subroutine for policy updates. We show that the agents can find an $\epsilon$-approximate CCE in at most $\widetilde{O}(H^6 S A/\epsilon^2)$ episodes, where $S$ is the number of states, $A$ is the size of the largest individual action space, and $H$ is the length of an episode. This appears to be the first sample complexity result for learning in generic general-sum Markov games. Our results rely on a novel investigation of an anytime high-probability regret bound for OMD with a dynamic learning rate and weighted regret, which would be of independent interest. One key feature of our algorithm is that it is fully \emph{decentralized}, in the sense that each agent has access to only its local information, and is completely oblivious to the presence of others. This way, our algorithm can readily scale up to an arbitrary number of agents, without suffering from the exponential dependence on the number of agents.
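
A hedged sketch of the policy-update subroutine: online mirror descent with a negative-entropy regularizer (exponential weights) and a decreasing learning rate. The per-action loss vector would come from the optimistic V-learning values; here it is a random placeholder.

```python
# Exponential-weights (entropy-regularized OMD) policy update with a dynamic learning rate.
import numpy as np

rng = np.random.default_rng(4)
A = 5                                     # size of the local action space
policy = np.full(A, 1.0 / A)

for t in range(1, 101):
    eta_t = np.sqrt(np.log(A) / t)        # dynamic (decreasing) learning rate
    loss = rng.uniform(size=A)            # placeholder per-action loss estimate
    logits = np.log(policy) - eta_t * loss
    policy = np.exp(logits - logits.max())
    policy /= policy.sum()                # mirror-descent step, kept on the simplex

print(policy)
```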

In this paper, the problem of training federated learning (FL) algorithms over a realistic wireless network is studied. In particular, in the considered model, wireless users execute an FL algorithm by training their local FL models on their own data and transmitting the trained local FL models to a base station (BS), which generates a global FL model and sends it back to the users. Since all training parameters are transmitted over wireless links, the quality of the training is affected by wireless factors such as packet errors and the availability of wireless resources. Meanwhile, due to the limited wireless bandwidth, the BS must select an appropriate subset of users to execute the FL algorithm so as to build an accurate global FL model. This joint learning, wireless resource allocation, and user selection problem is formulated as an optimization problem whose goal is to minimize an FL loss function that captures the performance of the FL algorithm. To address this problem, a closed-form expression for the expected convergence rate of the FL algorithm is first derived to quantify the impact of wireless factors on FL. Then, based on the expected convergence rate, the optimal transmit power for each user is derived under a given user selection and uplink resource block (RB) allocation scheme. Finally, the user selection and uplink RB allocation are optimized so as to minimize the FL loss function. Simulation results show that the proposed joint federated learning and communication framework can reduce the FL loss function value by up to 10% and 16%, respectively, compared to: 1) an optimal user selection algorithm with random resource allocation and 2) a standard FL algorithm with random user selection and resource allocation.
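
A minimal FedAvg-style sketch with the wireless effects folded in as an assumed per-user packet-error probability: only users that are both selected and whose uploads survive the channel contribute to the global model. The linear-regression task, error rates, and selection rule are toy assumptions, not the paper's optimized scheme.

```python
# Federated averaging over an unreliable wireless uplink (toy model).
import numpy as np

rng = np.random.default_rng(5)
n_users, dim, local_steps, lr = 8, 10, 5, 0.1
X = [rng.normal(size=(20, dim)) for _ in range(n_users)]
true_w = rng.normal(size=dim)
Y = [x @ true_w + 0.1 * rng.normal(size=20) for x in X]

w_global = np.zeros(dim)
packet_error = rng.uniform(0.0, 0.3, size=n_users)        # assumed per-user error rates

for rnd in range(30):
    selected = rng.choice(n_users, size=4, replace=False)  # bandwidth-limited subset of users
    received = []
    for u in selected:
        w_local = w_global.copy()
        for _ in range(local_steps):                        # local gradient steps on user data
            grad = X[u].T @ (X[u] @ w_local - Y[u]) / len(Y[u])
            w_local -= lr * grad
        if rng.uniform() > packet_error[u]:                 # upload survives the channel
            received.append(w_local)
    if received:
        w_global = np.mean(received, axis=0)                # aggregate received local models

print("distance to true model:", np.linalg.norm(w_global - true_w))
```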

We investigate the fundamental multiple access (MA) scheme in an active intelligent reflecting surface (IRS) aided energy-constrained Internet-of-Things (IoT) system, where an active IRS is deployed to assist the uplink transmission from multiple IoT devices to an access point (AP). Our goal is to maximize the sum throughput by optimizing the IRS beamforming vectors across time and the resource allocation. To this end, we first study two typical active IRS aided MA schemes, namely time division multiple access (TDMA) and non-orthogonal multiple access (NOMA), by analytically comparing their achievable sum throughput and proposing corresponding algorithms. Interestingly, we prove that given only one available IRS beamforming vector, the NOMA-based scheme generally achieves a larger throughput than the TDMA-based scheme, whereas the latter can potentially outperform the former if multiple IRS beamforming vectors are available to harness the favorable time selectivity of the IRS. To strike a flexible balance between the system performance and the associated signaling overhead incurred by more IRS beamforming vectors, we then propose a general hybrid TDMA-NOMA scheme with user grouping, where the devices in the same group transmit simultaneously via NOMA while devices in different groups occupy orthogonal time slots. By controlling the number of groups, the hybrid TDMA-NOMA scheme is applicable to any given number of available IRS beamforming vectors. Despite the non-convexity of the considered optimization problem, we propose an efficient algorithm based on alternating optimization. Simulation results illustrate the practical superiority of the active IRS over the passive IRS in terms of coverage extension and support for multiple energy-limited devices, and demonstrate the effectiveness of the proposed hybrid MA scheme in flexibly balancing the performance-cost tradeoff.
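
As a toy numerical illustration of the single-beamforming-vector comparison (under assumed effective channel gains and per-device energy budgets), the snippet below contrasts uplink NOMA, where all devices transmit over the whole frame, with equal-slot TDMA, where each device spends its energy budget within its own slot.

```python
# Toy sum-throughput comparison of NOMA vs. equal-slot TDMA for energy-constrained uplinks.
import numpy as np

rng = np.random.default_rng(6)
K, sigma2, T = 4, 1.0, 1.0
g = rng.exponential(scale=2.0, size=K)          # assumed effective (IRS-aided) channel gains
E = rng.uniform(0.5, 1.5, size=K)               # assumed per-device energy budgets

# NOMA: all devices transmit over the whole frame with power E_k / T.
noma = np.log2(1 + np.sum(E / T * g) / sigma2)

# TDMA: device k transmits only in its slot tau = T/K with power E_k / tau.
tau = T / K
tdma = np.sum(tau * np.log2(1 + (E / tau) * g / sigma2))

print(f"NOMA sum throughput: {noma:.3f}, TDMA sum throughput: {tdma:.3f}")
```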

This paper studies the problem of massive Internet of Things (IoT) access in beyond fifth generation (B5G) networks using the non-orthogonal multiple access (NOMA) technique. The problem involves grouping massive numbers of IoT devices and allocating power so as to respect the low latency as well as the limited operating energy of the IoT devices. The considered objective function, maximizing the number of successfully received IoT packets, differs from the classical sum-rate-related objective functions. The problem is first divided into multiple NOMA grouping subproblems. Then, using competitive analysis, an efficient online competitive algorithm (CA) is proposed to solve each subproblem. Next, to solve the power allocation problem, we propose a new reinforcement learning (RL) framework in which an RL agent learns to use the CA as a black box and combines the obtained solutions to the subproblems to determine the power allocation for each NOMA group. Our simulation results reveal that the proposed RL framework outperforms deep Q-learning methods and is close to optimal.
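
A schematic sketch of the "CA as a black box" idea: a simple bandit-style learner picks a discrete power level per NOMA group, queries a placeholder competitive-algorithm routine for the resulting reward, and updates its value estimates. The routine, reward shape, and discretization are illustrative assumptions only.

```python
# Learning per-group power levels by treating the competitive algorithm as a black-box reward source.
import numpy as np

rng = np.random.default_rng(7)
n_groups, n_power_levels, episodes = 3, 5, 500
Q = np.zeros((n_groups, n_power_levels))
alpha, epsilon = 0.1, 0.1

def competitive_algorithm(group, power_level):
    # Placeholder for the online CA solving one grouping subproblem: more power
    # helps, with diminishing (and noisy) returns per group.
    base = (group + 1) * np.log1p(power_level)
    return base + rng.normal(scale=0.2)

for _ in range(episodes):
    for group in range(n_groups):
        if rng.uniform() < epsilon:
            a = int(rng.integers(n_power_levels))           # explore
        else:
            a = int(np.argmax(Q[group]))                    # exploit
        reward = competitive_algorithm(group, a)
        Q[group, a] += alpha * (reward - Q[group, a])       # bandit-style value update

print("learned power levels per group:", Q.argmax(axis=1))
```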

Multi-access edge computing (MEC) is a key enabler for reducing the latency of vehicular networks. Due to vehicle mobility, the requested services (e.g., infotainment services) must frequently be migrated across different MEC servers to guarantee their stringent quality-of-service requirements. In this paper, we study the problem of service migration in a MEC-enabled vehicular network in order to minimize the total service latency and migration cost. This problem is formulated as a nonlinear integer program and is linearized to help obtain the optimal solution using off-the-shelf solvers. Then, to obtain an efficient solution, it is modeled as a multi-agent Markov decision process and solved by leveraging a deep Q-learning (DQL) algorithm. The proposed DQL scheme performs proactive service migration while ensuring service continuity under high-mobility constraints. Finally, simulation results show that the proposed DQL scheme achieves close-to-optimal performance.
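
A simplified tabular stand-in for the deep Q-learning migration policy: the state is (vehicle zone, serving MEC server), the action is the target server, and the reward trades off service latency against migration cost. The mobility, latency, and cost models are illustrative assumptions.

```python
# Tabular Q-learning for service migration in a toy MEC-enabled vehicular setting.
import numpy as np

rng = np.random.default_rng(8)
n_zones, n_servers = 5, 5
gamma, alpha, epsilon = 0.9, 0.1, 0.1
Q = np.zeros((n_zones, n_servers, n_servers))    # Q[zone, current_server, target_server]

def step(zone, server, target):
    latency = abs(zone - target)                  # assumed: closer server -> lower latency
    migration_cost = 0.5 if target != server else 0.0
    reward = -(latency + migration_cost)
    next_zone = min(n_zones - 1, max(0, zone + rng.choice([-1, 0, 1])))   # vehicle moves
    return reward, next_zone, target

zone, server = 0, 0
for _ in range(20000):
    a = int(rng.integers(n_servers)) if rng.uniform() < epsilon else int(np.argmax(Q[zone, server]))
    r, nz, ns = step(zone, server, a)
    Q[zone, server, a] += alpha * (r + gamma * Q[nz, ns].max() - Q[zone, server, a])
    zone, server = nz, ns

print("migration target per (zone, server):")
print(Q.argmax(axis=2))
```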

We propose a new method for the event extraction (EE) task based on an imitation learning framework, specifically inverse reinforcement learning (IRL) via a generative adversarial network (GAN). The GAN estimates proper rewards according to the difference between the actions committed by the expert (or ground truth) and the agent among complicated states in the environment. The EE task benefits from these dynamic rewards because instances and labels vary in difficulty and the gains are expected to be diverse -- e.g., an ambiguous but correctly detected trigger or argument should receive high gains -- while traditional RL models usually neglect such differences and pay equal attention to all instances. Moreover, our experiments demonstrate that the proposed framework outperforms state-of-the-art methods without explicit feature engineering.
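
A compact sketch of the GAN-based reward estimation: a logistic discriminator is trained to separate expert (state, action) pairs from agent pairs, and the agent's reward is derived from the discriminator score, in the spirit of adversarial IRL. The features and single-layer discriminator are illustrative assumptions.

```python
# Discriminator-derived rewards for adversarial IRL (toy single-layer version).
import numpy as np

rng = np.random.default_rng(9)
dim = 16
expert = rng.normal(loc=0.5, size=(200, dim))   # stand-in expert (state, action) features
agent = rng.normal(loc=0.0, size=(200, dim))    # stand-in agent (state, action) features

w, b, lr = np.zeros(dim), 0.0, 0.1
X = np.vstack([expert, agent])
y = np.concatenate([np.ones(len(expert)), np.zeros(len(agent))])

for _ in range(300):                             # train the discriminator D(s, a)
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= lr * X.T @ (p - y) / len(y)
    b -= lr * float(np.mean(p - y))

def reward(features):
    d = 1.0 / (1.0 + np.exp(-(features @ w + b)))
    return np.log(d + 1e-8)                      # reward grows as the pair looks more expert-like

print("mean agent reward:", reward(agent).mean())
```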
