
The paper describes an online deep learning (ODL) algorithm for adaptive modulation and coding in massive MIMO. The algorithm is based on a fully connected neural network that is initially trained on the output of the traditional algorithm and then incrementally retrained using service feedback on its own output. We show the advantage of our solution over the state-of-the-art Q-learning approach. We provide system-level simulation results that support this conclusion in various scenarios with different channel characteristics and different user speeds. Compared with traditional outer loop link adaptation (OLLA), the algorithm improves user throughput by 10% to 20% in the full-buffer case.
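For context, the OLLA baseline named in the abstract is commonly described as a feedback loop that raises an SINR offset slightly after each ACK and lowers it by a larger step after each NACK, with the ratio of the two steps set by the target block error rate. The sketch below illustrates only that textbook rule, not the paper's ODL algorithm. The class name `Olla`, the helper `select_mcs`, the 10% BLER target, the 1 dB step, and the threshold table are all assumptions made for illustration.

```python
class Olla:
    """Textbook-style outer-loop link adaptation offset (illustrative only)."""

    def __init__(self, bler_target=0.1, step_down_db=1.0):
        # Assumed defaults: 10% target BLER, 1 dB decrease per NACK.
        self.bler_target = bler_target
        self.step_down_db = step_down_db
        # Steps are balanced so the offset is stationary at the target BLER:
        # (1 - BLER) * step_up == BLER * step_down.
        self.step_up_db = step_down_db * bler_target / (1.0 - bler_target)
        self.offset_db = 0.0

    def update(self, ack):
        """Apply one HARQ feedback (True for ACK, False for NACK)."""
        if ack:
            self.offset_db += self.step_up_db
        else:
            self.offset_db -= self.step_down_db
        return self.offset_db

    def effective_sinr_db(self, reported_sinr_db):
        """Correct the reported SINR with the current offset."""
        return reported_sinr_db + self.offset_db


def select_mcs(sinr_db, thresholds_db):
    """Return the index of the highest MCS whose threshold is met.

    thresholds_db is a hypothetical ascending list of SINR thresholds;
    index 0 is used as a fallback when no threshold is met.
    """
    chosen = 0
    for index, threshold in enumerate(thresholds_db):
        if sinr_db >= threshold:
            chosen = index
    return chosen
```

With these assumed settings, nine ACKs followed by one NACK leave the offset where it started, which is what holds the long-run error rate near the target.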

Related content


Regularization term · Linear · Basis · state-of-the-art · Singular value decomposition

January 31, 2022

Adaptive Regularized Zero-Forcing Precoding for Massive MIMO Systems with Multi-Antenna Users

Evgeny Bobrov, Boris Chinyaev, Viktor Kuznetsov, Hao Lu, Dmitrii Minenkov, Sergey Troshin, Daniil Yudakov, Danila Zaev

from arxiv, 25 pages, 6 figures, 4 tables, comments are welcome

Modern wireless cellular networks use massive multiple-input multiple-output (MIMO) technology. This technology involves operations with an antenna array at a base station that simultaneously serves multiple mobile devices, which also use multiple antennas on their side. To this end, various precoding and detection techniques are used, allowing each user to receive the signal intended for them from the base station. An important class of linear precoding is Regularized Zero-Forcing (RZF). In this work, we propose Adaptive RZF (ARZF), which uses a special kind of regularization matrix with different coefficients for each layer of multi-antenna users. These regularization coefficients are defined by explicit formulas based on singular value decompositions (SVD) of the user channel matrices. We study the optimization problem solved by the proposed algorithm and its connection to other possible problem statements. We also compare the proposed algorithm with state-of-the-art linear precoding algorithms in simulations with the Quadriga channel model. The proposed approach provides a significant increase in quality with the same computation time as the reference methods.
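As background for the abstract above, classical RZF with a single scalar regularizer is usually written as W = H^H (H H^H + λI)^(-1), where H is the users-by-antennas channel matrix. The sketch below implements only that baseline in plain Python. It is not the proposed ARZF: the per-layer coefficients and their SVD-based formulas are not given in the abstract and are not reproduced here. The function names, the example channel, and the omission of transmit-power normalization are assumptions made for illustration.

```python
def conj_transpose(a):
    """Hermitian transpose of a matrix stored as a list of rows."""
    rows, cols = len(a), len(a[0])
    return [[complex(a[i][j]).conjugate() for i in range(rows)]
            for j in range(cols)]


def matmul(a, b):
    """Matrix product of two list-of-rows matrices."""
    inner = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(a)
    aug = [[complex(v) for v in row] +
           [1.0 + 0j if i == j else 0j for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def rzf_precoder(h, reg):
    """Scalar-regularized RZF: W = H^H (H H^H + reg * I)^(-1).

    h is a users-by-antennas channel matrix. Power normalization is
    deliberately left out of this sketch.
    """
    h_herm = conj_transpose(h)
    gram = matmul(h, h_herm)
    for i in range(len(gram)):
        gram[i][i] += reg
    return matmul(h_herm, inverse(gram))
```

Setting `reg` to zero reduces this to plain zero-forcing, so `matmul(h, rzf_precoder(h, 0.0))` is the identity for a full-row-rank `h`. A positive `reg` trades some inter-user interference for better conditioning.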

Networking · Reducible · Learning · Deep reinforcement learning · Continuity

January 27, 2022

A Deep Reinforcement Learning Approach for Service Migration in MEC-enabled Vehicular Networks

Amine Abouaomar, Zoubeir Mlika, Abderrahime Filali, Soumaya Cherkaoui, Abdellatif Kobbane

Multi-access edge computing (MEC) is a key enabler for reducing the latency of vehicular networks. Because vehicles are mobile, the services they request (e.g., infotainment services) must frequently be migrated across different MEC servers to guarantee their stringent quality-of-service requirements. In this paper, we study the problem of service migration in a MEC-enabled vehicular network in order to minimize the total service latency and migration cost. This problem is formulated as a nonlinear integer program and is linearized so that the optimal solution can be obtained using off-the-shelf solvers. Then, to obtain an efficient solution, it is modeled as a multi-agent Markov decision process and solved with a deep Q-learning (DQL) algorithm. The proposed DQL scheme performs proactive service migration while ensuring service continuity under high-mobility constraints. Finally, simulation results show that the proposed DQL scheme achieves close-to-optimal performance.
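Deep Q-learning replaces a lookup table with a neural network, but the update it approximates is the standard Q-learning rule Q(s, a) ← Q(s, a) + α (r + γ max Q(s', a') − Q(s, a)). The tabular sketch below shows that rule only. It is not the paper's multi-agent scheme, and the state labels, the action names `"stay"` and `"migrate"`, the reward, and the values of `alpha` and `gamma` are hypothetical.

```python
from collections import defaultdict


def make_q_table():
    """Q-table mapping (state, action) pairs to values, defaulting to 0.0."""
    return defaultdict(float)


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning step and return the updated value."""
    # Greedy estimate of the value of the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: a vehicle in cell "s0" migrates its service,
# receives reward 1.0, and ends up in cell "s1".
q_table = make_q_table()
q_learning_update(q_table, "s0", "migrate", 1.0, "s1", ["stay", "migrate"])
```

In a DQL setting the table lookup `q[(state, action)]` would be replaced by a network evaluation, and the same temporal-difference target would serve as the regression label.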

Networking · Deep reinforcement learning · Optimizer · Reinforcement learning · Learning

January 27, 2022

Network Slicing with MEC and Deep Reinforcement Learning for the Internet of Vehicles

Zoubeir Mlika, Soumaya Cherkaoui

The interconnection of vehicles in the future fifth generation (5G) wireless ecosystem forms the so-called Internet of vehicles (IoV). IoV offers new kinds of applications requiring delay-sensitive, compute-intensive and bandwidth-hungry services. Mobile edge computing (MEC) and network slicing (NS) are two of the key enabling technologies in 5G networks that can be used to optimize the allocation of network resources and guarantee the diverse requirements of IoV applications. Because traditional model-based optimization techniques generally end up with NP-hard, strongly non-convex and non-linear mathematical programming formulations, in this paper we introduce a model-free approach based on deep reinforcement learning (DRL) to solve the resource allocation problem in a MEC-enabled IoV network based on network slicing. Furthermore, the solution uses non-orthogonal multiple access (NOMA) to enable better exploitation of the scarce channel resources. The considered problem jointly addresses channel and power allocation, slice selection, and vehicle selection (vehicle grouping). We model the problem as a single-agent Markov decision process and solve it with DRL, using the well-known deep Q-learning (DQL) algorithm. We show that our approach is robust and effective under different network conditions compared to benchmark solutions.

CASES · CC · Networking · Learning · Online

January 27, 2022

Massive IoT Access With NOMA in 5G Networks and Beyond Using Online Competitiveness and Learning

Zoubeir Mlika, Soumaya Cherkaoui

This paper studies the problem of online user grouping, scheduling and power allocation in beyond-5G cellular-based Internet of things networks. Because a massive number of devices try to gain access to the network, a non-orthogonal multiple access (NOMA) method is adopted to accommodate multiple devices in the same radio resource block. Unlike most previous works, the objective is to maximize the number of served devices while allocating their transmission powers such that their real-time requirements and their limited operating energy are respected. First, we formulate the general problem as a mixed-integer non-linear program (MINLP) that can easily be transformed into a mixed-integer linear program (MILP) in some special cases. Second, we study its computational complexity by characterizing the NP-hardness of different special cases. Then, by dividing the problem into multiple NOMA grouping and scheduling subproblems, we propose efficient online competitive algorithms. Further, we show how to use these online algorithms and combine their solutions in a reinforcement learning setting to obtain the power allocation and hence the global solution to the problem. Our analysis is supplemented by simulation results that illustrate the performance of the proposed algorithms in comparison with optimal and state-of-the-art methods.

Weight · Text classification · Learning · Training data · Functional

March 28, 2019

Learning to Weight for Text Classification

Alejandro Moreo Fernández, Andrea Esuli, Fabrizio Sebastiani

from arxiv, To appear in IEEE Transactions on Knowledge and Data Engineering

In information retrieval (IR) and related tasks, term weighting approaches typically consider the frequency of the term in the document and in the collection in order to compute a score reflecting the importance of the term for the document. In tasks characterized by the presence of training data (such as text classification), it seems logical that the term weighting function should take into account the distribution (as estimated from training data) of the term across the classes of interest. Although "supervised term weighting" approaches that use this intuition have been described before, they have failed to show consistent improvements. In this article we analyse the possible reasons for this failure and call consolidated assumptions into question. Following this criticism, we propose a novel supervised term weighting approach that, instead of relying on any predefined formula, learns a term weighting function optimised on the training set of interest; we dub this approach Learning to Weight (LTW). The experiments that we run on several well-known benchmarks, using different learning methods, show that our method outperforms previous term weighting approaches in text classification.
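The frequency-based weighting described in the first sentence of this abstract is typified by tf-idf, which multiplies a term's count in a document by the log of the inverse fraction of documents containing it. The sketch below shows that unsupervised baseline only, as one common variant among several. It is not the LTW method, which learns the weighting function instead. The function name and the toy documents are assumptions.

```python
import math
from collections import Counter


def tfidf_weights(documents):
    """Return one {term: weight} dict per tokenized document.

    Uses raw term counts and idf = log(N / df), one common variant.
    """
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        term_freq = Counter(doc)
        weights.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return weights


# Hypothetical two-document collection.
docs = [["cheap", "loans", "cheap"], ["loans", "rates"]]
weights = tfidf_weights(docs)
```

Here "loans" appears in every document and receives weight zero, regardless of which class each document belongs to. That blindness to class labels is the gap that supervised term weighting aims to close.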

Robot operating platform · Reinforcement learning · Learning · CASES · Robot

March 14, 2019

gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo

Nestor Gonzalez Lopez, Yue Leire Erro Nuin, Elias Barba Moral, Lander Usategui San Juan, Alejandro Solano Rueda, Víctor Mayoral Vilches, Risto Kojcev

This paper presents an upgraded, real-world, application-oriented version of gym-gazebo, the Robot Operating System (ROS) and Gazebo-based Reinforcement Learning (RL) toolkit, which complies with OpenAI Gym. The paper discusses the new ROS 2-based software architecture and summarizes the results obtained using Proximal Policy Optimization (PPO). Ultimately, this work provides a benchmarking system for robotics that allows different techniques and algorithms to be compared under the same virtual conditions. We have evaluated environments of different levels of complexity for the Modular Articulated Robotic Arm (MARA), reaching accuracies on the millimeter scale. The converged results show the feasibility and usefulness of the gym-gazebo2 toolkit, as well as its potential and applicability in industrial use cases using modular robots.

Coordinate descent · Optimizer · Performer · Learning · Online

July 16, 2018

Accelerated Randomized Coordinate Descent Algorithms for Stochastic Optimization and Online Learning

Akshita Bhandari, Chandramani Singh

from arxiv, 20 pages, 4 figures, 2 tables

We propose accelerated randomized coordinate descent algorithms for stochastic optimization and online learning. Our algorithms have significantly lower per-iteration complexity than the known accelerated gradient algorithms. The proposed algorithms for online learning have better regret performance than the known randomized online coordinate descent algorithms. Furthermore, the proposed algorithms for stochastic optimization exhibit convergence rates as good as those of the best known randomized coordinate descent algorithms. We also show simulation results to demonstrate the performance of the proposed algorithms.
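The lower per-iteration cost comes from updating one coordinate at a time instead of the full gradient. The sketch below shows plain (non-accelerated) randomized coordinate descent on a small quadratic, to illustrate that basic step. It does not implement the accelerated variants proposed in the paper, and the objective, function name, iteration count, and seed are assumptions made for illustration.

```python
import random


def randomized_coordinate_descent(a, b, iterations=500, seed=0):
    """Minimize 0.5 * x^T A x - b^T x for a symmetric positive definite A.

    Each iteration picks one coordinate uniformly at random and minimizes
    the objective exactly along it, which costs O(n) rather than O(n^2).
    """
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        i = rng.randrange(n)
        # Partial derivative along coordinate i: (A x - b)_i.
        grad_i = sum(a[i][j] * x[j] for j in range(n)) - b[i]
        # Exact line minimization along coordinate i (step size 1 / A_ii).
        x[i] -= grad_i / a[i][i]
    return x


# Hypothetical 2x2 problem whose minimizer solves A x = b.
solution = randomized_coordinate_descent([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

For this example the minimizer is x = (1/11, 7/11), the solution of the linear system A x = b.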

Extensibility · Networking · Deep reinforcement learning · INFORMS · state-of-the-art

July 9, 2018

Video Summarisation by Classification with Deep Reinforcement Learning

Kaiyang Zhou, Tao Xiang, Andrea Cavallaro

from arxiv, To appear in BMVC 2018

Most existing video summarisation methods are based on either supervised or unsupervised learning. In this paper, we propose a reinforcement learning-based weakly supervised method that exploits easy-to-obtain, video-level category labels and encourages summaries to contain category-related information and maintain category recognisability. Specifically, we formulate video summarisation as a sequential decision-making process and train a summarisation network with deep Q-learning (DQSN). A companion classification network is also trained to provide rewards for training the DQSN. With the classification network, we develop a global recognisability reward based on the classification result. Critically, a novel dense ranking-based reward is also proposed to cope with the temporally delayed and sparse reward problems of long-sequence reinforcement learning. Extensive experiments on two benchmark datasets show that the proposed approach achieves state-of-the-art performance.

Learning · Optimizer · Extensibility · MoDELS · Next

March 23, 2018

Learning Recommendations While Influencing Interests

Rahul Meshram, D. Manjunath, Nikhil Karamchandani

from arxiv, 13 pages, submitted to conference

Personalized recommendation systems (RS) are extensively used in many services. Many of these are based on learning algorithms in which the RS uses the recommendation history and the user response to learn an optimal strategy. Further, these algorithms assume that user interests are rigid. Specifically, they do not account for the effect of the learning strategy on the evolution of user interests. In this paper we develop influence models for a learning algorithm that is used to optimally recommend websites to web users. We adapt the model of \cite{Ioannidis10} to include an item-dependent reward to the RS from the suggestions that are accepted by the user. To this end, we first develop a static optimisation scheme for the case where all the parameters are known. Next, we develop a stochastic-approximation-based learning scheme for the RS to learn the optimal strategy when the user profiles are not known. Finally, we describe several user-influence models for the learning algorithm and analyze their effect on the steady-state user interests and on the steady-state optimal strategy, compared with the case in which the users are not influenced.

January 5, 2018

Deep Reinforcement Learning for List-wise Recommendations

Xiangyu Zhao, Liang Zhang, Zhuoye Ding, Dawei Yin, Yihong Zhao, Jiliang Tang

Recommender systems play a crucial role in mitigating the problem of information overload by suggesting personalized items or services to users. The vast majority of traditional recommender systems treat the recommendation procedure as a static process and make recommendations following a fixed strategy. In this paper, we propose a novel recommender system that can continuously improve its strategies during its interactions with users. We model the sequential interactions between users and a recommender system as a Markov Decision Process (MDP) and leverage Reinforcement Learning (RL) to automatically learn the optimal strategies by recommending items on a trial-and-error basis and receiving reinforcement for these items from users' feedback. In particular, we introduce an online user-agent interaction environment simulator, which can pre-train and evaluate model parameters offline before the model is applied online. Moreover, we validate the importance of list-wise recommendations during the interactions between users and the agent, and develop a novel approach to incorporate them into the proposed framework, LIRD, for list-wise recommendations. The experimental results based on a real-world e-commerce dataset demonstrate the effectiveness of the proposed framework.