
Deep Reinforcement Learning (DRL) and Evolution Strategies (ESs) have surpassed human-level control in many sequential decision-making problems, yet many open challenges still exist. To get insights into the strengths and weaknesses of DRL versus ESs, an analysis of their respective capabilities and limitations is provided. After presenting their fundamental concepts and algorithms, a comparison is provided on key aspects such as scalability, exploration, adaptation to dynamic environments, and multi-agent learning. Then, the benefits of hybrid algorithms that combine concepts from DRL and ESs are highlighted. Finally, to have an indication about how they compare in real-world applications, a survey of the literature for the set of applications they support is provided.
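As a rough illustration of the ES side of this comparison, the sketch below implements a basic Gaussian-perturbation evolution strategy in plain Python. It is not taken from the surveyed paper: the function names, the toy quadratic fitness, and all hyperparameters (`sigma`, `lr`, `population`) are assumptions chosen for illustration. Note that the search needs only fitness scores for perturbed parameters, not gradients of the objective.

```python
import random


def evolution_strategy(fitness, theta, sigma=0.1, lr=0.01,
                       population=50, iterations=200, seed=0):
    """Maximize `fitness` with a simple Gaussian-perturbation ES (illustrative)."""
    rng = random.Random(seed)
    theta = list(theta)
    n = len(theta)
    for _ in range(iterations):
        # Sample one Gaussian perturbation per population member.
        noises = [[rng.gauss(0.0, 1.0) for _ in range(n)]
                  for _ in range(population)]
        # Score each perturbed parameter vector; no gradients are required.
        scores = [fitness([t + sigma * e for t, e in zip(theta, eps)])
                  for eps in noises]
        # Standardize the scores so the step size does not depend on their scale.
        mean = sum(scores) / population
        std = (sum((s - mean) ** 2 for s in scores) / population) ** 0.5 or 1.0
        advantages = [(s - mean) / std for s in scores]
        # Move theta along the score-weighted average of the perturbations.
        for i in range(n):
            grad_i = sum(a * eps[i] for a, eps in zip(advantages, noises))
            theta[i] += lr * grad_i / (population * sigma)
    return theta


def toy_fitness(params):
    # Hypothetical objective: negative squared distance to a fixed target.
    target = [0.5, -0.3, 0.8]
    return -sum((p - t) ** 2 for p, t in zip(params, target))


solution = evolution_strategy(toy_fitness, [0.0, 0.0, 0.0])
```

In an RL setting, `fitness` would typically be the return of an episode run with the perturbed policy parameters, which is why each population member can be evaluated independently.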

Related content

Deep Reinforcement Learning


Deep reinforcement learning (DRL) is a machine learning approach that extends traditional reinforcement learning methods with deep learning techniques. The main task of traditional reinforcement learning is to enable an agent to learn reward-maximizing behavior from the rewards it receives from the environment. However, traditional model-free reinforcement learning methods need function approximation techniques so that the agent can learn a value function or a policy. In this setting, the strong function approximation ability of deep learning naturally became the best means of replacing hand-specified features, and it made better-performing end-to-end learning possible.
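To make the notion of learning a value function from rewards concrete, here is a minimal sketch of the tabular Q-learning update, the classic model-free rule that deep methods replace with a neural-network approximator. The state and action names and the values of `alpha` and `gamma` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value."""
    # Bootstrap from the best action value available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the TD target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


q = defaultdict(float)          # all Q-values start at 0.0
actions = ["left", "right"]     # hypothetical action set
first = q_learning_update(q, "s0", "right", 1.0, "s1", actions)   # 0.5
second = q_learning_update(q, "s0", "right", 1.0, "s1", actions)  # 0.75
```

A table like `q` only works when states can be enumerated; with high-dimensional observations the table is replaced by a parameterized function, which is where deep learning enters.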

Recommender systems · Learning · Reinforcement learning · Policy search · INTERACT

September 22, 2021

A Survey on Reinforcement Learning for Recommender Systems

Yuanguo Lin,Yong Liu,Fan Lin,Pengcheng Wu,Wenhua Zeng,Chunyan Miao

from arxiv, 25 pages, 4 figures

Recommender systems have been widely applied in different real-life scenarios to help us find useful information. Recently, Reinforcement Learning (RL) based recommender systems have become an emerging research topic. They often surpass traditional recommendation models, and even most deep learning-based methods, owing to their interactive nature and autonomous learning ability. Nevertheless, there are various challenges when applying RL to recommender systems. Toward this end, we first provide a thorough overview, comparison, and summary of RL approaches for five typical recommendation scenarios, following three main categories of RL: value-function, policy search, and Actor-Critic. Then, we systematically analyze the challenges and relevant solutions on the basis of existing literature. Finally, in discussing the open issues of RL and its limitations for recommendation, we highlight some potential research directions in this field.

INFORMS · Learning · Identifiable · INTERACT · Guidance

September 15, 2021

Exploration in Deep Reinforcement Learning: A Comprehensive Survey

Tianpei Yang,Hongyao Tang,Chenjia Bai,Jinyi Liu,Jianye Hao,Zhaopeng Meng,Peng Liu

from arxiv, Repolishment is made, revise some incorrect descriptions

Deep Reinforcement Learning (DRL) and Deep Multi-agent Reinforcement Learning (MARL) have achieved significant success across a wide range of domains, such as game AI, autonomous vehicles, robotics and finance. However, DRL and deep MARL agents are widely known to be sample-inefficient, and millions of interactions are usually needed even for relatively simple game settings, which prevents wide application in real industrial scenarios. One key bottleneck is the well-known exploration problem, i.e., how to efficiently explore unknown environments and collect the informative experiences that benefit policy learning most. In this paper, we conduct a comprehensive survey of existing exploration methods in DRL and deep MARL in order to provide understanding and insight into the critical problems and solutions. We first identify several key challenges to achieving efficient exploration, which most exploration methods aim to address. Then we provide a systematic survey of existing approaches by classifying them into two major categories: uncertainty-oriented exploration and intrinsic motivation-oriented exploration. The essence of uncertainty-oriented exploration is to leverage the quantification of epistemic and aleatoric uncertainty to derive efficient exploration. By contrast, intrinsic motivation-oriented exploration methods usually incorporate different reward-agnostic information for intrinsic exploration guidance. Beyond these two main branches, we also summarize other exploration methods that adopt sophisticated techniques but are difficult to classify into the above two categories. In addition, we provide a comprehensive empirical comparison of exploration methods for DRL on a set of commonly used benchmarks. Finally, we summarize the open problems of exploration in DRL and deep MARL and point out a few future directions.
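One simple way to picture reward-agnostic exploration guidance is a count-based novelty bonus, sketched below. This is an illustrative example only: the abstract does not say how the survey classifies count-based methods, and the class name, the `beta` coefficient, and the `beta / sqrt(N(s))` form are assumptions made for this sketch.

```python
import math
from collections import defaultdict


class CountBonus:
    """Intrinsic reward that shrinks as a state is visited more often."""

    def __init__(self, beta=1.0):
        self.beta = beta                 # hypothetical scaling coefficient
        self.counts = defaultdict(int)   # visit count per (hashable) state

    def __call__(self, state):
        # Record the visit, then return beta / sqrt(N(state)).
        self.counts[state] += 1
        return self.beta / math.sqrt(self.counts[state])


bonus = CountBonus(beta=1.0)
first_visit = bonus("s0")    # 1.0 on the first visit
second_visit = bonus("s0")   # smaller on the second visit
new_state = bonus("s1")      # an unseen state gets the full bonus again
```

During training, such a bonus would be added to the environment reward so that rarely visited states look temporarily more attractive to the agent.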

contrastive · Learning · Representation learning · Performer · Better

March 20, 2021

Self-supervised Learning: Generative or Contrastive

Xiao Liu,Fanjin Zhang,Zhenyu Hou,Zhaoyu Wang,Li Mian,Jing Zhang,Jie Tang

from arxiv, 24 pages, 19 figures

Deep supervised learning has achieved great success in the last decade. However, its deficiencies of dependence on manual labels and vulnerability to attacks have driven people to explore a better solution. As an alternative, self-supervised learning attracts many researchers for its soaring performance on representation learning in the last several years. Self-supervised representation learning leverages input data itself as supervision and benefits almost all types of downstream tasks. In this survey, we take a look into new self-supervised learning methods for representation in computer vision, natural language processing, and graph learning. We comprehensively review the existing empirical methods and summarize them into three main categories according to their objectives: generative, contrastive, and generative-contrastive (adversarial). We further investigate related theoretical analysis work to provide deeper thoughts on how self-supervised learning works. Finally, we briefly discuss open problems and future directions for self-supervised learning. An outline slide for the survey is provided.

Transfer learning · Learning · state-of-the-art · Boosting (a way of accelerating model training) · FAST

September 16, 2020

Transfer Learning in Deep Reinforcement Learning: A Survey

Zhuangdi Zhu,Kaixiang Lin,Jiayu Zhou

This paper surveys the field of transfer learning in the problem setting of Reinforcement Learning (RL). RL has been the key solution to sequential decision-making problems. Along with the fast advance of RL in various domains, including robotics and game-playing, transfer learning arises as an important technique to assist RL by leveraging and transferring external expertise to boost the learning process. In this survey, we review the central issues of transfer learning in the RL domain, providing a systematic categorization of its state-of-the-art techniques. We analyze their goals, methodologies, applications, and the RL frameworks under which these transfer learning techniques would be approachable. We discuss the relationship between transfer learning and other relevant topics from an RL perspective and also explore the potential challenges as well as future development directions for transfer learning in RL.

Reinforcement learning · Learning · tuning · Episode · Directed

January 19, 2020

A Survey of Reinforcement Learning Techniques: Strategies, Recent Development, and Future Directions

Amit Kumar Mondal,Nadeem Jamali

Reinforcement learning is one of the core components in designing an artificially intelligent system emphasizing real-time response. Reinforcement learning influences the system to take actions within an arbitrary environment, with or without previous knowledge of the environment model. In this paper, we present a comprehensive study on Reinforcement Learning focusing on various dimensions including challenges, the recent development of different state-of-the-art techniques, and future directions. The fundamental objective of this paper is to provide a framework for presenting the available reinforcement learning methods that is informative enough and simple to follow for new researchers and academics in this domain, considering the latest concerns. First, we illustrate the core techniques of reinforcement learning in an easily understandable and comparable way. Finally, we analyze and depict the recent developments in reinforcement learning approaches. Our analysis indicates that most of the models focus on tuning policy values rather than tuning other things in a particular state of reasoning.

Learning · Deep reinforcement learning · Reinforcement learning · Sample complexity · Atari

January 10, 2019

Accelerated Methods for Deep Reinforcement Learning

Adam Stooke,Pieter Abbeel

from arxiv, v2: -Added game performance statistics summary for algorithm scaling across full Atari game set. -Added full set of learning curves (appendix). -Fixed images to remove phantom borders. -Streamlined some discussion, moved some details to appendix

Deep reinforcement learning (RL) has achieved many recent successes, yet experiment turn-around time remains a key bottleneck in research and in practice. We investigate how to optimize existing deep RL algorithms for modern computers, specifically for a combination of CPUs and GPUs. We confirm that both policy gradient and Q-value learning algorithms can be adapted to learn using many parallel simulator instances. We further find it possible to train using batch sizes considerably larger than are standard, without negatively affecting sample complexity or final performance. We leverage these facts to build a unified framework for parallelization that dramatically hastens experiments in both classes of algorithm. All neural network computations use GPUs, accelerating both data collection and training. Our results include using an entire DGX-1 to learn successful strategies in Atari games in mere minutes, using both synchronous and asynchronous algorithms.

Learning · Reinforcement learning · Deep reinforcement learning · Continuity · Performer

December 31, 2018

Deep Reinforcement Learning for Multi-Agent Systems: A Review of Challenges, Solutions and Applications

Thanh Thi Nguyen,Ngoc Duy Nguyen,Saeid Nahavandi

from arxiv, 24 pages, 11 figures

Reinforcement learning (RL) algorithms have been around for decades and been employed to solve various sequential decision-making problems. These algorithms however have faced great challenges when dealing with high-dimensional environments. The recent development of deep learning has enabled RL methods to drive optimal policies for sophisticated and capable agents, which can perform efficiently in these challenging environments. This paper addresses an important aspect of deep RL related to situations that demand multiple agents to communicate and cooperate to solve complex tasks. A survey of different approaches to problems related to multi-agent deep RL (MADRL) is presented, including non-stationarity, partial observability, continuous state and action spaces, multi-agent training schemes, multi-agent transfer learning. The merits and demerits of the reviewed methods will be analyzed and discussed, with their corresponding applications explored. It is envisaged that this review provides insights about various MADRL methods and can lead to future development of more robust and highly useful multi-agent learning methods for solving real-world problems.

Graph · Learning · Neural Networks · Deep learning · Networking

December 11, 2018

Deep Learning on Graphs: A Survey

Ziwei Zhang,Peng Cui,Wenwu Zhu

from arxiv, 15 pages, 10 figures

Deep learning has been shown to be successful in a number of domains, ranging from acoustics and images to natural language processing. However, applying deep learning to the ubiquitous graph data is non-trivial because of the unique characteristics of graphs. Recently, a significant amount of research effort has been devoted to this area, greatly advancing graph analysis techniques. In this survey, we comprehensively review different kinds of deep learning methods applied to graphs. We divide existing methods into three main categories: semi-supervised methods including Graph Neural Networks and Graph Convolutional Networks, unsupervised methods including Graph Autoencoders, and recent advancements including Graph Recurrent Neural Networks and Graph Reinforcement Learning. We then provide a comprehensive overview of these methods in a systematic manner following their history of development. We also analyze the differences between these methods and how to compose different architectures. Finally, we briefly outline their applications and discuss potential future directions.

Deep reinforcement learning · Learning · Reinforcement learning · Generalization theory · BASIC

December 3, 2018

An Introduction to Deep Reinforcement Learning

Vincent Francois-Lavet,Peter Henderson,Riashat Islam,Marc G. Bellemare,Joelle Pineau

Deep reinforcement learning is the combination of reinforcement learning (RL) and deep learning. This field of research has been able to solve a wide range of complex decision-making tasks that were previously out of reach for a machine. Thus, deep RL opens up many new applications in domains such as healthcare, robotics, smart grids, finance, and many more. This manuscript provides an introduction to deep reinforcement learning models, algorithms and techniques. Particular focus is on the aspects related to generalization and how deep RL can be used for practical applications. We assume the reader is familiar with basic machine learning concepts.

Learning · Deep Q-network · Q-network · Value function · Learning to learn

November 26, 2018

Deep Reinforcement Learning: An Overview

Yuxi Li

from arxiv, Please see Deep Reinforcement Learning, arXiv:1810.06339, for a significant update

We give an overview of recent exciting achievements of deep reinforcement learning (RL). We discuss six core elements, six important mechanisms, and twelve applications. We start with background of machine learning, deep learning and reinforcement learning. Next we discuss core RL elements, including value function, in particular, Deep Q-Network (DQN), policy, reward, model, planning, and exploration. After that, we discuss important mechanisms for RL, including attention and memory, unsupervised learning, transfer learning, multi-agent RL, hierarchical RL, and learning to learn. Then we discuss various applications of RL, including games, in particular, AlphaGo, robotics, natural language processing, including dialogue systems, machine translation, and text generation, computer vision, neural architecture design, business management, finance, healthcare, Industry 4.0, smart grid, intelligent transportation systems, and computer systems. We mention topics not reviewed yet, and list a collection of RL resources. After presenting a brief summary, we close with discussions. Please see Deep Reinforcement Learning, arXiv:1810.06339, for a significant update.