Distributional Reinforcement Learning for Efficient Exploration.
Borislav Mavrin, Shangtong Zhang, Hengshuai Yao, Linglong Kong, Kaiwen Wu, Yaoliang Yu
Published in: CoRR (2019)
Keyphrases
- reinforcement learning
- co-occurrence
- function approximation
- temporal difference learning
- model-free
- reinforcement learning algorithms
- temporal difference
- Markov decision processes
- optimal policy
- state space
- direct policy search
- robot control
- multi-agent
- machine learning
- evolutionary algorithm
- learning process
- case study
- control problems
- neural network
- databases
- knowledge base
- social networks
- learning algorithm
- real world
- stochastic approximation