Distributional Reinforcement Learning for Efficient Exploration.

Borislav Mavrin, Shangtong Zhang, Hengshuai Yao, Linglong Kong, Kaiwen Wu, Yaoliang Yu

Published in: CoRR (2019)

Keyphrases

reinforcement learning
co-occurrence
function approximation
temporal difference learning
model-free
reinforcement learning algorithms
temporal difference
Markov decision processes
optimal policy
state space
direct policy search
robot control
multi-agent
machine learning
evolutionary algorithm
learning process
case study
control problems
neural network
databases
knowledge base
social networks
learning algorithm
real-world
stochastic approximation