Increasing the Action Gap: New Operators for Reinforcement Learning.

Marc G. Bellemare, Georg Ostrovski, Arthur Guez, Philip S. Thomas, Rémi Munos

Published in: CoRR (2015)

Keyphrases

reinforcement learning
action selection
partially observable domains
action space
model free
function approximation
reward shaping
reinforcement learning algorithms
state action
Markov decision processes
optimal policy
robotic control
multi agent
temporal difference
learning algorithm
learning capabilities
neural network
transition model
fitted Q iteration
continuous state
function approximators
human actions
mobile robot
knowledge base
machine learning