Action Branching Architectures for Deep Reinforcement Learning.

Arash Tavakoli, Fabio Pardo, Petar Kormushev

Published in: CoRR (2017)

Keyphrases

reinforcement learning
action selection
action space
partially observable domains
function approximation
reward shaping
learning algorithm
Markov decision processes
model free
temporal difference
state action
transition model
optimal control
state space
temporal difference learning
sensory inputs
fitted Q iteration
branch and bound
optimal policy
reinforcement learning algorithms
lower bound
genetic algorithm
dynamic programming
multi agent