GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms.

Cédric Colas, Olivier Sigaud, Pierre-Yves Oudeyer

Published in: CoRR (2018)

Keyphrases

reinforcement learning algorithms
reinforcement learning
state space
model free
gene expression programming
markov decision processes
temporal difference
eligibility traces
reinforcement learning problems
learning algorithm
reinforcement learning methods
function approximation
dynamic environments
partially observable environments
genetic programming
policy search
action selection
reward function
evolutionary computation
feature selection
optimal policy
function approximators
artificial neural networks
multi agent
multiagent reinforcement learning
machine learning