Curious Hierarchical Actor-Critic Reinforcement Learning.

Frank Röder, Manfred Eppe, Phuong D. H. Nguyen, Stefan Wermter

Published in: CoRR (2020)

Keyphrases

actor-critic
reinforcement learning
policy gradient
temporal difference
optimal control
reinforcement learning algorithms
approximate dynamic programming
neuro-fuzzy
function approximation
gradient method
policy iteration
Markov decision processes
average reward
learning algorithm
supervised learning
state space
control problems
model-free
policy gradient methods
learning problems
Markov decision process
transfer learning
optimal policy
stochastic games
temporal difference learning
RL algorithms
multi-agent
neural network