Generalized exploration in policy search.
Herke van Hoof
Daniel Tanneberg
Jan Peters
Published in:
Mach. Learn. (2017)
Keyphrases
policy search
reinforcement learning
dynamic programming
continuous action
continuous state
reinforcement learning algorithms
Markov decision problems
reward function
policy gradient