Online learning in episodic Markovian decision processes by relative entropy policy search.
Alexander Zimin
Gergely Neu
Published in:
NIPS (2013)
Keyphrases
relative entropy
policy search
reinforcement learning
information theoretic
covariance matrix
reinforcement learning algorithms
dynamic programming
reward function
state space
Markov decision problems
learning algorithm
search space
maximum entropy