Targeting Specific Distributions of Trajectories in MDPs

David L. Roberts, Mark J. Nelson, Charles Lee Isbell Jr., Michael Mateas, Michael L. Littman

Published in: AAAI (2006)

Keyphrases

reinforcement learning
Markov decision processes
high level
domain specific
heavy tailed
neural network
search space
probability distribution
state space
higher level
optimal policy
joint distribution
power law
Markov decision process
finite horizon