Environmental statistics and the trade-off between model-based and TD learning in humans.

Dylan A. Simon, Nathaniel D. Daw

Published in: NIPS (2011)

Keyphrases

trade off
TD learning
temporal difference
evaluation function
model free
function approximation
policy evaluation
reinforcement learning algorithms
reinforcement learning
multi step
Monte Carlo
learning experience
confidence intervals
multi objective
multiresolution
policy iteration
multi agent