Optimistic initialization and greediness lead to polynomial time learning in factored MDPs.
Istvan Szita
András Lörincz
Published in:
ICML (2009)
Keyphrases
learning algorithm
learning tasks
factored mdps
learning process
reinforcement learning
supervised learning
decision making
multi agent
special case
domain independent