Learning is planning: near Bayes-optimal reinforcement learning via Monte-Carlo tree search.
John Asmuth, Michael L. Littman
Published in: UAI (2011)
Keyphrases
- reinforcement learning
- learning process
- bayes optimal
- reinforcement learning methods
- learning algorithm
- temporal difference learning
- action selection
- supervised learning
- function approximation
- learning problems
- learning curve
- version space
- monte carlo tree search
- active learning
- learning curves
- distribution free
- hypothesis space
- learning tasks
- dynamic programming
- learning capabilities
- temporal difference
- optimal solution