Unifying Value Iteration, Advantage Learning, and Dynamic Policy Programming.
Tadashi Kozuno, Eiji Uchibe, Kenji Doya
Published in: CoRR (2017)
Keyphrases
- online learning
- learning process
- computer programming
- programming language
- learning algorithm
- learning systems
- active learning
- knowledge acquisition
- optimal policy
- Markov decision processes
- learning tasks
- unsupervised learning
- programming education
- introductory programming
- inductive inference
- learning analytics
- supervised learning
- prior knowledge
- multi-agent
- e-learning