A Reduction from Reinforcement Learning to No-Regret Online Learning.
Ching-An Cheng, Remi Tachet des Combes, Byron Boots, Geoffrey J. Gordon
Published in: CoRR (2019)
Keyphrases
- online learning
- reinforcement learning
- computer-mediated
- higher education
- online course
- e-learning
- dynamic programming
- distance education
- online algorithms
- model-free
- function approximation
- Markov decision processes
- active learning
- distance learning
- learning algorithm
- transfer learning
- state space
- online learning environments
- temporal difference
- neural network
- online convex optimization
- blended learning
- optimal control
- learning process
- machine learning
- learning tasks
- action space
- multi-agent