Safely Bridging Offline and Online Reinforcement Learning.
Wanqiao Xu, Kan Xu, Hamsa Bastani, Osbert Bastani
Published in: CoRR (2021)
Keyphrases
- reinforcement learning
- real time
- online learning
- function approximation
- machine learning
- multi-agent
- dynamic programming
- state space
- learning process
- artificial neural networks
- learning algorithm
- database
- search engine
- Markov decision processes
- temporal difference
- online advertising
- online environment
- online resources