MOORe: Model-based Offline-to-Online Reinforcement Learning.
Yihuan Mao, Chao Wang, Bin Wang, Chongjie Zhang
Published in: CoRR (2022)
Keyphrases
- reinforcement learning
- real time
- model free
- online learning
- state space
- function approximation
- data driven
- data sets
- online algorithms
- optimal policy
- dynamic programming
- multi agent
- case study
- artificial intelligence
- search space
- transfer learning
- Markov decision processes
- temporal difference
- online advertising
- action space
- real world
- temporal difference learning