MOORe: Model-based Offline-to-Online Reinforcement Learning.

Yihuan Mao, Chao Wang, Bin Wang, Chongjie Zhang

Published in: CoRR (2022)

Keyphrases

reinforcement learning
real time
model free
online learning
state space
function approximation
data driven
data sets
online algorithms
optimal policy
dynamic programming
multi agent
case study
artificial intelligence
search space
transfer learning
Markov decision processes
temporal difference
online advertising
action space
real world
temporal difference learning