SERA: Sample Efficient Reward Augmentation in offline-to-online Reinforcement Learning.

Ziqi Zhang, Xiao Xiong, Zifeng Zhuang, Jinxin Liu, Donglin Wang

Published in: CoRR (2023)

Keyphrases

reinforcement learning
balancing exploration and exploitation
real time
online learning
computationally expensive
multi agent
state space
least squares
cost effective
function approximation
machine learning
website
dynamic programming
markov decision processes
temporal difference
small sample