Reinforcement Learning Experience Reuse with Policy Residual Representation.

Wen-Ji Zhou, Yang Yu, Yingfeng Chen, Kai Guan, Tangjie Lv, Changjie Fan, Zhi-Hua Zhou

Published in: IJCAI (2019)

Keyphrases

reinforcement learning
optimal policy
policy search
image representation
partially observable
action selection
learning algorithm
representation scheme
reward function
function approximation
user experience
multi agent
Markov decision processes
learning objects
supervised learning
information technology
reinforcement learning algorithms
multiscale
Markov decision process
action space
state dependent
policy gradient
machine learning
partially observable environments