Reinforcement Learning Experience Reuse with Policy Residual Representation.
Wen-Ji Zhou, Yang Yu, Yingfeng Chen, Kai Guan, Tangjie Lv, Changjie Fan, Zhi-Hua Zhou
Published in: IJCAI (2019)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- image representation
- partially observable
- action selection
- learning algorithm
- representation scheme
- reward function
- function approximation
- user experience
- multi-agent
- Markov decision processes
- learning objects
- supervised learning
- information technology
- reinforcement learning algorithms
- multiscale
- Markov decision process
- action space
- state dependent
- policy gradient
- machine learning
- partially observable environments