Reinforcement Learning Experience Reuse with Policy Residual Representation.
Wen-Ji Zhou, Yang Yu, Yingfeng Chen, Kai Guan, Tangjie Lv, Changjie Fan, Zhi-Hua Zhou
Published in: CoRR (2019)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- Markov decision process
- action selection
- state space
- Markov decision problems
- reinforcement learning algorithms
- representation scheme
- partially observable environments
- reinforcement learning problems
- state dependent
- policy iteration
- neural network
- temporal difference
- Markov decision processes
- learning process
- learning algorithm