Preferential Experience Collection with Frequency based Intrinsic Reward for Deep Reinforcement Learning.
Hongyin Zhang, Qiangxing Tian, Donglin Wang, Kaichen Wei
Published in: ICTAI (2020)
Keyphrases
- reinforcement learning
- function approximation
- reinforcement learning algorithms
- markov decision processes
- learning algorithm
- state space
- reward function
- action selection
- low frequency
- optimal policy
- learning process
- multi agent
- database
- model free
- eligibility traces
- temporal difference
- average reward
- policy search
- inverse reinforcement learning
- optimal control
- user experience
- learning capabilities
- partially observable
- deep learning
- control policy
- transfer learning
- reinforcement learning methods
- supervised learning
- dynamic programming
- knowledge base
- reward shaping