A Sample Aggregation Approach to Experiences Replay of Dyna-Q Learning.
Haobin Shi, Shike Yang, Kao-Shing Hwang, Jialin Chen, Mengkai Hu, Heng-sheng Zhang
Published in: IEEE Access (2018)
Keyphrases
- function approximation
- temporal difference learning
- reinforcement learning
- cooperative
- state space
- model free
- reinforcement learning methods
- temporal difference
- learning algorithm
- sample size
- case study
- multi agent
- rank aggregation
- small sample
- learning tasks
- aggregation functions
- machine learning
- sample points
- function approximators
- data aggregation
- optimal policy
- past experience
- aggregation operators
- data samples
- action selection
- rl algorithms
- fixed point
- td learning
- learning from experience