A Sample Aggregation Approach to Experiences Replay of Dyna-Q Learning.

Haobin Shi, Shike Yang, Kao-Shing Hwang, Jialin Chen, Mengkai Hu, Heng-sheng Zhang

Published in: IEEE Access (2018)

Keyphrases

function approximation
temporal difference learning
reinforcement learning
cooperative
state space
model-free
reinforcement learning methods
temporal difference
learning algorithm
sample size
case study
multi-agent
rank aggregation
small sample
learning tasks
aggregation functions
machine learning
sample points
function approximators
data aggregation
optimal policy
past experience
aggregation operators
data samples
action selection
RL algorithms
fixed point
TD learning
learning from experience