Enhanced Reinforcement Learning by Recursive Updating of Q-values for Reward Propagation.

Yunsick Sung, Eunyoung Ahn, Kyungeun Cho

Published in: ICITCS (2012)

Keyphrases

reinforcement learning
learning algorithm
Markov decision processes
optimal policy
function approximation
eligibility traces
model-free
state space
multi-agent reinforcement learning
machine learning
average reward
learning agent
reinforcement learning algorithms
reward function
attribute values
dynamic programming
partially observable environments
neural network
agent receives
recursive algorithm
temporal difference learning
long run
optimal control
hidden Markov models
multi-agent