Enhanced Reinforcement Learning by Recursive Updating of Q-values for Reward Propagation.
Yunsick Sung, Eunyoung Ahn, Kyungeun Cho
Published in: ICITCS (2012)
Keyphrases
- reinforcement learning
- learning algorithm
- Markov decision processes
- optimal policy
- function approximation
- eligibility traces
- model-free
- state space
- multi-agent reinforcement learning
- machine learning
- average reward
- learning agent
- reinforcement learning algorithms
- reward function
- attribute values
- dynamic programming
- partially observable environments
- neural network
- agent receives
- recursive algorithm
- temporal difference learning
- long run
- optimal control
- hidden Markov models
- multi-agent