Learning to Play Pac-Xon with Q-Learning and Two Double Q-Learning Variants.

Jits Schilperoort, Ivar Mak, Madalina M. Drugan, Marco A. Wiering

Published in: SSCI (2018)

Keyphrases

learning algorithm
reinforcement learning
cooperative
action selection
function approximation
temporal difference learning
multiagent learning
state space
reinforcement learning algorithms
learning agent
temporal difference methods
reinforcement learning methods
noise-tolerant
learning systems
learning process
e-learning
learning problems
model-free
learning tasks
efficient learning
supervised learning
upper bound
mobile robot
active learning
multi-agent reinforcement learning
relational reinforcement learning
credit assignment