Internally Driven Q-learning - Convergence and Generalization Results.

Eduardo Alonso, Esther Mondragón, Niclas Kjäll-Ohlsson

Published in: ICAART (1) (2012)

Keyphrases

stochastic approximation
reinforcement learning
stochastic shortest path
convergence proof
function approximation
data driven
learning algorithm
cooperative
convergence rate
multi agent
state space
dynamic programming
model free
learning rate
faster convergence
action selection
optimal policy
decision trees
bucket brigade
learning tasks
convergence speed
real time
Monte Carlo
data sets