On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems.

Huizhen Yu, Dimitri P. Bertsekas

Published in: Math. Oper. Res. (2013)

Keyphrases

shortest path problem
stochastic approximation
shortest path
single source
reinforcement learning
combinatorial optimization problems
interval data
multiple objectives
cooperative
state space
multi agent
function approximation
directed graph
Monte Carlo
learning algorithm
directed acyclic graph
continuous state spaces
action selection
learning automata
learning rate
potential field
dynamic programming
neural network
machine learning