Q-Learning for MDPs with General Spaces: Convergence and Near Optimality via Quantization under Weak Continuity.
Ali Devran Kara
Naci Saldi
Serdar Yüksel
Published in: CoRR (2021)
Keyphrases
reinforcement learning
stochastic shortest path
special case
state space
Markov decision processes
learning algorithm
optimal policy
stochastic approximation
function approximation
cooperative
Monte Carlo
average cost
image compression
policy iteration
Markov decision problems
state and action spaces