Convergence Results For Q-Learning With Experience Replay

Liran Szlak, Ohad Shamir

Published in: CoRR (2021)

Keyphrases

stochastic approximation
reinforcement learning
function approximation
stochastic shortest path
state space
cooperative
multi agent
model free
neural network
learning rate
initial conditions
convergence proof
database
learning environment
learning algorithm
optimal policy
convergence rate
action selection
policy iteration
faster convergence
real time
multiagent learning