Convergence Results For Q-Learning With Experience Replay.
Liran Szlak, Ohad Shamir
Published in: CoRR (2021)
Keyphrases
- stochastic approximation
- reinforcement learning
- function approximation
- stochastic shortest path
- state space
- cooperative
- multi agent
- model free
- neural network
- learning rate
- initial conditions
- convergence proof
- database
- learning environment
- learning algorithm
- optimal policy
- convergence rate
- action selection
- policy iteration
- faster convergence
- real time
- multiagent learning