Stabilizing deep Q-learning with Q-graph-based bounds.
Sabrina Hoppe, Markus Giftthaler, Robert Krug, Marc Toussaint
Published in: Int. J. Robotics Res. (2023)
Keyphrases
- upper bound
- reinforcement learning
- cooperative
- function approximation
- upper and lower bounds
- multi-agent
- lower bound
- state space
- learning algorithm
- graph model
- worst case
- learning rate
- action selection
- optimal policy
- bucket brigade
- tight bounds
- lower and upper bounds
- dynamic programming
- data sets
- average case
- reinforcement learning algorithms
- temporal difference learning
- semi-supervised
- stochastic shortest path
- long run
- VC dimension
- model-free
- temporal difference
- stochastic approximation
- confidence bounds
- error bounds