On the Reduction of Variance and Overestimation of Deep Q-Learning.
Mohammed Sabry, Amr M. A. Khalifa
Published in: CoRR (2019)
Keyphrases
- cooperative
- reinforcement learning
- multi agent
- function approximation
- state space
- learning algorithm
- model free
- trade off
- data mining
- standard deviation
- multi agent reinforcement learning
- stochastic approximation
- action selection
- prediction error
- correlation coefficient
- covariance matrix
- optimal policy
- dynamic programming