Deep Q-learning From Demonstrations.
Todd Hester, Matej Vecerík, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Ian Osband, Gabriel Dulac-Arnold, John P. Agapiou, Joel Z. Leibo, Audrunas Gruslys
Published in: AAAI (2018)
Keyphrases
- reinforcement learning
- cooperative
- function approximation
- multi agent
- state space
- stochastic approximation
- learning algorithm
- optimal policy
- reinforcement learning algorithms
- model free
- multiagent learning
- decision trees
- learning rate
- neural network
- Markov chain
- knowledge base
- temporal difference learning
- continuous state spaces
- bucket brigade
- credit assignment