Towards Characterizing Divergence in Deep Q-Learning.

Joshua Achiam, Ethan Knight, Pieter Abbeel

Published in: CoRR (2019)

Keyphrases

reinforcement learning
function approximation
state space
learning algorithm
cooperative
action selection
multi agent
learning rate
stochastic approximation
reinforcement learning algorithms
model free
data sets
bucket brigade
database
optimal policy
sufficient conditions
multi agent reinforcement learning
fixed point
Markov decision processes
information theoretic
dynamic programming
deep learning
KL divergence
relative entropy
neural network
potential field
hierarchical reinforcement learning