Expert-augmented actor-critic for ViZDoom and Montezuma's Revenge.
Michal Garmulewicz, Henryk Michalewski, Piotr Milos
Published in: CoRR (2018)
Keyphrases
- actor critic
- reinforcement learning
- approximate dynamic programming
- optimal control
- temporal difference
- policy gradient
- reinforcement learning algorithms
- gradient method
- neuro fuzzy
- function approximation
- policy iteration
- markov decision processes
- belief revision
- neural network
- model free
- least squares
- learning algorithm