Gamma-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction.
Michael Janner, Igor Mordatch, Sergey Levine
Published in: NeurIPS (2020)
Keyphrases
- infinite horizon
- temporal difference learning
- Markov decision process
- long run
- partially observable
- Markov decision processes
- function approximation
- reinforcement learning
- finite horizon
- optimal control
- regression model
- dynamic programming
- machine learning
- evaluation function
- fixed point
- dynamic environments
- probabilistic model
- artificial neural networks