Gamma-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction.

Michael Janner, Igor Mordatch, Sergey Levine

Published in: NeurIPS (2020)

Keyphrases

infinite horizon
temporal difference learning
Markov decision process
long run
partially observable
Markov decision processes
function approximation
reinforcement learning
finite horizon
optimal control
regression model
dynamic programming
machine learning
evaluation function
fixed point
dynamic environments
probabilistic model
artificial neural networks