γ-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction.

Michael Janner, Igor Mordatch, Sergey Levine

Published in: CoRR (2020)

Keyphrases

infinite horizon
temporal difference learning
Markov decision process
optimal policy
fixed point
long run
policy iteration
multi agent systems
evaluation function
finite horizon
learning algorithm
model selection
optimal control
stochastic processes