Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN.
Dror Freirich, Tzahi Shimkin, Ron Meir, Aviv Tamar
Published in: ICML (2019)
Keyphrases
- policy evaluation
- least squares
- temporal difference
- reinforcement learning
- Monte Carlo
- model-free
- Markov decision processes
- variance reduction
- policy iteration
- semi-parametric
- regression model
- function approximation
- action selection
- linear program
- linear regression
- optical flow
- optimal policy
- confidence intervals
- neural network