Distributional Multivariate Policy Evaluation and Exploration with the Bellman GAN.
Dror Freirich, Ron Meir, Aviv Tamar
Published in: CoRR (2018)
Keyphrases
- policy evaluation
- least squares
- temporal difference
- Monte Carlo
- reinforcement learning
- model-free
- policy iteration
- Markov decision processes
- variance reduction
- function approximation
- action selection
- semi-parametric
- regression model
- optimal policy
- statistical inference
- neural network
- linear program
- linear regression
- Gaussian process
- importance sampling