Model-based policy gradients with parameter-based exploration by least-squares conditional density estimation.
Voot Tangkaratt, Syogo Mori, Tingting Zhao, Jun Morimoto, Masashi Sugiyama
Published in: Neural Networks (2014)
Keyphrases
- least squares
- density ratio estimation
- density ratio
- conditional density estimation
- policy evaluation
- linear model
- action selection
- optimal policy
- policy iteration
- optical flow
- model free
- linear regression
- decision trees
- LS-SVM
- computer vision
- Monte Carlo
- regularization parameter
- Gaussian mixture model
- model selection
- dynamic programming
- image sequences