Accountable Off-Policy Evaluation With Kernel Bellman Statistics.
Yihao Feng, Tongzheng Ren, Ziyang Tang, Qiang Liu
Published in: CoRR (2020)
Keyphrases
- policy evaluation
- least squares
- Monte Carlo
- reinforcement learning
- temporal difference
- model free
- statistical inference
- policy iteration
- function approximation
- variance reduction
- kernel methods
- semi parametric
- kernel function
- Markov decision processes
- linear program
- support vector
- machine learning
- partially observable Markov decision processes
- kernel matrix
- optimal policy
- finite state
- confidence intervals
- Markov chain