Bayesian Counterfactual Mean Embeddings and Off-Policy Evaluation.
Diego Martinez-Taboada, Dino Sejdinovic
Published in: CoRR (2022)
Keyphrases
- policy evaluation
- least squares
- temporal difference
- reinforcement learning
- Monte Carlo
- model-free
- statistical inference
- policy iteration
- variance reduction
- Markov decision processes
- Bayesian networks
- semiparametric
- function approximation
- posterior distribution
- evaluation function
- maximum likelihood
- dimensionality reduction
- optimal policy
- low-dimensional
- upper bound
- decision theory
- belief revision
- optical flow
- state space