Balanced Off-Policy Evaluation for Personalized Pricing.
Adam N. Elmachtoub, Vishal Gupta, Yunfan Zhao
Published in: AISTATS (2023)
Keyphrases
- policy evaluation
- least squares
- reinforcement learning
- monte carlo
- temporal difference
- model free
- matrix inversion
- markov decision processes
- variance reduction
- semi parametric
- policy iteration
- function approximation
- e learning
- statistical inference
- partially observable markov decision processes
- linear regression
- evaluation function
- state space
- machine learning
- learning algorithm
- markov chain
- reinforcement learning algorithms
- optimal policy
- fixed point
- finite state
- gaussian process
- optimal control
- step size
- neural network