Balanced Off-Policy Evaluation for Personalized Pricing.
Adam N. Elmachtoub, Vishal Gupta, Yunfan Zhao
Published in: CoRR (2023)
Keyphrases
- policy evaluation
- least squares
- temporal difference
- reinforcement learning
- monte carlo
- model free
- matrix inversion
- markov decision processes
- policy iteration
- function approximation
- variance reduction
- semi parametric
- e learning
- optimal policy
- partially observable markov decision processes
- sufficient conditions
- step size
- importance sampling
- neural network
- gaussian process
- finite state
- text classification
- linear programming
- artificial neural networks
- decision making