Off-policy evaluation for slate recommendation.
Adith Swaminathan, Akshay Krishnamurthy, Alekh Agarwal, Miroslav Dudík, John Langford, Damien Jose, Imed Zitouni
Published in: CoRR (2016)
Keyphrases
- policy evaluation
- least squares
- temporal difference
- monte carlo
- reinforcement learning
- markov decision processes
- model free
- matrix inversion
- policy iteration
- variance reduction
- collaborative filtering
- recommender systems
- semi parametric
- function approximation
- statistical inference
- evaluation function
- optimal policy
- partially observable markov decision processes
- linear regression
- dynamic programming
- cost function