Distributional Off-Policy Evaluation for Slate Recommendations.
Shreyas Chaudhari, David Arbour, Georgios Theocharous, Nikos Vlassis
Published in: AAAI (2024)
Keyphrases
- policy evaluation
- least squares
- reinforcement learning
- Monte Carlo
- temporal difference
- model free
- function approximation
- policy iteration
- Markov decision processes
- variance reduction
- matrix inversion
- semi parametric
- recommender systems
- optimal policy
- multi agent
- action selection
- Markov decision problems
- linear regression
- statistical inference
- neural network
- fixed point
- cost function
- objective function
- learning algorithm