Pessimistic Reward Models for Off-Policy Learning in Recommendation.
Olivier Jeunen
Bart Goethals
Published in:
RecSys (2021)
Keyphrases
reinforcement learning
learning process
knowledge acquisition
learning models
hidden variables
learning capabilities
learning algorithm
prior knowledge
accurate models
semi supervised
learning tasks
neural nets
learned models
state action