Model-free Reinforcement Learning with Stochastic Reward Stabilization for Recommender Systems.
Tianchi Cai, Shenliao Bao, Jiyan Jiang, Shiji Zhou, Wenpeng Zhang, Lihong Gu, Jinjie Gu, Guannan Zhang
Published in: SIGIR (2023)
Keyphrases
- model free reinforcement learning
- recommender systems
- reinforcement learning
- policy gradient
- function approximation
- collaborative filtering
- state space
- reinforcement learning algorithms
- gradient method
- user profiles
- markov decision processes
- optimal control
- model free
- dynamic programming
- matrix factorization
- control problems
- single agent
- learning automata
- artificial neural networks