SOAC: Supervised Off-Policy Actor-Critic for Recommender Systems.
Shiqing Wu, Guandong Xu, Xianzhi Wang
Published in: ICDM (2023)
Keyphrases
- transfer learning
- actor critic
- recommender systems
- reinforcement learning
- collaborative filtering
- supervised learning
- learning algorithm
- policy gradient
- approximate dynamic programming
- optimal control
- reinforcement learning algorithms
- temporal difference
- matrix factorization
- machine learning
- function approximation
- gradient method
- policy iteration
- state space
- markov decision processes
- neuro fuzzy
- multi agent
- feature selection
- mathematical model
- single agent
- average reward
- optimal policy
- search space