Value Penalized Q-Learning for Recommender Systems.
Chengqian Gao, Ke Xu, Peilin Zhao
Published in: CoRR (2021)
Keyphrases
- recommender systems
- reinforcement learning
- collaborative filtering
- cooperative
- function approximation
- least squares
- learning algorithm
- multi agent
- state space
- maximum likelihood
- reinforcement learning algorithms
- matrix factorization
- loss function
- trust aware
- information filtering
- user preferences
- user profiles
- stochastic approximation
- optimal policy
- action selection
- user model
- implicit feedback
- model free
- user modeling
- information overload
- recommendation systems
- learning rate
- variable selection
- potential field
- multi agent reinforcement learning
- model selection
- data sparsity
- user profiling
- temporal difference learning
- cold start problem
- product recommendation
- reinforcement learning methods
- personalized recommendation
- user modelling
- dynamic environments
- user interests
- dynamic programming