Reinforcement Learning to Optimize Lifetime Value in Cold-Start Recommendation.
Luo Ji, Qi Qin, Bingqing Han, Hongxia Yang
Published in: CIKM (2021)
Keyphrases
- reinforcement learning
- function approximation
- machine learning
- learning algorithm
- state space
- real world
- direct policy search
- temporal difference learning
- reinforcement learning algorithms
- model free
- transfer learning
- multi agent
- dynamic programming
- learning process
- least squares
- markov chain
- optimal policy
- markov decision processes
- multiscale
- decision making
- temporal difference
- computer vision
- robot control
- artificial intelligence
- policy search
- robotic control
- neural network