Two-Stage Constrained Actor-Critic for Short Video Recommendation.
Qingpeng Cai
Zhenghai Xue
Chi Zhang
Wanqi Xue
Shuchang Liu
Ruohan Zhan
Xueliang Wang
Tianyou Zuo
Wentao Xie
Dong Zheng
Peng Jiang
Kun Gai
Published in:
WWW (2023)
Keyphrases
actor critic
video sequences
temporal difference
collaborative filtering
reinforcement learning
policy gradient
approximate dynamic programming
optimal control
gradient method
learning algorithm
recommender systems