Two-Stage Constrained Actor-Critic for Short Video Recommendation.
Qingpeng Cai
Zhenghai Xue
Chi Zhang
Wanqi Xue
Shuchang Liu
Ruohan Zhan
Xueliang Wang
Tianyou Zuo
Wentao Xie
Dong Zheng
Peng Jiang
Kun Gai
Published in:
CoRR (2023)
Keyphrases
actor critic
reinforcement learning
approximate dynamic programming
video sequences
recommender systems
optimal control
collaborative filtering
policy gradient
neural network
temporal difference
gradient method
machine learning
decision making
sufficient conditions
finite state