Generator and Critic: A Deep Reinforcement Learning Approach for Slate Re-ranking in E-commerce.
Jianxiong Wei
Anxiang Zeng
Yueqiu Wu
Peng Guo
Qingsong Hua
Qingpeng Cai
Published in:
CoRR (2020)
Keyphrases
reinforcement learning
function approximation
actor critic
reinforcement learning algorithms
temporal difference
ranking algorithm
electronic commerce
web search
model free
policy gradient
product search
multi agent
state space
ranking functions
learning to rank
learning process
evaluation function
optimal control
markov decision processes
supervised learning
reinforcement learning methods
policy search
information retrieval
machine learning
learning problems
user feedback
link analysis
optimal policy
temporal difference learning
dynamic programming
approximate dynamic programming