Improving Policy Generalization for Teacher-Student Reinforcement Learning.
Xudong Gong
Hongda Jia
Xing Zhou
Dawei Feng
Bo Ding
Jie Xu
Published in:
KSEM (2) (2020)
Keyphrases
reinforcement learning
optimal policy
policy search
teacher student
action selection
machine learning
state space
learning algorithm
learning process
online learning
decision process
reward function
policy makers
markov decision processes