Wide-Sense Stationary Policy Optimization with Bellman Residual on Video Games.
Chen Gong
Qiang He
Yunpeng Bai
Xinwen Hou
Guoliang Fan
Yu Liu
Published in:
ICME (2021)
Keyphrases
video games
learning experience
markov decision processes
computer games
fixed point
finite state
markov chain
policy iteration
machine learning
multi agent
random walk
optimal policy
markov decision process