Wide-Sense Stationary Policy Optimization with Bellman Residual on Video Games.

Chen Gong, Qiang He, Yunpeng Bai, Xinwen Hou, Guoliang Fan, Yu Liu
Published in: ICME (2021)
Keyphrases
  • video games
  • learning experience
  • markov decision processes
  • computer games
  • fixed point
  • finite state
  • markov chain
  • policy iteration
  • machine learning
  • multi agent
  • random walk
  • optimal policy
  • markov decision process