Action Guidance: Getting the Best of Sparse Rewards and Shaped Rewards for Real-time Strategy Games.

Shengyi Huang, Santiago Ontañón

Published in: CoRR (2020)

Keyphrases

real time strategy games
reinforcement learning
multi-armed bandit
Markov decision processes
bandit problems
reward shaping
expected reward
discounted reward
domain knowledge
multi agent
special case
optimal policy
case based planning