Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning.
Anton Bakhtin, David J. Wu, Adam Lerer, Jonathan Gray, Athul Paul Jacob, Gabriele Farina, Alexander H. Miller, Noam Brown
Published in: ICLR (2023)
Keyphrases
- interactive narrative
- reinforcement learning
- function approximation
- state space
- multi agent
- educational games
- action selection
- government agencies
- learning process
- human subjects
- domain independent
- serious games
- optimal policy
- virtual world
- partially observable
- action space
- temporal difference learning
- partial observability
- behavioural cloning