Accelerating Training in Pommerman with Imitation and Reinforcement Learning.
Hardik Meisheri, Omkar Shelke, Richa Verma, Harshad Khadilkar
Published in: CoRR (2019)
Keyphrases
- control policy
- reinforcement learning
- batch mode
- supervised learning
- function approximation
- state space
- reinforcement learning algorithms
- Markov decision processes
- dynamic programming
- training algorithm
- training set
- temporal difference
- machine learning
- optimal control
- training examples
- learning algorithm
- transfer learning
- test set
- training phase
- learning process
- multi agent
- feedforward neural networks
- Markov decision process
- biologically inspired
- training process
- training samples
- data sets
- temporal difference learning