Accelerating Training in Pommerman with Imitation and Reinforcement Learning.

Hardik Meisheri, Omkar Shelke, Richa Verma, Harshad Khadilkar

Published in: CoRR (2019)

Keyphrases

control policy
reinforcement learning
batch mode
supervised learning
function approximation
state space
reinforcement learning algorithms
Markov decision processes
dynamic programming
training algorithm
training set
temporal difference
machine learning
optimal control
training examples
learning algorithm
transfer learning
test set
training phase
learning process
multi-agent
feedforward neural networks
Markov decision process
biologically inspired
training process
training samples
data sets
temporal difference learning