Modelling Agent Policies with Interpretable Imitation Learning.

Tom Bewley, Jonathan Lawry, Arthur Richards

Published in: CoRR (2020)

Keyphrases

imitation learning
reinforcement learning
reward function
multi-agent
maximum margin
multi-agent systems
robotic systems
optimal policy
humanoid robot
multiple agents
Markov decision process
machine learning
dynamic environments
human teacher
action selection
model construction
agent model
state space
Markov decision processes
computer vision