Action Advising with Advice Imitation in Deep Reinforcement Learning.

Ercument Ilhan, Jeremy Gow, Diego Perez Liebana

Published in: AAMAS (2021)

Keyphrases

reinforcement learning
action selection
action space
reward shaping
state action
partially observable domains
state space
function approximation
multi-agent
Markov decision processes
continuous state
temporal difference
reinforcement learning algorithms
transition model
learning algorithm
optimal policy
agent learns
fitted Q iteration
deep learning
model-free
dynamic programming
machine learning
function approximators
temporal difference learning
learning classifier systems
imitation learning
human actions
genetic algorithm
neural network