Teachable Reinforcement Learning via Advice Distillation.

Olivia Watkins, Trevor Darrell, Pieter Abbeel, Jacob Andreas, Abhishek Gupta

Published in: CoRR (2022)

Keyphrases

reinforcement learning
function approximation
temporal difference
reinforcement learning algorithms
state space
learning algorithm
action selection
robotic control
learning process
optimal policy
transfer learning
reinforcement learning methods
control problems
model free
Markov decision processes
dynamic programming
database
supervised learning
search algorithm
temporal difference learning
multiscale
relational reinforcement learning
data sets