Real-time Policy Distillation in Deep Reinforcement Learning.

Yuxiang Sun, Pooyan Fazli

Published in: CoRR (2019)

Keyphrases

real time
reinforcement learning
optimal policy
action selection
policy search
robotic control
state space
action space
function approximation
learning algorithm

optimal control
Markov decision process
partially observable
function approximators
approximate dynamic programming
policy gradient
data sets
dynamic programming
Markov decision processes
supervised learning

low cost
control policies
partially observable environments
state and action spaces
reinforcement learning problems
inverse reinforcement learning
actor critic
high speed
Markov decision problems

infinite horizon
temporal difference learning
state action
transfer learning
state dependent
finite state
asymptotically optimal
learning classifier systems
reward function