Offline Meta-Reinforcement Learning with Advantage Weighting.

Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn

Published in: CoRR (2020)

Keyphrases

reinforcement learning
function approximation
real time
machine learning
multi-agent
optimal policy
robotic control
temporal difference learning
reinforcement learning algorithms
meta level
Markov decision processes
mobile robot
dynamic programming
optimal control
tf-idf
similarity measure
model-free
action selection
artificial intelligence
robot control
learning agents
learning algorithm
reinforcement learning methods
databases