Trajectory-Based Off-Policy Deep Reinforcement Learning.

Andreas Doerr, Michael Volpp, Marc Toussaint, Sebastian Trimpe, Christian Daniel

Published in: ICML (2019)

Keyphrases

reinforcement learning
function approximation
state space
reinforcement learning algorithms
multi-agent
robotic control
Markov decision processes
learning process
model-free
trajectory data
optimal policy
temporal difference learning
dynamic programming
spatio-temporal
action selection
neural network
function approximators
trajectories of moving objects
belief nets
real time