Trajectory-Based Off-Policy Deep Reinforcement Learning.
Andreas Doerr, Michael Volpp, Marc Toussaint, Sebastian Trimpe, Christian Daniel
Published in: ICML (2019)
Keyphrases
- reinforcement learning
- function approximation
- state space
- reinforcement learning algorithms
- multi agent
- robotic control
- Markov decision processes
- learning process
- model free
- trajectory data
- optimal policy
- temporal difference learning
- dynamic programming
- spatio temporal
- action selection
- neural network
- function approximators
- trajectories of moving objects
- belief nets
- real time