Aggregated Multi-deep Deterministic Policy Gradient for Self-driving Policy.

Junta Wu, Huiyun Li

Published in: IOV (2018)

Keyphrases

policy gradient
actor-critic
model-free reinforcement learning
policy gradient methods
function approximation
reinforcement learning
policy search
gradient method
optimal control
reinforcement learning algorithms
approximation methods
average reward
optimal policy
neural network
partially observable Markov decision processes
function approximators
Markov decision processes
model selection