Shunting Trains with Deep Reinforcement Learning.

Evertjan Peer, Vlado Menkovski, Yingqian Zhang, Wan-Jui Lee

Published in: SMC (2018)

Keyphrases

reinforcement learning
function approximation
temporal difference
control problems
reinforcement learning algorithms
robotic control
state space
Markov decision processes
learning process
optimal policy
model free
genetic algorithm
evolutionary learning
temporal difference learning
information systems
information retrieval
partially observable
dynamic programming
Markov decision process
deep learning
function approximators
search space
reinforcement learning methods
policy search
relational reinforcement learning
belief nets
direct policy search