Shunting Trains with Deep Reinforcement Learning.
Evertjan Peer, Vlado Menkovski, Yingqian Zhang, Wan-Jui Lee
Published in: SMC (2018)
Keyphrases
- reinforcement learning
- function approximation
- temporal difference
- control problems
- reinforcement learning algorithms
- robotic control
- state space
- markov decision processes
- learning process
- optimal policy
- model free
- genetic algorithm
- evolutionary learning
- temporal difference learning
- information systems
- information retrieval
- partially observable
- dynamic programming
- markov decision process
- deep learning
- function approximators
- search space
- reinforcement learning methods
- policy search
- relational reinforcement learning
- belief nets
- direct policy search