Safe Reinforcement Learning for Single Train Trajectory Optimization via Shield SARSA.
Zicong Zhao, Jing Xun, Xuguang Wen, Jianqiu Chen
Published in: IEEE Trans. Intell. Transp. Syst. (2023)
Keyphrases
- reinforcement learning
- function approximation
- reinforcement learning algorithms
- state space
- temporal difference
- optimization algorithm
- optimal control
- multi-agent
- global optimization
- function approximators
- optimization problems
- machine learning
- optimization process
- action selection
- optimization method
- temporal difference learning
- model-free
- Markov decision processes
- dynamic programming
- learning process
- learning algorithm