A reinforcement learning approach to the orienteering problem with time windows.
Ricardo Gama, Hugo L. Fernandes
Published in: Comput. Oper. Res. (2021)
Keyphrases
- reinforcement learning
- vehicle routing
- state space
- function approximation
- model free
- vehicle routing problem
- temporal difference learning
- optimal policy
- routing problem
- temporal difference
- reinforcement learning algorithms
- Markov decision processes
- real time
- multi agent
- robotic control
- policy search
- learning agent
- learning algorithm
- machine learning
- neural network
- data sets