A reinforcement learning approach to the orienteering problem with time windows.

Ricardo Gama, Hugo L. Fernandes

Published in: Comput. Oper. Res. (2021)

Keyphrases

reinforcement learning
vehicle routing
state space
function approximation
model free
vehicle routing problem
temporal difference learning
optimal policy
routing problem
temporal difference
reinforcement learning algorithms
Markov decision processes
real time
multi agent
robotic control
policy search
learning agent
learning algorithm
machine learning
neural network
data sets