Individual versus Difference Rewards on Reinforcement Learning for Route Choice.
Ricardo Grunitzki, Gabriel de Oliveira Ramos, Ana Lúcia C. Bazzan
Published in: BRACIS (2014)
Keyphrases
- reinforcement learning
- Markov decision processes
- function approximation
- model free
- state space
- optimal control
- reinforcement learning algorithms
- learning algorithm
- reinforcement learning methods
- higher level
- supervised learning
- hidden Markov models
- reward function
- neural network
- multi agent
- action selection
- policy iteration
- hidden state
- temporal difference
- learning problems
- learning tasks
- optimal policy
- shortest path
- learning process