Assessment of Reward Functions in Reinforcement Learning for Multi-Modal Urban Traffic Control under Real-World Limitations.
Alvaro Cabrejas Egea, Colm Connaughton
Published in: CoRR (2020)
Keyphrases
- multi modal
- reward function
- reinforcement learning
- reinforcement learning algorithms
- policy search
- optimal policy
- markov decision processes
- state space
- markov decision process
- multi modality
- function approximation
- transition probabilities
- inverse reinforcement learning
- learning algorithm
- transition model
- temporal difference
- cross modal
- uni modal
- model free
- multiple agents
- state variables
- dynamic programming
- high dimensional
- state action