Scalable Deep Reinforcement Learning for Ride-Hailing.
Jiekun Feng, Mark O. Gluzman, Jim G. Dai
Published in: ACC (2021)
Keyphrases
- reinforcement learning
- function approximation
- multi-agent
- optimal policy
- state space
- model free
- learning algorithm
- databases
- reinforcement learning algorithms
- temporal difference
- learning process
- Markov decision processes
- Markov decision process
- direct policy search
- deep learning
- temporal difference learning
- scale poorly
- database
- highly scalable
- optimal control
- artificial intelligence
- machine learning
- data sets