Multi-Reward Architecture based Reinforcement Learning for Highway Driving Policies.
Wei Yuan, Ming Yang, Yuesheng He, Chunxiang Wang, Bing Wang
Published in: ITSC (2019)
Keyphrases
- reinforcement learning
- optimal policy
- reward function
- total reward
- control policies
- reinforcement learning algorithms
- markov decision processes
- function approximation
- learning capabilities
- state space
- policy search
- control policy
- markov decision process
- partially observable markov decision processes
- eligibility traces
- management system
- learning algorithm
- hierarchical reinforcement learning
- average reward
- temporal difference
- markov decision problems
- model free
- real time
- reinforcement learning agents
- decision problems
- multi agent
- machine learning
- partially observable
- traffic safety
- policy gradient
- continuous state
- action selection
- infinite horizon
- long run
- reinforcement learning methods
- policy evaluation
- traffic accidents
- inverse reinforcement learning
- optimal control
- software architecture
- reward shaping
- partially observable environments
- fitted q iteration