Policy Gradient Based Reinforcement Learning Approach for Autonomous Highway Driving.
Szilárd Aradi, Tamás Bécsi, Peter Gaspar
Published in: CCTA (2018)
Keyphrases
- reinforcement learning
- optimal policy
- policy search
- autonomous learning
- model free
- action selection
- Markov decision process
- policy iteration
- partially observable environments
- Markov decision problems
- policy evaluation
- traffic accidents
- Markov decision processes
- actor critic
- reinforcement learning problems
- control policy
- partially observable
- action space
- state and action spaces
- traffic safety
- state space
- reward function
- reinforcement learning algorithms
- function approximation
- function approximators
- state action
- control policies
- agent receives
- machine learning
- least squares
- policy gradient
- long run
- learning process
- learning algorithm
- dynamic programming
- transition model
- decision problems
- policy gradient methods
- approximate dynamic programming
- learning capabilities
- RL algorithms
- average reward
- infinite horizon
- temporal difference
- neural network
- unmanned aerial vehicles
- control problems
- partially observable Markov decision processes
- autonomous mental development
- agent learns
- continuous state spaces
- finite state
- reinforcement learning methods