Deep predictive policy training using reinforcement learning.
Ali Ghadirzadeh, Atsuto Maki, Danica Kragic, Mårten Björkman
Published in: IROS (2017)
Keyphrases
- reinforcement learning
- optimal policy
- Markov decision process
- policy search
- reinforcement learning problems
- supervised learning
- reinforcement learning algorithms
- function approximation
- function approximators
- training set
- reward function
- action selection
- Markov decision problems
- actor-critic
- online learning
- partially observable environments
- state space
- learning process
- state and action spaces
- training process
- model-free
- multi-agent
- policy gradient
- average reward
- transition model
- continuous state spaces
- machine learning
- partially observable domains
- state action
- action space
- partially observable Markov decision processes
- control problems
- training algorithm
- learning problems
- test set
- training examples
- least squares
- learning algorithm