Reinforcement Learning in POMDPs with Function Approximation.
Hajime Kimura, Kazuteru Miyazaki, Shigenobu Kobayashi
Published in: ICML (1997)
Keyphrases
- function approximation
- reinforcement learning
- partially observable markov decision processes
- temporal difference
- continuous state
- partially observable
- temporal difference learning
- reinforcement learning algorithms
- model free
- policy gradient
- state space
- control problems
- optimal policy
- markov decision processes
- mountain car
- machine learning
- multi agent
- learning algorithm
- temporal difference learning algorithms
- function approximators
- markov decision problems
- supervised learning
- transfer learning
- policy search
- dynamic programming
- markov decision process
- action space
- belief state
- markov chain
- temporal difference methods
- learning process