Markov Decision Processes with Unknown State Feature Values for Safe Exploration using Gaussian Processes.
Matthew Budd, Bruno Lacerda, Paul Duckworth, Andrew West, Barry Lennox, Nick Hawes
Published in: IROS (2020)
Keyphrases
- markov decision processes
- gaussian processes
- state space
- feature values
- model based reinforcement learning
- gaussian process
- optimal policy
- reinforcement learning
- real time dynamic programming
- dynamic programming
- training data
- higher order
- missing values
- multi task
- average reward
- low resolution
- information content
- supervised learning
- prior knowledge