Learning Non-Markovian Reward Models in MDPs.
Gavin RensJean-François RaskinPublished in: CoRR (2020)
Keyphrases
- reinforcement learning
- learning algorithm
- prior knowledge
- markov decision processes
- supervised learning
- learning process
- learning models
- accurate models
- online learning
- state space
- generative model
- optimal policy
- learning tasks
- probabilistic model
- model free
- qualitative models
- partially observable
- structured prediction
- learned models
- learning agent
- search space