Counterexample Explanation by Learning Small Strategies in Markov Decision Processes.
Tomás Brázdil, Krishnendu Chatterjee, Martin Chmelik, Andreas Fellner, Jan Kretínský
Published in: CoRR (2015)
Keyphrases
- markov decision processes
- reinforcement learning
- finite state
- state space
- supervised learning
- dynamic programming
- learning tasks
- partially observable
- average reward
- decision theoretic planning
- optimal policy
- stochastic games
- model based reinforcement learning
- belief revision
- machine learning
- finite horizon
- continuous state spaces
- transition matrices
- real time dynamic programming