Performance Guarantees for Homomorphisms beyond Markov Decision Processes.
Sultan Javed Majeed, Marcus Hutter
Published in: AAAI (2019)
Keyphrases
- Markov decision processes
- state space
- finite state
- optimal policy
- reinforcement learning
- dynamic programming
- policy iteration
- reachability analysis
- finite horizon
- reinforcement learning algorithms
- action space
- planning under uncertainty
- transition matrices
- partially observable
- action sets
- average reward
- risk sensitive
- model based reinforcement learning
- average cost
- decision processes
- state and action spaces
- factored MDPs
- decision theoretic planning
- real valued
- learning algorithm
- Markov decision process
- initial state
- reward function
- discounted reward
- dynamical systems