Finite-Horizon Markov Decision Processes with Sequentially-Observed Transitions.
Mahmoud El Chamie, Behçet Açikmese
Published in: CoRR (2015)
Keyphrases
- finite horizon
- markov decision processes
- optimal policy
- optimal stopping
- infinite horizon
- state space
- finite state
- reinforcement learning
- dynamic programming
- average cost
- policy iteration
- markov decision process
- average reward
- partially observable
- control policies
- decision theoretic planning
- single item
- expected reward
- transition matrices
- action space
- decision problems
- machine learning
- model checking
- search algorithm