Interval Dominance based Structural Results for Markov Decision Process.
Vikram Krishnamurthy
Published in: CoRR (2022)
Keyphrases
- Markov decision process
- state space
- Markov decision processes
- optimal policy
- reinforcement learning
- finite horizon
- infinite horizon
- temporal difference learning
- initial state
- partial observability
- policy iteration
- transition matrices
- partially observable
- dominance relation
- reward function
- decision problems
- linear programming