Complexity bounds for approximately solving discounted MDPs by value iterations.
Eugene A. Feinberg, Gaojin He
Published in: Oper. Res. Lett. (2020)
Keyphrases
- complexity bounds
- Markov decision processes
- semi-Markov decision processes
- optimal policy
- Markov decision problems
- infinite horizon
- average reward
- dynamic programming
- finite horizon
- state space
- Markov decision process
- policy iteration
- average cost
- worst case
- factored MDPs
- reinforcement learning
- partially observable
- long run
- database
- heuristic search
- query processing
- lower bound
- optimal solution
- database systems