When to stop value iteration: stability and near-optimality versus computation.
Mathieu Granzotto, Romain Postoyan, Dragan Nesic, Lucian Busoniu, Jamal Daafouz
Published in: L4DC (2021)
Keyphrases
- Markov decision processes
- heuristic search
- state space
- numerical stability
- optimal solution
- video sequences
- stability analysis
- average cost
- efficient computation
- optimal policy
- linear programming
- Markov decision chains
- average reward
- machine learning
- dynamic programming
- search space
- computational complexity
- similarity measure
- information systems