On Value Iteration Convergence in Connected MDPs.
Arsenii Mustafin, Alex Olshevsky, Ioannis Ch. Paschalidis
Published in: CoRR (2024)
Keyphrases
- stochastic shortest path
- Markov decision processes
- state space
- policy iteration
- optimal policy
- Markov decision process
- Markov decision problems
- reinforcement learning
- finite state
- average reward
- reinforcement learning algorithms
- factored MDPs
- dynamic programming
- decision processes
- finite horizon
- stationary policies
- learning algorithm
- infinite horizon
- convergence speed
- convergence rate
- decision theoretic planning
- semi-Markov decision processes
- algebraic decision diagrams
- average cost
- partially observable
- long run
- connected components
- decision problems
- heuristic search
- state and action spaces
- neural network