On the convergence of optimistic policy iteration for stochastic shortest path problem.
Yuanlong Chen
Published in: CoRR (2018)
Keyphrases
- shortest path problem
- stochastic approximation
- policy iteration
- markov decision processes
- shortest path
- sample path
- reinforcement learning
- fixed point
- model free
- optimal policy
- interval data
- least squares
- combinatorial optimization problems
- convergence rate
- temporal difference
- markov decision process
- directed graph
- monte carlo
- average reward
- policy evaluation
- infinite horizon
- directed acyclic graph
- multiple objectives
- bi objective
- finite state
- state space
- metaheuristic
- simulated annealing
- optimal control
- knapsack problem
- linear programming
- ant colony optimization
- machine learning