On-Line Policy Iteration for Infinite Horizon Dynamic Programming.
Dimitri P. Bertsekas
Published in: CoRR (2021)
Keyphrases
- infinite horizon
- policy iteration
- dynamic programming
- markov decision processes
- optimal policy
- optimal control
- finite horizon
- markov decision problems
- sample path
- state space
- production planning
- markov decision process
- partially observable
- long run
- average reward
- single item
- dec pomdps
- policy evaluation
- average cost
- multistage
- finite state
- lead time
- linear programming
- reinforcement learning
- decision problems
- stereo matching
- function approximation
- linear program
- lost sales
- lower bound