Markov Decision Processes with Sample Path Constraints: The Communicating Case.
Keith W. Ross, Ravi Varadarajan
Published in: Oper. Res. (1989)
Keyphrases
- markov decision processes
- sample path
- policy iteration
- average reward
- reinforcement learning
- state space
- finite state
- optimal policy
- dynamic programming
- policy evaluation
- asymptotic analysis
- fixed point
- infinite horizon
- model free
- least squares
- finite horizon
- average cost
- partially observable
- markov decision process
- constrained optimization
- optimal control
- search algorithm
- markov decision problems
- machine learning