Parametrized Actor-Critic Algorithms for Finite-Horizon MDPs.
Mohammed Shahid Abdulla, Shalabh Bhatnagar
Published in: ACC (2007)
Keyphrases
- finite horizon
- markov decision processes
- policy iteration
- infinite horizon
- optimal policy
- learning algorithm
- computational complexity
- finite state
- reinforcement learning
- dynamic programming
- state space
- actor critic
- markov decision process
- partially observable markov decision processes
- average cost
- long run
- optimization methods
- decision making