Model-free Q-learning over finite horizon for uncertain linear continuous-time systems.
Hao Xu, Sarangapani Jagannathan
Published in: ADPRL (2014)
Keyphrases
- model free
- reinforcement learning
- finite horizon
- function approximation
- reinforcement learning algorithms
- optimal policy
- infinite horizon
- state space
- markov decision processes
- action selection
- policy iteration
- policy evaluation
- control policies
- average reward
- markov decision process
- average cost
- temporal difference
- control strategies
- finite state
- multistage
- markov chain