PNLMS-based Algorithm for Online Approximated Solution of HJB Equation in the Context of Discrete MIMO Optimal Control and Reinforcement Learning.
Marcio Eduardo G. Silva, João Viana da Fonseca Neto, Francisco das Chagas de Souza
Published in: UKSim (2014)
Keyphrases
- optimal control
- dynamic programming
- Hamilton-Jacobi-Bellman
- reinforcement learning
- mathematical model
- control problems
- optimal control problems
- control strategy
- optimal solution
- actor critic
- learning algorithm
- infinite horizon
- Markov decision processes
- control policy
- RL algorithms
- average cost
- real time
- linear quadratic
- policy gradient
- linear programming
- NP-hard