Undiscounted Control Policy Generation for Continuous-Valued Optimal Control by Approximate Dynamic Programming.
Jonathan Lock, Tomas McKelvey
Published in: CoRR (2021)
Keyphrases
- approximate dynamic programming
- optimal control
- continuous valued
- control policy
- policy iteration
- average cost
- infinite horizon
- dynamic programming
- reinforcement learning
- markov decision problems
- markov decision processes
- long run
- control policies
- average reward
- optimal control problems
- finite horizon
- optimal policy
- markov decision process
- linear program
- actor critic
- state space
- control strategy
- multiscale
- hamilton jacobi bellman
- temporal difference
- model free
- step size
- fixed point