Warm-Start Actor-Critic: From Approximation Error to Sub-optimality Gap.
Hang Wang, Sen Lin, Junshan Zhang
Published in: CoRR (2023)
Keyphrases
- approximation error
- actor critic
- average reward
- reinforcement learning
- policy gradient
- optimal control
- approximate dynamic programming
- gradient method
- neuro fuzzy
- temporal difference
- reinforcement learning algorithms
- function approximation
- Markov decision processes
- policy iteration
- long run
- average cost
- reconstruction error
- optimal policy
- optimal solution
- model free
- estimation error
- machine learning