The Actor-Critic Algorithm for Infinite Horizon Discounted Cost Revisited.
Abhijit Gosavi
Published in: WSC (2020)
Keyphrases
- infinite horizon
- dynamic programming
- learning algorithm
- optimal control
- cost function
- Markov decision processes
- long run
- actor-critic
- average reward
- optimal policy
- linear programming
- NP-hard
- mathematical model
- average cost
- total cost
- objective function
- policy iteration
- policy gradient
- machine learning
- probabilistic model
- search space
- computational complexity
- optimal solution
- reinforcement learning