Keyphrases
- reinforcement learning
- actor-critic
- exploration strategy
- learning algorithm
- dynamic programming
- optimization algorithm
- mathematical model
- function approximation
- model-free
- unknown environments
- approximate dynamic programming
- optimal solution
- state space
- simulated annealing
- convergence rate
- locally optimal
- machine learning
- computational complexity
- gradient method
- policy gradient
- objective function