Keyphrases
- actor-critic
- policy gradient
- reinforcement learning
- temporal difference
- approximate dynamic programming
- optimal control
- action selection
- neuro-fuzzy
- gradient method
- function approximation
- machine learning
- situation calculus
- reward function
- search space
- state-action
- policy iteration
- initial state
- reinforcement learning algorithms
- dynamic programming
- simulated annealing