Keyphrases
- maximum entropy
- actor-critic
- reinforcement learning
- optimal control
- maximum entropy principle
- policy gradient
- temporal difference
- approximate dynamic programming
- Markov models
- gradient method
- neuro-fuzzy
- conditional random fields
- reinforcement learning algorithms
- random fields
- function approximation
- machine learning
- policy iteration
- dynamic programming
- linear program
- average reward
- state space
- latent variables
- prior knowledge