Keyphrases
- policy evaluation
- partially observable markov decision processes
- reinforcement learning
- markov decision processes
- least squares
- finite state
- model free
- belief state
- policy iteration
- optimal policy
- temporal difference
- dynamic programming
- monte carlo
- dynamical systems
- function approximation
- policy gradient
- state space
- partially observable
- decision problems
- multi agent
- variance reduction
- markov decision problems
- semi parametric
- infinite horizon
- average reward
- machine learning
- markov chain
- planning problems
- reinforcement learning algorithms
- probabilistic model
- np hard
- objective function
- decision making