iLSTD: Eligibility Traces and Convergence Analysis.
Alborz Geramifard, Michael H. Bowling, Martin Zinkevich, Richard S. Sutton
Published in: NIPS (2006)
Keyphrases
- convergence analysis
- policy evaluation
- least squares
- temporal difference
- Monte Carlo
- reinforcement learning
- model free
- reinforcement learning algorithms
- Markov decision processes
- policy iteration
- function approximation
- variance reduction
- global convergence
- semi parametric
- convergence rate
- evaluation function
- statistical inference
- approximation methods
- Gaussian process
- Markov chain
- optimal policy
- convergence speed
- optical flow
- Markov decision problems
- genetic algorithm