Hyperbolically Discounted Temporal Difference Learning.
William H. Alexander, Joshua W. Brown
Published in: Neural Comput. (2010)
Keyphrases
- temporal difference learning
- Markov decision process
- Markov decision processes
- function approximation
- infinite horizon
- reinforcement learning
- policy iteration
- fixed point
- optimal policy
- reinforcement learning algorithms
- game playing
- approximate value iteration
- temporal difference
- evaluation function
- dynamic programming
- state space
- average cost
- initial state
- long run
- optimal control
- Monte Carlo
- model free
- real valued
- radial basis function