Enhancing upper confidence bounds for trees with temporal difference values.
Tom Vodopivec, Branko Ster
Published in: CIG (2014)
Keyphrases
- temporal difference
- confidence bounds
- td learning
- function approximation
- evaluation function
- reinforcement learning
- monte carlo
- step size
- temporal difference learning
- model free
- reinforcement learning algorithms
- action selection
- decision trees
- neural network
- evolutionary algorithm
- policy evaluation
- data sets
- temporal difference methods
- markov decision processes
- semi supervised
- function approximators
- multiscale