Extending the Peak Bandwidth of Parameters for Softmax Selection in Reinforcement Learning.
Kazunori Iwata
Published in: IEEE Trans. Neural Networks Learn. Syst. (2017)
Keyphrases
- reinforcement learning
- temporal difference learning
- maximum likelihood
- parameter estimation
- function approximation
- learning algorithm
- fine tuning
- parameter values
- Markov decision processes
- state space
- parameter space
- parameter settings
- Bayesian networks
- model-free
- reinforcement learning algorithms
- image sequences
- neural network