Extending the Peak Bandwidth of Parameters for Softmax Selection in Reinforcement Learning.

Published in: IEEE Trans. Neural Networks Learn. Syst. (2017)

Keyphrases

reinforcement learning
temporal difference learning
maximum likelihood
parameter estimation
function approximation
learning algorithm
fine tuning
parameter values
Markov decision processes
state space
parameter space
parameter settings
Bayesian networks
model free
reinforcement learning algorithms
image sequences
neural network