Log-normality and Skewness of Estimated State/Action Values in Reinforcement Learning.

Liangpeng Zhang, Ke Tang, Xin Yao

Published in: NIPS (2017)

Keyphrases

state action
reinforcement learning
evaluation function
action space
Markov decision process
continuous state
function approximators
function approximation
stochastic games
average reward
state transitions
machine learning
state space
optimal policy
reinforcement learning algorithms
learning process
belief state
model free
policy gradient
temporal difference
kernel matrix
learning algorithm
low rank
transfer learning
random variables
hidden Markov models
multi agent