Log-normality and Skewness of Estimated State/Action Values in Reinforcement Learning.
Liangpeng Zhang, Ke Tang, Xin Yao
Published in: NIPS (2017)
Keyphrases
- state action
- reinforcement learning
- evaluation function
- action space
- Markov decision process
- continuous state
- function approximators
- function approximation
- stochastic games
- average reward
- state transitions
- machine learning
- state space
- optimal policy
- reinforcement learning algorithms
- learning process
- belief state
- model free
- policy gradient
- temporal difference
- kernel matrix
- learning algorithm
- low rank
- transfer learning
- random variables
- hidden Markov models
- multi agent