Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning.
Tetsuro Morimura
Eiji Uchibe
Junichiro Yoshimoto
Jan Peters
Kenji Doya
Published in:
Neural Comput. (2010)
Keyphrases
stationary distribution
Markov chain
random walk
product form
higher order
initial state
queue length
queueing networks
sufficient conditions
transition probabilities
service times
steady state
machine learning
search engine
state space