Average Reward Optimization with Multiple Discounting Reinforcement Learners.
Chris Reinke
Eiji Uchibe
Kenji Doya
Published in:
ICONIP (1) (2017)
Keyphrases
average reward
reinforcement learning
optimal policy
discounted reward
semi markov decision processes
markov random field
monte carlo
markov decision processes
machine learning
learning algorithm
e-learning
multi agent
learning process
state space
long run
stochastic games