Approximate Relative Value Learning for Average-reward Continuous State MDPs.
Hiteshi Sharma
Mehdi Jafarnia-Jahromi
Rahul Jain
Published in:
UAI (2019)
Keyphrases
reinforcement learning
continuous state
average reward
Markov decision processes
state action
learning algorithm
optimal policy
policy search
state space
stochastic games
multi agent
supervised learning
long run
continuous state spaces
semi-Markov decision processes