Hierarchical Average Reward Policy Gradient Algorithms.
Akshay Dharmavaram
Matthew Riemer
Shalabh Bhatnagar
Published in:
CoRR (2019)
Keyphrases
average reward
policy gradient
gradient ascent
Markov decision processes
policy gradient reinforcement learning
model free
policy iteration
reinforcement learning
optimal policy
actor critic
computational complexity
machine learning algorithms
optimization methods
long run
approximation methods
stochastic games