State-Visitation Fairness in Average-Reward MDPs.

Ganesh Ghalme, Vineet Nair, Vishakha Patil, Yilun Zhou

Published in: CoRR (2021)

Keyphrases

average reward
Markov decision processes
discounted reward
state space
optimal policy
state action
long run
semi-Markov decision processes
total reward
reinforcement learning
policy iteration
discount factor
stochastic games
action space
optimality criterion
model free
Markov decision process
hierarchical reinforcement learning
real time dynamic programming
state and action spaces
initial state
game theory
linear programming
dynamic programming