Tuning the Discount Factor in Order to Reach Average Optimality on Deterministic MDPs.
Filipo Studzinski Perotto
Laurent Vercouter
Published in:
SGAI Conf. (2018)
Keyphrases
markov decision processes
average cost
optimal policy
average reward
state space
learning algorithm
search space
least squares
finite horizon
discount factor