Tuning the Discount Factor in Order to Reach Average Optimality on Deterministic MDPs.

Filipo Studzinski Perotto, Laurent Vercouter
Published in: SGAI Conf. (2018)
Keyphrases
  • markov decision processes
  • average cost
  • optimal policy
  • average reward
  • state space
  • learning algorithm
  • search space
  • least squares
  • finite horizon
  • discount factor