Undiscounted Markov decision chains with partial information; an algorithm for computing a locally optimal periodic policy.

Arie Hordijk, J. A. Loeve
Published in: Math. Methods Oper. Res. (1994)
Keyphrases
  • locally optimal
  • dynamic programming
  • globally optimal
  • learning algorithm
  • optimal solution
  • objective function
  • search space
  • partial information
  • machine learning
  • NP-hard
  • linear programming
  • model free