Undiscounted Markov decision chains with partial information; an algorithm for computing a locally optimal periodic policy.
Arie Hordijk
J. A. Loeve
Published in:
Math. Methods Oper. Res. (1994)
Keyphrases
locally optimal
dynamic programming
globally optimal
learning algorithm
optimal solution
objective function
search space
partial information
machine learning
NP-hard
linear programming
model free