Convergence of value iterations for total-cost MDPs and POMDPs with general state and action sets.

Eugene A. Feinberg, Pavlo O. Kasyanov, Michael Z. Zgurovsky

Published in: ADPRL (2014)

Keyphrases

total cost
action sets
Markov decision processes
state space
reinforcement learning
average cost
partially observable
finite state
stationary policies
optimal solution
belief state
special case
action space
Markov decision process
dynamic programming
minimum total cost
policy iteration
planning under uncertainty
Markov decision problems
machine learning
production cost
state variables
inventory level
state transitions
search space
learning algorithm