Computation of weighted sums of rewards for concurrent MDPs.

Peter Buchholz, Dimitri Scheftelowitsch

Published in: Math. Methods Oper. Res. (2019)

Keyphrases

Markov decision processes
weighted sums
reinforcement learning
state space
optimal policy
reward function
weighted sum
finite horizon
finite state
dynamic programming
Markov decision process
policy iteration
decision problems
model free
average cost
Markov decision problems