Computation of weighted sums of rewards for concurrent MDPs.
Peter Buchholz
Dimitri Scheftelowitsch
Published in:
Math. Methods Oper. Res. (2019)
Keyphrases
Markov decision processes
weighted sums
reinforcement learning
state space
optimal policy
reward function
weighted sum
finite horizon
finite state
dynamic programming
Markov decision process
policy iteration
decision problems
model free
average cost
Markov decision problems