Tableaux for Policy Synthesis for MDPs with PCTL* Constraints.

Peter Baumgartner, Sylvie Thiébaux, Felipe W. Trevizan

Published in: CoRR (2017)

Keyphrases

optimal policy
Markov decision processes
Markov decision process
Markov decision problems
policy search
finite horizon
reinforcement learning
average reward
policy iteration
average cost
state and action spaces
partially observable
reward function
modal logic
state space
reinforcement learning problems
initial state
long run
planning under uncertainty
linear program
constraint satisfaction