Publication: Optimistic Policy Iteration for MDPs with Acyclic Transient State Structure.