Publication: SOAP-RL: Sequential Option Advantage Propagation for Reinforcement Learning in POMDP Environments.