Publication: Ensemble Usage for More Reliable Policy Identification in Reinforcement Learning.