Iterative Bounding MDPs: Learning Interpretable Policies via Non-Interpretable Methods.

Nicholay Topin, Stephanie Milani, Fei Fang, Manuela Veloso

Published in: CoRR (2021)

Keyphrases

reinforcement learning
significant improvement
empirical studies
learning algorithm
learning models
learning process
state space
optimal policy
reinforcement learning methods
machine learning
preprocessing
dynamic programming
supervised learning
iterative methods