Iterative Bounding MDPs: Learning Interpretable Policies via Non-Interpretable Methods.

Nicholay Topin, Stephanie Milani, Fei Fang, Manuela Veloso

Published in: AAAI (2021)

Keyphrases

empirical studies
learning algorithm
learning process
online learning
optimal policy
learning models
significant improvement
state space
supervised learning
stochastic domains
policy search
iterative learning
partially observable
human experts
learning systems
preprocessing
decision trees