Case-based off-policy evaluation using prototype learning.

Anton Matsson, Fredrik D. Johansson

Published in: UAI (2022)

Keyphrases

learning algorithm
least squares
learning process
active learning
temporal difference
TD learning
decision making
training data
Monte Carlo
domain independent
learning tasks
action selection