Case-based off-policy policy evaluation using prototype learning.

Anton Matsson, Fredrik D. Johansson

Published in: CoRR (2021)

Keyphrases

learning process
learning algorithm
reinforcement learning
TD learning
temporal difference
neural network
machine learning
feature selection
active learning
least squares
supervised learning
learning tasks
action selection