Batch Active Learning of Reward Functions from Human Preferences.

Erdem Biyik, Nima Anari, Dorsa Sadigh

Published in: CoRR (2024)

Keyphrases

active learning
reward function
batch mode
multiple agents
semi-supervised
transition probabilities
reinforcement learning
state space
inverse reinforcement learning
image segmentation
collaborative filtering
particle filter
optimal policy
multi-attribute
preference elicitation