MEReQ: Max-Ent Residual-Q Inverse RL for Sample-Efficient Alignment from Intervention.

Published in: CoRR (2024)

Keyphrases