Nano: Nested Human-in-the-Loop Reward Learning for Few-shot Language Model Control.
Xiang Fan
Yiwei Lyu
Paul Pu Liang
Ruslan Salakhutdinov
Louis-Philippe Morency
Published in:
CoRR (2022)
Keyphrases
language model
language modeling
reinforcement learning
speech recognition
prior knowledge
document retrieval
statistical language models
n-gram
information retrieval
image retrieval
active learning
probabilistic model
supervised learning
text classification
mixture model
retrieval model