Blending Reward Functions via Few Expert Demonstrations for Faithful and Accurate Knowledge-Grounded Dialogue Generation.
Wanyu Du
Yangfeng Ji
Published in:
CoRR (2023)
Keyphrases
inverse reinforcement learning
knowledge acquisition
domain experts
human experts
expert knowledge
knowledge base
domain knowledge
reward function
transition probabilities
clustering algorithm
maximum entropy