Blending Reward Functions via Few Expert Demonstrations for Faithful and Accurate Knowledge-Grounded Dialogue Generation.

Wanyu Du, Yangfeng Ji
Published in: CoRR (2023)
Keyphrases
  • inverse reinforcement learning
  • knowledge acquisition
  • domain experts
  • human experts
  • expert knowledge
  • knowledge base
  • domain knowledge
  • reward function
  • transition probabilities
  • clustering algorithm
  • maximum entropy