Generating Self-Contained and Summary-Centric Question Answer Pairs via Differentiable Reward Imitation Learning.
Li Zhou
Kevin Small
Yong Zhang
Sandeep Atluri
Published in:
EMNLP (1) (2021)
Keyphrases
imitation learning
reinforcement learning
question answer pairs
robotic systems
humanoid robot
maximum margin
question answering
computer vision
multi modal
training data
xml documents
function approximation
reinforcement learning algorithms
control problems
average reward