Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach.

Haoming Jiang, Bo Dai, Mengjiao Yang, Wei Wei, Tuo Zhao

Published in: CoRR (2021)

Keyphrases

automatic evaluation
policy evaluation
model-free
dialog systems
natural language generation
reinforcement learning
natural language
temporal difference
policy iteration
reinforcement learning algorithms
function approximation
human judgments
learning algorithm
multi-agent
information extraction
least squares
knowledge base
e-learning