How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation.
Chia-Wei Liu, Ryan Lowe, Iulian Vlad Serban, Michael Noseworthy, Laurent Charlin, Joelle Pineau
Published in: CoRR (2016)
Keyphrases
- dialogue system
- evaluation metrics
- dialogue management
- human computer
- precision and recall
- spoken dialogue systems
- natural language
- natural language generation
- mixed initiative
- dialogue manager
- spoken language
- tutorial dialogue
- learning to rank
- evaluation measures
- language understanding
- human users
- user model
- dialogue games
- information retrieval
- semi supervised
- information extraction
- supervised learning
- pairwise
- digital libraries