How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation.
Chia-Wei Liu, Ryan Lowe, Iulian Serban, Michael Noseworthy, Laurent Charlin, Joelle Pineau
Published in: EMNLP (2016)
Keyphrases
- query expansion
- dialogue system
- evaluation metrics
- dialogue management
- human computer
- precision and recall
- dialogue manager
- natural language
- spoken language
- mixed initiative
- spoken dialogue systems
- natural language generation
- tutorial dialogue
- evaluation measures
- human users
- learning to rank
- user model
- semi-supervised
- supervised learning
- language understanding
- active learning
- machine learning