How To Evaluate Your Dialogue System: Probe Tasks as an Alternative for Token-level Evaluation Metrics.
Prasanna Parthasarathi, Joelle Pineau, Sarath Chandar
Published in: CoRR (2020)
Keyphrases
- evaluation metrics
- dialogue system
- precision and recall
- average precision
- human computer
- dialogue management
- natural language generation
- human users
- spoken dialogue systems
- evaluation framework
- natural language
- learning to rank
- evaluation measures
- human robot
- spoken language
- multimedia
- tutorial dialogue
- language understanding
- user interaction
- digital libraries
- support vector