Evaluating Evaluation Measures with Worst-Case Confidence Interval Widths.
Tetsuya Sakai
Published in: EVIA@NTCIR (2017)
Keyphrases
- evaluation measures
- confidence intervals
- worst case
- ROC curve
- sample size
- learning to rank
- precision and recall
- Markov chain
- test set
- evaluation metrics
- retrieval systems
- ranked list
- upper bound
- Monte Carlo
- lower bound
- NP-hard
- data mining
- information retrieval systems
- data sets
- conditional probabilities
- active learning