How Many Crowd Workers Do I Need? On Statistical Power when Crowdsourcing Relevance Judgments.
Kevin Roitero, David La Barbera, Michael Soprano, Gianluca Demartini, Stefano Mizzaro, Tetsuya Sakai
Published in: ACM Trans. Inf. Syst. (2024)
Keyphrases
- relevance judgments
- statistical power
- Amazon Mechanical Turk
- user feedback
- crowdsourced
- sample size
- statistical significance
- relevance feedback
- user interaction
- test collection
- learning to rank
- web search engines
- average precision
- retrieval systems
- interaction effects
- user preferences
- recommender systems
- relevant documents
- user profiles
- retrieval effectiveness
- relevance assessments
- information retrieval
- ranking functions
- evaluation measures
- evaluation metrics
- information retrieval systems
- web search
- active learning
- statistically significant
- learning algorithm
- upper bound
- registration errors
- pairwise
- learning environment
- machine learning