CrowdSpeech and Vox DIY: Benchmark Dataset for Crowdsourced Audio Transcription.
Nikita Pavlichenko, Ivan Stelmakh, Dmitry Ustalov
Published in: NeurIPS Datasets and Benchmarks (2021)
Keyphrases
- benchmark datasets
- automatic transcription
- multimedia
- low quality
- video annotation
- audio visual
- audio signals
- visual information
- spontaneous speech
- signal processing
- audio video
- cross modal
- speech recognition technology
- visual data
- computer vision
- multimedia information
- classification accuracy
- audio recordings