Cem Mil Podcasts: A Spoken Portuguese Document Corpus.
Edgar Tanaka, Ann Clifton, Joana Correia, Sharmistha Jat, Rosie Jones, Jussi Karlgren, Winstead Zhu
Published in: CoRR (2022)
Keyphrases
- document corpus
- multiple instance learning
- information retrieval
- document clustering
- speech recognition
- higher education
- professional development
- keywords
- supervised learning
- blended learning
- contingency tables
- cross language
- topic detection
- multi instance learning
- image classification
- multi-class
- semi-supervised learning
- bag of words
- active learning
- automatic speech recognition
- training set