The Zero Resource Speech Benchmark 2021: Metrics and baselines for unsupervised spoken language modeling.
Tu Anh Nguyen, Maureen de Seyssel, Patricia Rozé, Morgane Rivière, Evgeny Kharitonov, Alexei Baevski, Ewan Dunbar, Emmanuel Dupoux
Published in: CoRR (2020)
Keyphrases
- language modeling
- speech recognition
- language model
- automatic speech recognition
- trec collections
- spoken language
- document level
- finite state transducers
- retrieval model
- speech signal
- n-gram
- information retrieval
- probabilistic model
- query expansion
- document retrieval
- word error rate
- broadcast news
- cross-lingual
- relevance model
- unsupervised learning
- test collection
- statistical language models
- statistical language modeling
- semi-supervised
- term weighting
- text classification
- hidden markov models
- query terms
- mixture model
- smoothing methods
- vector space model
- topic modeling
- text mining
- relevance feedback
- database systems
- search engine
- improvements in retrieval effectiveness
- machine learning