Sloparl - Slovenian parliamentary speech and text corpus for large vocabulary continuous speech recognition.
Andrej Zgank, Tomaz Rotovnik, Matej Grasic, Marko Kos, Damjan Vlaj, Zdravko Kacic
Published in: INTERSPEECH (2006)
Keyphrases
- text corpus
- grapheme to phoneme conversion
- automatic speech recognition
- speech recognizer
- speech recognition
- text corpora
- natural language processing
- named entities
- spoken language
- text documents
- speech signal
- broadcast news
- text mining
- topic models
- training corpus
- probabilistic model
- training data
- text collections
- background knowledge
- computational linguistics
- wikipedia articles
- knowledge discovery
- digital libraries