Slovak Language Model from Internet Text Data.
Ján Stas, Daniel Hládek, Matús Pleva, Jozef Juhár
Published in: COST 2102 Training School (2010)
Keyphrases
- language model
- text data
- text classification
- language modeling
- text mining
- n-gram
- document representation
- probabilistic model
- document retrieval
- structured data
- information retrieval
- high dimensional
- retrieval model
- document collections
- text documents
- query expansion
- text clustering
- high dimensional data
- mixture model
- test collection
- smoothing methods
- web pages
- knowledge discovery
- query terms
- data mining
- data sets
- web search
- natural language processing
- pattern recognition
- decision trees