Web Content Filtering Through Knowledge Distillation of Large Language Models.
Tamás Vörös, Sean Paul Bergeron, Konstantin Berlin
Published in: WI/IAT (2023)
Keyphrases
- language model
- web content
- language modeling
- website
- probabilistic model
- n gram
- document retrieval
- speech recognition
- web pages
- query expansion
- retrieval model
- information retrieval
- language modelling
- test collection
- semantic browsing
- language model for information retrieval
- user generated
- pseudo relevance feedback
- web documents
- context sensitive
- web data
- search engine
- text categorization
- statistical language models
- language models for information retrieval
- vector space model
- hidden markov models
- term dependencies
- machine learning
- okapi bm
- ad hoc information retrieval
- web search engines