Standardizing Language with Word Embeddings and Language Modeling in Reports of Near Misses in Seveso Industries.
Simone Bruno, Silvia Maria Ansaldi, Patrizia Agnello, Fabio Massimo Zanzotto
Published in: CLiC-it (2019)
Keyphrases
- language modeling
- n-gram
- language model
- cross-lingual
- parallel corpus
- term weighting
- word segmentation
- retrieval model
- Chinese text retrieval
- translation model
- comparable corpora
- statistical language modeling
- query expansion
- information retrieval
- vector space
- probabilistic model
- text classification
- linguistic knowledge
- target language
- language independent
- relevance model
- distance measure
- natural language
- improvements in retrieval effectiveness
- test collection
- document retrieval
- tf-idf
- retrieval effectiveness
- bilingual dictionaries
- term dependencies
- query terms