DUKweb: Diachronic word representations from the UK Web Archive corpus.
Adam Tsakalidis, Pierpaolo Basile, Marya Bazzi, Mihai Cucuringu, Barbara McGillivray
Published in: CoRR (2021)
Keyphrases
- word frequencies
- text corpus
- multiword
- sentence level
- English words
- word pairs
- training corpus
- linguistic information
- unknown words
- natural language text
- lexical features
- n-gram
- word sense
- noun phrases
- statistical machine translation
- parallel corpus
- writing style
- word frequency
- conversational speech
- word clouds
- machine translation system
- manually annotated
- word sense disambiguation
- co-occurrence
- related words
- text corpora
- recognizing textual entailment
- word level
- search engine
- translation model
- social networks