Token-level Identification of Multiword Expressions using Pre-trained Multilingual Language Models.
Raghuraman Swaminathan, Paul Cook
Published in: MWE@EACL (2023)
Keyphrases
- language model
- multiword
- pre-trained
- language modeling
- context-sensitive
- document retrieval
- probabilistic model
- n-gram
- speech recognition
- query expansion
- retrieval model
- information retrieval
- document representation
- cross-lingual
- test collection
- translation model
- language-independent
- digital libraries
- training data
- vector space model
- cross-language
- machine learning
- neural network
- natural language processing
- decision trees
- learning algorithm