Better Than Whitespace: Information Retrieval for Languages without Custom Tokenizers.
Odunayo Ogundepo, Xinyu Zhang, Jimmy Lin
Published in: CoRR (2022)
Keyphrases
- information retrieval
- domain specific
- multilingual
- information retrieval systems
- expressive power
- relevant documents
- multilingual information retrieval
- databases
- language independent
- document collections
- structured queries
- document retrieval
- query expansion
- test collection
- text mining
- information access
- neural network
- search engine
- language model
- question answering
- information seeking
- query terms
- digital libraries
- query translation
- target language
- computational linguistics
- language identification
- boolean queries
- text classification