MLWIKIR: A Python Toolkit for Building Large-scale Wikipedia-based Information Retrieval Datasets in Chinese, English, French, Italian, Japanese, Spanish and More.
Jibril Frej, Didier Schwab, Jean-Pierre Chevallet
Published in: CIRCLE (2020)
Keyphrases
- chinese english
- monolingual retrieval
- information retrieval
- wordnet
- document collections
- question answering
- information retrieval systems
- cross language retrieval
- linguistic resources
- text collections
- machine translation
- cross language information retrieval
- co occurrence
- knowledge base
- semantic information
- semantic relations
- language model
- machine translation system
- translation model
- natural language processing
- information extraction
- vector space model
- named entities
- query expansion
- semantic similarity
- text retrieval
- query translation
- relevance feedback
- document retrieval
- statistical machine translation
- artificial intelligence
- semantic features
- language independent
- text mining
- cross lingual