On the Construction of Multilingual Corpora for Clinical Text Mining.
Fabián Villena, Urs Eisenmann, Petra Knaup, Jocelyn Dunstan, Matthias Ganzinger
Published in: MIE (2020)
Keyphrases
- text mining
- natural language processing
- text corpora
- medical domain
- text data
- information extraction
- cross lingual
- text classification
- text documents
- data mining
- digital libraries
- textual documents
- unstructured information
- parallel corpus
- biomedical literature
- knowledge discovery
- information retrieval
- text clustering
- traditional chinese medicine
- construction process
- clinical data
- comparable corpora
- wide coverage
- textual data
- document clustering
- topic models
- clinical trials
- web mining
- named entities
- wordnet
- machine learning
- computational linguistics
- patient data
- topic modeling
- clinical applications
- latent dirichlet allocation