Detecting automatically the layout of clinical documents to enhance the performances of downstream natural language processing.
Christel Gérardin, Perceval Wajsbürt, Basile Dura, Alice Calliger, Alexandre Mouchet, Xavier Tannier, Romain Bey
Published in: CoRR (2023)
Keyphrases
- natural language processing
- free text
- machine learning
- portuguese language
- human readable
- patient records
- text analysis
- text mining
- document collections
- information extraction
- medical records
- text documents
- linguistic analysis
- textual data
- information retrieval
- wordnet
- knowledge representation
- page layout
- xml documents
- natural language
- information retrieval systems
- patient data
- computational linguistics
- document classification
- automatically generated
- document clustering
- document image retrieval
- pre-classified
- relevant documents
- semantic information
- text fragments
- text processing
- co-occurrence
- artificial intelligence
- vector space model
- question answering
- web documents
- machine translation
- semantic relations
- word sense disambiguation
- medical data
- semantic analysis
- computational biology
- clinical practice
- supply chain
- document analysis
- text summarization
- named entity recognition
- helping users
- extraction rules
- document retrieval
- retrieved documents
- part of speech