A Reappraisal of Sentence and Token Splitting for Life Sciences Documents.
Katrin Tomanek, Joachim Wermter, Udo Hahn
Published in: MedInfo (2007)
Keyphrases
- life sciences
- sentence level
- information retrieval
- word frequency
- relevant documents
- information retrieval systems
- document level
- biological data
- scientific data
- data integration
- document collections
- automatic summarization
- document set
- xml documents
- natural language
- poses challenges
- multi document summarization
- text documents
- sentence similarity
- keywords
- sentiment polarity
- document clustering
- statistical analysis
- data management
- data analysis
- machine learning
- data mining
- text corpus
- data sets
- database