Using a boosted tree classifier for text segmentation in hand-annotated documents.
Xujun Peng, Srirangaraj Setlur, Venu Govindaraju, Ramachandrula Sitaram
Published in: Pattern Recognit. Lett. (2012)
Keyphrases
- text segmentation
- document set
- sentence level
- information retrieval
- document collections
- relevant documents
- text lines
- document clustering
- web documents
- retrieval systems
- test collection
- multi document summarization
- information retrieval systems
- text documents
- document retrieval
- keywords
- clustering method
- vector space model
- machine learning
- novelty detection
- textual information
- graph model
- character recognition
- ranked list
- sentiment analysis
- user queries