A Comparative Evaluation of a New Unsupervised Sentence Boundary Detection Approach on Documents in English and Portuguese.
Jan Strunk, Carlos Nascimento Silla Jr., Celso A. A. Kaestner
Published in: CICLing (2006)
Keyphrases
- comparative evaluation
- boundary detection
- parallel corpus
- cross language
- natural language
- source language
- stop words
- brazilian portuguese
- training corpus
- document collections
- document retrieval
- scoring methods
- document set
- image segmentation
- detection algorithm
- machine translation
- information retrieval
- sentence level
- document level
- object detection and recognition
- text retrieval
- target language
- web documents
- information retrieval systems
- statistical machine translation
- parallel corpora
- text documents
- relevant documents
- query translation
- supervised learning
- berkeley segmentation dataset
- syntactic categories
- part of speech
- cross lingual
- cross language information retrieval
- vector space model
- retrieval systems
- semantic roles
- pronominal anaphora
- parse tree
- question answering
- semi supervised
- image processing
- noun phrases
- text summarization
- information extraction
- computer vision