Integrated Segmentation and Recognition of Mixed Chinese/English Document.
Yong Xia, Baihua Xiao, Chun-Heng Wang, Ruwei Dai
Published in: ICDAR (2007)
Keyphrases
- chinese english
- document analysis
- text lines
- linguistic resources
- word segmentation
- document images
- web documents
- out of vocabulary
- machine translation
- document collections
- information retrieval systems
- retrieval systems
- information retrieval
- document representation
- character recognition
- word level
- cross language information retrieval
- text collections
- tf idf
- document clustering
- translation model
- document retrieval
- conditional random fields
- digital libraries
- keywords