Classifying Textual Components of Bilingual Documents with Decision-Tree Support Vector Machines.
Xiao-Rong Lin, Chien-Yang Guo, Fu Chang
Published in: ICDAR (2011)
Keyphrases
- support vector
- decision trees
- logistic regression
- learning machines
- metadata
- free text
- information retrieval
- textual information
- pdf documents
- large margin classifiers
- keywords
- machine learning
- document collections
- textual features
- information retrieval systems
- parallel corpora
- kernel function
- multiword
- document clustering
- text content
- textual data
- text documents
- textual case based reasoning
- document classification
- support vector machine
- document retrieval
- web documents
- manually constructed
- cross language
- natural language
- relevant documents
- multimedia
- parallel corpus
- cross language information retrieval
- classification accuracy
- information extraction
- source language
- word pairs
- digital libraries
- machine translation