Publication: PDF text classification to leverage information extraction from publication reports.