Publication: Enriching a document collection by integrating information extraction and PDF annotation.