Document hierarchies from text and links.
Qirong Ho, Jacob Eisenstein, Eric P. Xing
Published in: WWW (2012)
Keyphrases
- text documents
- digital documents
- document analysis
- web documents
- document processing
- keywords
- information retrieval
- text content
- textual content
- document content
- page layout analysis
- textual documents
- text clustering
- semantic information
- scientific documents
- multimedia documents
- text mining
- document categorization
- technical papers
- text collections
- database
- document images
- electronic documents
- extractive summarization
- document corpus
- automatic text summarization
- retrieval engine
- printed documents
- text retrieval
- document structure
- wikipedia articles
- text corpus
- information retrieval systems
- authorship attribution
- scientific papers
- document classification
- text categorization
- document representation
- digital libraries
- link analysis
- vector space model
- noun phrases
- link structure
- structured documents
- latent semantic analysis
- named entities
- document collections
- wikipedia pages
- document retrieval
- printed text
- textual data
- text summarization
- temporal expressions
- keyword extraction
- web pages
- scanned documents
- text representation
- document clustering
- information extraction
- text data
- automatic summarization
- related documents
- text classifiers
- word level
- anchor text
- wordnet
- co-occurrence
- document level