PDF to HTML Conversion: Having a Usable Web Document.
Muhammad Afzal Bhatti, Adeel Ahmad. Published in: ICDIM (2006)
Keyphrases
- web documents
- html documents
- web pages
- semi structured
- information extraction
- probability density function
- website
- search engine
- web search engines
- textual information
- keywords
- pdf files
- web search
- probability distribution function
- document representation
- vector space model
- web data
- prefetching
- web browser
- web content
- mixture model
- web logs
- structured documents
- data extraction
- statistical methods
- probability distribution
- dom tree
- probability distribution functions