htmldate: A Python package to extract publication dates from web pages.
Adrien Barbaresi
Published in: J. Open Source Softw. (2020)
Keyphrases
- web pages
- website
- search engine
- keywords
- programming language
- web search
- web documents
- web page classification
- open source
- web information extraction
- open source software
- web content
- data extraction
- software package
- automatic extraction
- web data
- web search engines
- general purpose
- data records
- geographic information
- digital libraries
- high level
- google search engine
- web content mining