Harvest - An Open Source Toolkit for Extracting Posts and Post Metadata from Web Forums.
Albert Weichselbraun, Adrian M. P. Brasoveanu, Roger Waldvogel, Fabian Odoni
Published in: WI/IAT (2020)
Keyphrases
- web forums
- open source
- user generated content
- metadata
- web mining
- social media
- open standards
- open source software
- digital libraries
- metadata management
- structured data
- source code
- scripting language
- recommender systems
- learning resources
- multimedia
- case study
- user generated
- databases
- Dublin Core
- search tools
- web crawling
- information resources
- focused crawling
- resource discovery
- digital documents
- communication channels
- web crawler
- metadata extraction