Comparing web-crawled and traditional corpora.
Václav Cvrcek, Zuzana Komrsková, David Lukes, Petra Poukarová, Anna Rehorková, Adrian Jan Zasina, Vladimír Benko
Published in: Lang. Resour. Evaluation (2020)
Keyphrases
- web pages
- website
- web applications
- natural language processing
- semantic web
- information sources
- web documents
- web data
- web users
- end users
- linked data
- information resources
- link analysis
- learning algorithm
- web information retrieval
- world wide
- web logs
- web technologies
- online communities
- semi structured
- information extraction