Towards better structured and less noisy Web data: Oscar with Register annotations.
Veronika Laippala, Anna Salmela, Samuel Rönnqvist, Alham Fikri Aji, Li-Hsin Chang, Asma Dhifallah, Larissa Goulart, Henna Kortelainen, Marc Pàmies, Deise Prina Dutra, Valtteri Skantsi, Lintang Sutawika, Sampo Pyysalo
Published in: W-NUT@COLING (2022)
Keyphrases
- incremental mining
- web data
- web mining
- semi structured
- structured data
- web pages
- web usage mining
- web information
- web content
- web documents
- web crawling
- metadata
- web sources
- data repositories
- web information extraction
- deep web
- active learning
- real world
- databases
- query logs
- link structure
- social network analysis
- information extraction
- relational databases
- data sets