PLAtE: A Large-scale Dataset for List Page Web Extraction.
Aidan San
Jan Bakus
Colin Lockard
David M. Ciemiewicz
Yangfeng Ji
Sandeep Atluri
Kevin Small
Heba Elfardy
Published in:
CoRR (2022)
Keyphrases
website
web pages
web information extraction
Chinese web
web scale
data extraction
web browsing
web content
web applications
home page
web documents
news pages
million images
keywords
page content
linked data
web snippets
Google search
real world
database
semantic web
web users
web mining
massive scale
page layout
information sources
web graph
web data
link structure
web news
content features
hyperlink structure
web communities
web technologies
user generated content
web search
information extraction
real life
web log mining