The ELTE.DH Pilot Corpus - Creating a Handcrafted Gigaword Web Corpus with Metadata.
Balázs Indig
Árpád Knap
Zsófia Sárközi-Lindner
Mária Timári
Gábor Palkó
Published in:
WAC@LREC (2020)
Keyphrases
metadata
databases
web pages
manually annotated
semantic web
database
digital libraries
hand crafted
web applications
search interface
e-learning
linguistic features
end users
natural language processing
text classification
learning resources
linked data
plain text
open access
specific domains
web page content