The PAISÀ Corpus of Italian Web Texts.
Verena Lyding, Egon Stemle, Claudia Borghetti, Marco Brunello, Sara Castagnoli, Felice Dell'Orletta, Henrik Dittmann, Alessandro Lenci, Vito Pirrelli
Published in: WaC@EACL (2014)
Keyphrases
- newspaper articles
- legal texts
- textual features
- website
- web documents
- web applications
- information extraction systems
- linked data
- manually annotated
- web resources
- specific domains
- world knowledge
- linguistic information
- web content
- semantic web
- web technologies
- natural language text
- web pages
- database
- multiword
- web data
- text documents
- test set
- user interface