TEI P5 as a Text Encoding Standard for Multilevel Corpus Annotation.
Piotr Banski, Adam Przepiórkowski
Published in: DH (2010)
Keyphrases
- broad coverage
- open domain
- supervised machine learning
- text data
- annotated corpus
- inter-annotator agreement
- newspaper articles
- image retrieval
- lexical features
- automatically created
- text corpus
- training corpus
- cross-media
- text collections
- free text
- text mining
- keywords
- manually annotated
- noun phrases
- news stories
- world knowledge
- text retrieval
- text documents
- text classification
- scientific papers
- active learning
- document corpus
- named entity disambiguation