From archive to corpus: transcription and annotation in the creation of signed language corpora.
Trevor Johnston
Published in: PACLIC (2008)
Keyphrases
- annotated corpus
- parallel corpus
- hand crafted
- text corpora
- metadata
- comparable corpora
- programming language
- linguistic patterns
- language learning
- named entities
- relation extraction
- automatic annotation
- topic segmentation
- wide coverage
- cross lingual
- linguistic features
- computational linguistics
- linguistic knowledge
- text corpus
- statistical machine translation
- natural language
- language independent
- cross language information retrieval
- text data
- image annotation
- natural language processing
- machine translation
- spanish language
- training corpus
- semantic annotation
- word frequency
- text categorization
- document corpus
- automatic transcription
- target language
- inter annotator agreement
- news corpus
- wordnet
- news articles
- named entity recognition
- word pairs
- multiword
- text collections