Creating an Aligned Corpus of Sound and Text: The Multimodal Corpus of Shakespeare and Milton.
Manex Agirrezabal
Published in: CoRR (2024)
Keyphrases
- supervised machine learning
- open domain
- text data
- newspaper articles
- broad coverage
- text corpora
- plain text
- multiword
- linguistic patterns
- text corpus
- recognizing textual entailment
- natural language text
- topic segmentation
- training corpus
- world knowledge
- lexical features
- English words
- information extraction systems
- word pairs
- text collections
- linguistic information
- document level
- sentence level
- conversational speech
- word sense
- noun phrases
- spontaneous speech
- test set
- information extraction
- named entity disambiguation
- anaphora resolution
- syntactic features
- computational linguistics
- natural language processing