Towards the Corpus of Latvian Romani Texts: Deciphering the Manuscripts in Jānis Leimanis' Archive.
Natalia Perkova, Kirill Kozhanov
Published in: DHNB (2022)
Keyphrases
- natural language text
- training corpus
- newspaper articles
- information extraction systems
- English words
- world knowledge
- text corpus
- cultural heritage
- information extraction
- manually annotated
- linguistic patterns
- word sense
- test set
- linguistic information
- scientific papers
- writing style
- text documents
- keywords
- linguistic features
- natural language
- digital archives
- machine learning
- text corpora
- relation extraction
- data sets
- coreference resolution
- text processing
- open domain
- textual features
- automatic extraction
- free text
- search engine
- historical documents
- spoken dialog
- Chinese texts
- GENIA corpus