Between Comparable and Parallel: English-Czech Corpus from Wikipedia.
Adéla Stromajerová, Vít Baisa, Marek Blahus
Published in: RASLAN (2016)
Keyphrases
- cross language
- link grammar
- computing semantic relatedness
- explicit semantic analysis
- language independent
- person names
- broad coverage
- document collections
- wikipedia articles
- open domain
- statistical machine translation
- parallel corpus
- named entity disambiguation
- natural language
- world knowledge
- wide coverage
- cl sr
- cross lingual
- natural language text
- english words
- mono lingual
- english language
- knowledge base
- machine translation
- training corpus
- named entities
- text retrieval
- wordnet
- semantic relations
- answer questions
- question answering
- semantic roles
- topic tracking
- language learning
- automatically generated
- multiword
- text categorization
- unknown words
- parallel corpora
- text corpus
- word sense
- query translation
- manually generated
- tree bank
- semantic information
- noun phrases
- linguistic features
- natural language processing
- search engine