Multilingual Open Text 1.0: Public Domain News in 44 Languages.
Chester Palen-Michel, June Kim, Constantine Lignos
Published in: CoRR (2022)
Keyphrases
- multi lingual
- language specific
- language independent
- cross lingual
- multilingual documents
- web news
- news articles
- multilingual information retrieval
- indian languages
- text summarization
- keywords
- machine translation system
- cross media
- english text
- news stories
- financial news
- text retrieval
- text generation
- short texts
- language resources
- natural language
- text documents
- cross lingual information retrieval
- language identification
- machine translation
- information access
- arabic language
- native language
- comparable corpora
- information retrieval
- cross language
- news sources
- expressive power
- news video
- digital libraries
- online news
- broadcast news
- language modeling
- text categorization
- manually constructed
- n gram
- textual content
- natural language generation