The impact of lacking metadata for the measurement of cultural and linguistic change using the Google Ngram data sets - Reconstructing the composition of the German corpus in times of WWII.
Alexander Koplenig
Published in: Digit. Scholarsh. Humanit. (2017)
Keyphrases
- n gram
- data sets
- metadata
- language model
- digital libraries
- linguistic features
- part of speech
- database
- linguistic information
- search engine
- linguistic patterns
- reference resolution
- website
- training data
- digital archives
- natural language processing
- natural language text
- inside outside algorithm
- measurement data
- language independent
- information retrieval
- text data
- cross cultural
- multiword
- machine translation
- machine translation system
- learning resources
- probabilistic model
- social media
- language specific
- multimedia
- neural network