A Lexical Resource for the Identification of "Weak Words" in German Specification Documents.
Jennifer Krisch, Melanie Dick, Ronny Jauch, Ulrich Heid
Published in: LREC (2016)
Keyphrases
- word frequency
- linguistic information
- word pairs
- keywords
- text documents
- text corpus
- word similarity
- sentiment polarity
- linguistic analysis
- natural language text
- word frequencies
- word sense disambiguation
- syntactic information
- WordNet
- document space
- word spotting
- unknown words
- related words
- text corpora
- word meaning
- multiword
- lexical information
- word sense
- information retrieval systems
- syntactic categories
- lexical chains
- document representation
- word forms
- lexical features
- document similarity
- Chinese text
- information retrieval
- semantic relationships
- document clustering
- semantic similarity
- bilingual dictionaries
- document content
- semantic relations
- automatic summarization
- document retrieval
- semantic information
- natural language processing
- text mining
- keyword extraction
- sentiment classification
- parallel corpora
- text categorization
- topic hierarchy
- bag of words
- co-occurrence
- part of speech
- Arabic documents
- relevant documents
- search engine
- knowledge resources
- document collections
- n-gram
- cross language
- natural language
- metadata
- query expansion
- topic models
- web pages
- document level