Linguistic Ethnography: Identifying Dominant Word Classes in Text.
Rada Mihalcea, Stephen G. Pulman
Published in: CICLing (2009)
Keyphrases
- linguistic information
- natural language text
- syntactic analysis
- semantic information
- linguistic knowledge
- word order
- lexical semantics
- text mining
- English words
- text input
- word meaning
- English text
- related words
- natural language
- natural language generation
- word pairs
- keywords
- text corpus
- language generation
- information retrieval
- multiword
- n-gram
- text generation
- lexical features
- compressed text
- co-occurrence
- printed documents
- word level
- sentence similarity
- linguistic features
- sentence level
- string matching
- linguistic analysis
- Chinese text
- information extraction
- page layout
- printed text
- text segments
- word counts
- text retrieval
- stop words
- semantic network
- semantic representations
- text classification
- word segmentation
- syntactic structures
- natural language processing
- relation extraction
- text processing
- concept space
- machine translation system
- grounded theory
- punctuation marks