Naturalization of Text by the Insertion of Pauses and Filler Words.
Richa Sharma, Parth Vipul Shah, Ashwini M. Joshi
Published in: CoRR (2020)
Keyphrases
- text documents
- keywords
- english words
- text recognition
- proper nouns
- short text
- chinese text
- related words
- text corpus
- text corpora
- linguistic analysis
- world knowledge
- natural language text
- lexical features
- arabic text
- lexical information
- chinese texts
- word frequency
- textual features
- text mining
- text databases
- document level
- text retrieval
- syntactic analysis
- information retrieval
- syntactic categories
- word pairs
- multiword
- noun phrases
- unknown words
- punctuation marks
- n gram
- free text
- text representation
- syntactic structures
- training corpus
- keyword extraction
- printed text
- word sense
- word sense disambiguation
- arabic language
- printed documents
- hidden markov models
- historical documents
- spontaneous speech
- information extraction
- linguistic information
- word segmentation
- speech signal
- text classification
- bag of words