Enhance word representation for out-of-vocabulary on Ubuntu dialogue corpus.

Jianxiong Dong, Jim Huang

Published in: CoRR (2018)

Keyphrases

out-of-vocabulary
spoken document retrieval
word segmentation
n-gram
hand-crafted
language model
parallel corpora
broadcast news
named entity recognition
cross-language information retrieval
spoken term detection
statistical machine translation
dialogue system
multiword
machine learning
cross-lingual
named entities
word sense
sentence level
term frequency
word pairs
spoken language
document retrieval
co-occurrence
hidden Markov models
natural language
keywords