Tokenization Consistency Matters for Generative Models on Extractive NLP Tasks.
Kaiser SunPeng QiYuhao ZhangLan LiuWilliam Yang WangZhiheng HuangPublished in: CoRR (2022)
Keyphrases
- generative model
- hierarchical hidden markov models
- mixture model
- information extraction
- probabilistic model
- text summarization
- em algorithm
- conditional random fields
- discriminative models
- natural language processing
- hidden variables
- discriminative learning
- prior knowledge
- representational power
- named entities
- object categories
- semi supervised
- maximum entropy principle
- information retrieval
- natural language
- semi supervised learning
- global constraints