MCSCSet: A Specialist-annotated Dataset for Medical-domain Chinese Spelling Correction.
Wangjie Jiang, Zhihao Ye, Zijing Ou, Ruihui Zhao, Jianguang Zheng, Yi Liu, Siheng Li, Bang Liu, Yujiu Yang, Yefeng Zheng
Published in: CoRR (2022)
Keyphrases
- medical domain
- spelling correction
- english text
- context sensitive
- domain specific
- chinese characters
- clinical practice
- text mining
- word sense disambiguation
- medical records
- search queries
- information extraction
- artificial intelligence
- clinical trials
- pattern recognition
- image processing
- discriminative training
- log linear models
- database
- feature set
- similarity measure
- word segmentation