Optimizing Corpus Creation for Training Word Embedding in Low Resource Domains: A Case Study in Autism Spectrum Disorder (ASD).
Yang Gu, Gondy Leroy, Sydney Pettygrove, Maureen K. Galindo, Margaret Kurzius-Spencer
Published in: AMIA (2018)
Keyphrases
- autism spectrum disorder
- training corpus
- qualitative analysis
- high risk
- test set
- word frequencies
- text corpus
- unknown words
- social behavior
- statistical machine translation
- quantitative analysis
- sentence level
- co-occurrence
- linguistic information
- n-gram
- training set
- word sense
- content analysis
- multiword
- natural language text
- text classification
- visual perception
- parallel corpus
- English words
- spontaneous speech
- collaborative learning
- decision support system
- risk factors