Cross-Care: Assessing the Healthcare Implications of Pre-training Data on Language Model Bias.
Shan Chen, Jack Gallifant, Mingye Gao, Pedro Moreira, Nikolaj Munch, Ajay Muthukkumar, Arvind Rajan, Jaya Kolluri, Amelia Fiske, Janna Hastings, Hugo J. W. L. Aerts, Brian Anthony, Leo Anthony Celi, William G. La Cava, Danielle S. Bitterman
Published in: CoRR (2024)
Keyphrases
- language model
- training data
- home care
- health care
- language modeling
- document retrieval
- n gram
- probabilistic model
- information retrieval
- patient care
- language modelling
- training set
- retrieval model
- speech recognition
- decision trees
- test collection
- vector space model
- learning algorithm
- mixture model
- statistical language models
- query expansion
- smoothing methods
- supervised learning
- ad hoc information retrieval
- context sensitive
- classification accuracy
- language model for information retrieval
- naive bayes
- translation model
- labeled data
- relevance model
- generative model
- feature selection
- query terms
- language models for information retrieval
- retrieval systems