Data Generation Using Large Language Models for Text Classification: An Empirical Case Study.
Yinheng Li, Rogerio Bonatti, Sara Abdali, Justin Wagle, Kazuhito Koishida
Published in: CoRR (2024)
Keyphrases
- data generation
- language model
- text classification
- language modeling
- co-training
- n-gram
- document retrieval
- statistical language modeling
- text categorization
- active learning
- probabilistic model
- data streams
- language modelling
- query expansion
- retrieval model
- bag of words
- machine learning
- streaming data
- test collection
- statistical language models
- text mining
- naive bayes
- labeled data
- information retrieval
- smoothing methods
- feature selection
- relevance model
- text classifiers
- kNN
- text documents
- multi-label
- unsupervised learning
- data sets
- unlabeled data
- change detection
- high throughput
- transfer learning
- language models for information retrieval