Towards Transparency: Exploring LLM Trainings Datasets through Visual Topic Modeling and Semantic Frame.
Charles de Dampierre, Andrei Mogoutov, Nicolas Baumard
Published in: CoRR (2024)
Keyphrases
- topic modeling
- topic models
- text mining
- latent dirichlet allocation
- text classification
- modeling framework
- semantic search
- topic extraction
- latent topics
- tag information
- semantic similarity
- collaborative filtering
- text corpora
- text documents
- latent semantic analysis
- semantic information
- association rules
- probabilistic latent semantic analysis
- probabilistic topic models
- real world