Arena Learning: Build Data Flywheel for LLMs Post-training via Simulated Chatbot Arena.
Haipeng Luo, Qingfeng Sun, Can Xu, Pu Zhao, Qingwei Lin, Jianguang Lou, Shifeng Chen, Yansong Tang, Weizhu Chen
Published in: CoRR (2024)
Keyphrases
- database
- prior knowledge
- training data
- data sets
- supervised learning
- input data
- reinforcement learning
- data analysis
- synthetic data
- data sources
- training samples
- learning models
- data quality
- original data
- raw data
- human experts
- training dataset
- high quality
- high dimensional data
- unsupervised learning
- data collection
- online learning
- machine learning
- learning algorithm
- knowledge discovery
- active learning
- social networks
- xml documents