Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective.
Huayang Li, Tian Lan, Zihao Fu, Deng Cai, Lemao Liu, Nigel Collier, Taro Watanabe, Yixuan Su
Published in: CoRR (2023)
Keyphrases
- data sets
- database
- data collection
- data points
- data distribution
- data processing
- small number
- training data
- high quality
- data analysis
- original data
- prior knowledge
- data sources
- knowledge discovery
- text mining
- neural network
- missing data
- high dimensional data
- synthetic data
- social networks
- data quality
- raw data
- statistical analysis
- learning algorithm
- feature selection
- viewpoint
- data structure