Reproducing Whisper-Style Training Using An Open-Source Toolkit And Publicly Available Data.
Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan S. Sharma, Wangyou Zhang, Yui Sudo, Muhammad Shakeel, Jee-Weon Jung, Soumi Maiti, Shinji Watanabe
Published in: ASRU (2023)
Keyphrases
- open source
- data sets
- high quality
- synthetic data
- data analysis
- original data
- neural network
- raw data
- data sources
- data collection
- data distribution
- training samples
- statistical analysis
- small number
- complex data
- relational databases
- database
- big data
- open source software
- labelled data
- training examples
- data processing
- data points
- end users
- data structure
- databases