MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition.
Xiaohuan Zhou, Jiaming Wang, Zeyu Cui, Shiliang Zhang, Zhijie Yan, Jingren Zhou, Chang Zhou
Published in: CoRR (2022)
Keyphrases
- multi modal
- speech recognition
- multi task
- learning tasks
- hidden Markov models
- language model
- multi class
- speech signal
- learning problems
- multi modality
- pattern recognition
- transfer learning
- automatic speech recognition
- bit rate
- supervised learning
- training set
- audio visual
- motion estimation
- speaker identification
- high dimensional
- discriminative training
- learning experience
- feature set
- active learning
- reinforcement learning
- image processing