LongFNT: Long-Form Speech Recognition with Factorized Neural Transducer.
Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian. Published in: ICASSP (2023)
Keyphrases
- speech recognition
- hidden Markov models
- speech signal
- speech processing
- language model
- speaker identification
- pattern recognition
- automatic speech recognition
- speech recognition systems
- speech synthesis
- noisy environments
- speech recognizer
- speech understanding
- handwriting recognition
- keyword spotting
- speech recognition technology
- speech retrieval
- speech recognition errors
- speech recognizers
- neural network
- speaker dependent
- statistically significant
- visual features
- image processing
- computer vision