Echotune: A Modular Extractor Leveraging the Variable-Length Nature of Speech in ASR Tasks.
Sizhou Chen, Songyang Gao, Sen Fang
Published in: CoRR (2023)
Keyphrases
- variable length
- automatic speech recognition
- fixed length
- speech recognition
- speech signal
- spontaneous speech
- statistical dependencies
- convolutional codes
- n-gram
- noisy environments
- speech synthesis
- spoken words
- speech corpus
- language model
- hidden Markov models
- text compression
- speech sounds
- Bayesian networks
- image segmentation
- run length encoding
- image processing
- computer vision
- machine learning