Echotune: A Modular Extractor Leveraging the Variable-Length Nature of Speech in ASR Tasks.

Sizhou Chen, Songyang Gao, Sen Fang

Published in: CoRR (2023)

Keyphrases

variable length
automatic speech recognition
fixed length
speech recognition
speech signal
spontaneous speech
statistical dependencies
convolutional codes
n-gram
noisy environments
speech synthesis
spoken words
speech corpus
language model
hidden Markov models
text compression
speech sounds
Bayesian networks
image segmentation
run length encoding
image processing
computer vision
machine learning