Addressing Index Collapse of Large-Codebook Speech Tokenizer with Dual-Decoding Product-Quantized Variational Auto-Encoder.
Haohan Guo, Fenglong Xie, Dongchao Yang, Hui Lu, Xixin Wu, Helen Meng
Published in: CoRR (2024)
Keyphrases
- decoding process
- error control
- vector quantization
- vector quantized
- rate distortion
- image segmentation
- speech signal
- speech recognition
- noisy channel
- pixel domain
- index structure
- finite state transducers
- audio visual
- turbo codes
- decoding algorithm
- text to speech
- automatic speech recognition
- vector quantizer
- bag of words
- image representation
- optical flow
- feature vectors
- bit rate
- motion estimation
- computational complexity