Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation.
Yi Luo, Nima Mesgarani
Published in: IEEE ACM Trans. Audio Speech Lang. Process. (2019)
Keyphrases
- speech recognition
- speech signal
- recognition engine
- frequency domain
- audio visual
- text to speech
- speech synthesis
- automatic speech recognition
- spoken language
- endpoint detection
- signal processing
- human visual system
- Fourier transform
- wavelet transform
- audio stream
- multiscale
- wavelet packet
- multi stream
- speech processing
- automatic speech recognition systems
- speaker recognition
- spoken dialogue systems
- broadcast news
- data sets
- hidden Markov models
- color images
- multiresolution
- neural network