Estimation of speaker age and height from speech signal using bi-encoder transformer mixture model.
Tarun Gupta, Duc-Tuan Truong, Tran The Anh, Eng Siong Chng
Published in: INTERSPEECH (2022)
Keyphrases
- mixture model
- Gaussian mixture model
- speech signal
- speaker recognition
- speech recognition
- speaker identification
- density estimation
- automatic speech recognition
- language model
- vocal tract
- mel frequency cepstral coefficients
- EM algorithm
- automatic speech recognition systems
- probabilistic model
- acoustic features
- expectation maximization
- generative model
- unsupervised learning
- maximum likelihood
- noisy environments
- model selection
- maximum likelihood estimation
- probability density function
- speaker verification
- hidden Markov models
- bit rate
- Bayesian information criterion
- broadcast news
- machine learning
- pattern recognition
- feature extraction
- Gaussian distribution
- blind source separation
- motion estimation
- high dimensional
- fundamental frequency
- speech quality
- feature selection