Comparing MFCC and MPEG-7 audio features for feature extraction, maximum likelihood HMM and entropic prior HMM for sports audio classification.
Ziyou Xiong, Regunathan Radhakrishnan, Ajay Divakaran, Thomas S. Huang
Published in: ICME (2003)
Keyphrases
- mel frequency cepstral coefficients
- audio features
- feature extraction
- speech signal
- feature set
- hidden Markov models
- speech recognition
- speaker identification
- maximum likelihood
- acoustic features
- feature vectors
- genre classification
- audio visual
- low level
- automatic speech recognition
- music genre classification
- classification accuracy
- multimedia
- visual features
- extracting features
- pattern recognition
- automatic music genre classification
- audio stream
- image classification
- feature space
- Gaussian mixture model
- audio signal
- speaker recognition
- spectral features
- feature selection
- decision trees
- audio content
- noisy environments
- music information retrieval
- face recognition
- image processing
- multi modal
- extracted features
- visual speech
- visual information
- speaker diarization
- broadcast news
- sound source
- expectation maximization
- language model
- high level
- metadata