Differences in the audio-visual detection of word prominence from Japanese and English speakers.
Martin Heckmann, Keisuke Nakamura, Kazuhiro Nakadai
Published in: AVSP (2013)
Keyphrases
- audio visual
- multi modal
- native speakers
- multi stream
- visual data
- visual information
- audio visual speech recognition
- person authentication
- co occurrence
- n gram
- machine translation
- speech recognition
- spatio temporal
- high dimensional
- computer vision
- language model
- chinese characters
- vehicle detection
- text to speech
- image sequences
- e learning