
Attention guided deep audio-face fusion for efficient speaker naming.

Xin Liu, Jiajia Geng, Haibin Ling, Yiu-ming Cheung
Published in: Pattern Recognit. (2019)
Keyphrases
  • audio visual
  • multimodal fusion
  • cost effective
  • face images
  • signal processing
  • multimedia
  • pattern recognition
  • computationally efficient
  • speech recognition
  • keypoints
  • broadcast news
  • visual speech
  • audio stream