PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network.

Qinghua Liu, Meng Ge, Zhizheng Wu, Haizhou Li
Published in: CoRR (2023)
Keyphrases
  • audio-visual
  • multi-modal
  • pose-invariant
  • visual information
  • visual data
  • multimedia
  • emotion recognition
  • co-occurrence
  • facial expression recognition