Audio-Visual Segmentation with Semantics.
Jinxing Zhou, Xuyang Shen, Jianyuan Wang, Jiayi Zhang, Weixuan Sun, Jing Zhang, Stan Birchfield, Dan Guo, Lingpeng Kong, Meng Wang, Yiran Zhong
Published in: CoRR (2023)
Keyphrases
- audio visual
- multi modal
- temporal segmentation
- visual information
- multimedia
- image segmentation
- visual data
- temporal context
- audio visual speech recognition
- emotion recognition
- semantic information
- multiscale
- hidden Markov models
- video summarization
- face recognition
- multi stream
- person authentication
- image regions
- knowledge base
- multimodal fusion
- data sets