Cross-Modality and Within-Modality Regularization for Audio-Visual Deepfake Detection.
Heqing Zou, Meng Shen, Yuchen Hu, Chen Chen, Eng Siong Chng, Deepu Rajan
Published in: ICASSP (2024)
Keyphrases
- audio visual
- multi modal
- visual information
- audio visual speech recognition
- visual data
- multi stream
- multimedia
- video summarization
- person authentication
- emotion recognition
- temporal context
- data sets
- co occurrence
- audio features
- text mining
- data management
- spatio temporal
- visual features
- high dimensional
- multimedia content
- multiscale