SHMamba: Structured Hyperbolic State Space Model for Audio-Visual Question Answering.
Zhe Yang, Wenrui Li, Guanghui Cheng
Published in: CoRR (2024)
Keyphrases
- audio visual
- state space model
- passage retrieval
- question answering
- multi modal
- state estimation
- visual information
- kalman filter
- autoregressive
- visual data
- information extraction
- natural language processing
- information retrieval
- structured data
- natural language
- named entities
- multimedia
- image data
- low level
- multimedia data
- data mining