Hypergraph Multi-modal Large Language Model: Exploiting EEG and Eye-tracking Modalities to Evaluate Heterogeneous Responses for Video Understanding.
Minghui Wu, Chenxu Zhao, Anyang Su, Donglin Di, Tianyu Fu, Da An, Min He, Ya Gao, Meng Ma, Kun Yan, Ping Wang
Published in: CoRR (2024)
Keyphrases
- multi modal
- eye tracking
- language model
- multiple modalities
- eye tracking data
- cross modal
- single modality
- video search
- semantic concepts
- eye tracker
- probabilistic model
- speech recognition
- eye movements
- document retrieval
- query expansion
- retrieval model
- visual attention
- human computer interaction
- information retrieval
- test collection
- multi modality
- video streams
- audio visual
- video data
- eye gaze
- video sequences
- query terms
- visual analysis
- translation model
- image annotation
- visual data
- video content
- visual cues
- high dimensional
- bayesian networks
- feature selection
- computer vision
- machine learning