MSG-BART: Multi-Granularity Scene Graph-Enhanced Encoder-Decoder Language Model for Video-Grounded Dialogue Generation.
Hongcheng Liu, Zhe Chen, Hui Li, Pingjie Wang, Yanfeng Wang, Yu Wang
Published in: ICASSP (2024)
Keyphrases
- language model
- multi granularity
- video sequences
- language modeling
- motion estimation
- low complexity
- bit budget
- video encoder
- video codec
- distributed video coding
- n gram
- multi user
- query expansion
- information retrieval
- probabilistic model
- dynamic integration
- retrieval model
- video data
- wyner ziv
- test collection
- real time
- rate distortion
- motion vectors
- multimedia
- video frames
- video content
- mixture model
- moving objects
- natural language
- bit rate
- image sequences
- location aware
- ad hoc information retrieval
- key frames
- context sensitive
- video coding
- smoothing methods
- semi supervised