LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark.
Zhenfei Yin
Jiong Wang
Jianjian Cao
Zhelun Shi
Dingning Liu
Mukai Li
Xiaoshui Huang
Zhiyong Wang
Lu Sheng
Lei Bai
Jing Shao
Wanli Ouyang
Published in:
NeurIPS (2023)
Keyphrases
multi modal
multiple modalities
audio visual
object recognition
semantic concepts
multimedia
video sequences
image analysis
multi modality
fusing multiple
uni modal