NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation.
Jiazhao Zhang, Kunyu Wang, Rongtao Xu, Gengze Zhou, Yicong Hong, Xiaomeng Fang, Qi Wu, Zhizheng Zhang, He Wang
Published in: CoRR (2024)
Keyphrases
- programming language
- computer vision
- artificial intelligence
- natural language
- planning process
- autonomous navigation
- vision system
- real time
- language learning
- computational linguistics
- language processing
- plan recognition
- specification language
- post processing
- operational semantics
- case based planning
- model checking
- human vision
- planning problems
- logic programs
- image processing
- machine learning
- databases