Multimodal attention networks for low-level vision-and-language navigation.
Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
Published in: Comput. Vis. Image Underst. (2021)
Keyphrases
- low level vision
- image understanding
- high level vision
- markov random field
- image restoration
- object recognition
- computer vision
- energy minimization
- image analysis
- language learning
- social networks
- multi modal
- programming language
- natural language
- indoor environments
- neural network
- min cut max flow
- post processing
- pairwise
- stereo matching
- complex networks
- face recognition
- high level
- multimodal interaction
- multimedia