Multimodal city-verification on Flickr videos using acoustic and textual features.

Howard Lei, Jaeyoung Choi, Gerald Friedland

Published in: ICASSP (2012)

Keyphrases

textual features
bag of words
multimodal biometrics
video sequences
geo referenced
video frames
social media
multi modal
photo collections
audio features
visual features
video data
multimedia
image retrieval
event recognition
video content
web pages
user generated
image classification
user generated content
image collections
key frames
probabilistic model