An Audio-Visual Dataset and Deep Learning Frameworks for Crowded Scene Classification.
Lam Pham, Dat Ngo, Phu X. Nguyen, Truong Van Hoang, Alexander Schindler
Published in: CoRR (2021)
Keyphrases
- audio visual
- scene classification
- learning frameworks
- multi modal
- object recognition
- natural scenes
- image classification
- biologically inspired
- visual information
- image representation
- visual words
- visual data
- bag of features
- natural images
- active learning
- multimedia
- semi supervised
- kernel methods
- semi supervised learning
- low level
- multiscale
- information retrieval
- machine learning
- data mining
- human computer interaction
- data points
- probabilistic model
- computer vision