An Audio-Visual Dataset and Deep Learning Frameworks for Crowded Scene Classification.
Lam Pham, Dat Ngo, Tho Nguyen, Phu X. Nguyen, Truong Van Hoang, Alexander Schindler
Published in: CBMI (2022)
Keyphrases
- audio visual
- scene classification
- learning frameworks
- object recognition
- multi modal
- image classification
- biologically inspired
- natural scenes
- visual information
- image representation
- visual words
- visual data
- multimedia
- active learning
- bag of features
- visual features
- natural images
- kernel methods
- image sequences
- human computer interaction
- data sets
- spatial information
- bag of words
- multiscale
- feature extraction
- visual content
- video sequences
- image features
- image processing
- computer vision