General-purpose audio tagging by ensembling convolutional neural networks based on multiple features.
Kevin Wilkinghoff
Published in: DCASE (2018)
Keyphrases
- multiple features
- convolutional neural networks
- general purpose
- visual information
- data fusion
- multimedia
- feature space
- feature fusion
- metadata
- combining multiple
- feature vectors
- image retrieval
- metric learning
- multiple instance learning
- low level
- feature set
- multiscale
- multiple views
- incremental learning
- relevance feedback
- image classification
- visual features
- image processing
- learning algorithm
- pairwise
- pattern recognition
- high level
- search engine