Multi-encoder attention-based architectures for sound recognition with partial visual assistance.
Wim Boes, Hugo Van hamme
Published in: EURASIP J. Audio Speech Music. Process. (2022)
Keyphrases
- selective attention
- visual learning
- visual perception
- object recognition
- visual processing
- recognition rate
- visual attention
- recognition accuracy
- visual information
- visual features
- recognition algorithm
- image recognition
- recognition process
- visual search
- character recognition
- video codec
- automatic recognition
- video compression
- gesture recognition
- partial occlusion
- low complexity
- rate distortion
- feature extraction
- neural network
- image classification
- visual field
- motion estimation
- handwritten characters