Multi-encoder attention-based architectures for sound recognition with partial visual assistance.
Wim Boes, Hugo Van hamme
Published in: CoRR (2022)
Keyphrases
- selective attention
- recognition rate
- visual learning
- visual perception
- recognition accuracy
- object recognition
- recognition algorithm
- visual processing
- visual information
- visual attention
- bit rate
- low complexity
- image recognition
- visual field
- visual search
- visual recognition
- visual features
- recognition process
- pattern recognition
- computer vision
- low level
- focus of attention
- video compression
- hidden Markov models
- rate control
- action recognition
- gesture recognition
- character recognition
- rate distortion