Sound-Image Grounding Based Focusing Mechanism for Efficient Automatic Spoken Language Acquisition.
Mingxin Zhang, Tomohiro Tanaka, Wenxin Hou, Shengzhou Gao, Takahiro Shinozaki
Published in: INTERSPEECH (2020)
Keyphrases
- language acquisition
- image data
- input image
- computational model
- single image
- image content
- image features
- image retrieval
- template matching
- image classification
- language learning
- image representation
- image collections
- feature points
- image regions
- segmentation method
- multiscale
- online learning
- edge detection
- artificial neural networks
- image pixels
- neural network
- natural language learning
- segmentation algorithm
- image analysis
- similarity measure
- face recognition
- image segmentation
- multimedia
- decision making