Improving speech embedding using crossmodal transfer learning with audio-visual data.
Nam Le, Jean-Marc Odobez
Published in: Multim. Tools Appl. (2019)
Keyphrases
- visual data
- transfer learning
- audio visual
- visual information
- active learning
- labeled data
- machine learning
- contextual information
- collaborative filtering
- visual features
- cross domain
- video sequences
- image data
- machine learning algorithms
- video data
- high dimensional data
- reinforcement learning
- activity recognition
- text categorization
- transfer knowledge
- multimedia data
- text classification
- learning algorithm
- high dimensional
- human actions
- human motion
- visual content
- unlabeled data
- eye movements
- databases
- language model
- high level
- feature selection
- computer vision
- data mining