Tie Your Embeddings Down: Cross-Modal Latent Spaces for End-to-end Spoken Language Understanding.
Bhuvan Agrawal, Markus Müller, Martin Radfar, Samridhi Choudhary, Athanasios Mouchtaris, Siegfried Kunzmann
Published in: CoRR (2020)
Keyphrases
- language understanding
- end to end
- cross modal
- multi modal
- natural language understanding
- language processing
- multimedia retrieval
- image retrieval
- natural language
- visual recognition
- visual data
- dialogue system
- semantic interpretation
- spoken dialogue systems
- multimedia databases
- low dimensional
- general knowledge
- dimensionality reduction
- high dimensional data
- visual similarity
- cognitive psychology
- multimedia