SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data.
Hsuan-Fu Wang, Yi-Jen Shih, Heng-Jui Chang, Layne Berry, Puyuan Peng, Hung-yi Lee, Hsin-Min Wang, David Harwath
Published in: CoRR (2024)
Keyphrases
- multi task
- learning tasks
- multi task learning
- image data
- multiple tasks
- learning process
- multitask learning
- learning problems
- learning algorithm
- training data
- reinforcement learning
- active learning
- supervised learning
- prior knowledge
- sparse learning
- machine learning
- high order
- unsupervised learning
- model selection
- image classification
- feature extraction