A Whisper transformer for audio captioning trained with synthetic captions and transfer learning.
Marek Kadlcík, Adam Hájek, Jürgen Kieslich, Radoslaw Winiecki
Published in: CoRR (2023)
Keyphrases
- transfer learning
- knowledge transfer
- learning tasks
- cross-domain
- labeled data
- active learning
- multi-task learning
- reinforcement learning
- visual information
- machine learning
- collaborative filtering
- multi-task
- multimedia
- visual features
- semi-supervised learning
- transfer knowledge
- audio-visual
- fuzzy logic
- manifold alignment
- transferring knowledge
- machine learning algorithms
- text classification
- training set
- learning algorithm
- domain adaptation
- text categorization
- target domain
- training data