Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation.

Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-weon Jung, François G. Germain, Jonathan Le Roux, Shinji Watanabe
Published in: CoRR (2023)