AudioTime: A Temporally-aligned Audio-text Benchmark Dataset.

Zeyu Xie, Xuenan Xu, Zhizheng Wu, Mengyue Wu

Published in: CoRR (2024)

Keyphrases

benchmark datasets
text graphics
multimedia
spatio-temporal
audio content
text retrieval
human language
audio visual
text mining
keywords
temporal information
cross-media retrieval
spoken documents
signal processing
free text
web documents
database
visual information
document analysis
multimedia information
cross-modal
text-to-speech
spontaneous speech
natural language processing
web pages