Text-to-Audio Grounding Based Novel Metric for Evaluating Audio Caption Similarity.
Swapnil Bhosale, Rupayan Chakraborty, Sunil Kumar Kopparapu
Published in: CoRR (2022)
Keyphrases
- text graphics
- multimedia
- cross media retrieval
- distance measure
- signal processing
- human language
- visual data
- audio visual
- audio stream
- visual information
- similarity metric
- similarity measure
- audio content
- audio signals
- text to speech
- information retrieval systems
- visual features
- multimedia information
- edit distance
- text extraction
- distance function
- news video
- metric space
- semantic context
- semantic information
- feature vectors
- information retrieval