Listen as you wish: Fusion of audio and text for cross-modal event detection in smart cities.

Published in: Information Fusion (2024)

Keyphrases