ASiT: Audio Spectrogram vIsion Transformer for General Audio Representation.
Sara Atito, Muhammad Awais, Wenwu Wang, Mark D. Plumbley, Josef Kittler
Published in: CoRR (2022)
Keyphrases
- multimedia
- visual information
- audio video
- audio visual
- special case
- audio stream
- image processing
- audio signals
- digital video
- closely related
- cepstral features
- wigner distribution
- music score
- cross modal
- vision system
- signal processing
- neural network
- data sets
- image representation
- visual data
- representation scheme
- fuzzy logic
- audio signal
- image sequences
- feature selection
- digital audio
- artificial intelligence
- real time