Aggregating Frame-Level Information in the Spectral Domain With Self-Attention for Speaker Embedding.

Youzhi Tu, Man-Wai Mak
Published in: IEEE/ACM Trans. Audio Speech Lang. Process. (2022)
Keyphrases
  • spatial information
  • feature extraction
  • image processing