Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings.
Naoyuki Kanda, Jian Wu, Yu Wu, Xiong Xiao, Zhong Meng, Xiaofei Wang, Yashesh Gaur, Zhuo Chen, Jinyu Li, Takuya Yoshioka
Published in: CoRR (2022)
Keyphrases
- automatic speech recognition
- speech recognition
- speaker verification
- hidden Markov models
- audio visual
- speaker recognition
- speaker identification
- noisy environments
- speech signal
- information retrieval
- vector space
- low dimensional
- data streams
- data sets
- high dimensional
- pattern recognition
- speech recognizer
- speaker diarization
- prosodic features
- speech sounds