Multimodal Speaker Segmentation and Diarization Using Lexical and Acoustic Cues via Sequence to Sequence Neural Networks.

Tae Jin Park, Panayiotis G. Georgiou
Published in: INTERSPEECH (2018)
Keyphrases
  • neural network
  • fuzzy logic
  • medical images
  • level set
  • video sequences
  • artificial neural networks
  • markov random field
  • prosodic features