Who Said That?: Audio-Visual Speaker Diarisation of Real-World Meetings.
Joon Son Chung, Bong-Jin Lee, Icksang Han
Published in: INTERSPEECH (2019)
Keyphrases
- audio visual
- multi modal
- meeting room
- visual information
- speaker verification
- multi stream
- person authentication
- multimedia
- emotion recognition
- visual data
- data mining
- temporal context
- audio visual speech recognition
- data sets
- activity recognition
- data processing
- audio features
- text mining
- high level
- feature selection