Speaker-adapted neural-network-based fusion for multimodal reference resolution.
Diana Kleingarn, Nima Nabizadeh, Martin Heckmann, Dorothea Kolossa
Published in: SIGdial (2019)
Keyphrases
- reference resolution
- audio visual
- referring expressions
- anaphora resolution
- natural language processing
- relation extraction
- multimodal biometrics
- multi modal
- domain specific
- coreference resolution
- named entity recognition
- information extraction
- artificial intelligence
- natural language text
- visual information
- question answering
- dimensionality reduction
- general purpose
- knowledge base