Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features.
Byeonghu Na, Yoonsik Kim, Sungrae Park
Published in: CoRR (2021)
Keyphrases
- multi-modal
- semantic features
- text recognition
- visual features
- semantic information
- visual information
- low-level features
- optical character recognition
- Viterbi algorithm
- video search
- low-level
- text classification
- hidden Markov models
- semantic concepts
- high-dimensional
- semantic similarity
- image annotation
- visual content
- structural features
- image retrieval
- image classification
- WordNet
- high-level
- information extraction
- image search
- document clustering
- information retrieval systems
- feature set
- key frames
- image processing
- image representation
- co-occurrence