Multi-modal Text Recognition Networks: Interactive Enhancements Between Visual and Semantic Features.
Byeonghu Na, Yoonsik Kim, Sungrae Park
Published in: ECCV (28) (2022)
Keyphrases
- multi modal
- semantic features
- text recognition
- visual features
- low level features
- visual information
- video search
- structural features
- semantic information
- image classification
- semantic concepts
- visual content
- optical character recognition
- viterbi algorithm
- text classification
- wordnet
- image annotation
- image retrieval
- document clustering
- high level
- hidden markov models
- domain knowledge
- high dimensional
- semantic similarity
- image search
- low level
- key frames
- bag of words
- computer vision
- active learning
- object recognition
- video sequences
- multiscale
- information extraction