Evaluating BERT's Encoding of Intrinsic Semantic Features of OCR'd Digital Library Collections.
Ming Jiang, Yuerong Hu, Glen Worthey, Ryan C. Dubnicek, Ted Underwood, J. Stephen Downie
Published in: JCDL (2021)
Keyphrases
- semantic features
- semantic information
- visual features
- linguistic features
- wordnet
- structural features
- low level features
- text classification
- optical character recognition
- semantic similarity
- syntactic features
- document clustering
- parse selection
- feature set
- document images
- semantic models
- higher level
- knowledge base
- multiscale
- image processing