VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups.

Zejiang Shen, Kyle Lo, Lucy Lu Wang, Bailey Kuehl, Daniel S. Weld, Doug Downey
Published in: Trans. Assoc. Comput. Linguistics (2022)
Keyphrases
  • content extraction
  • web news
  • probability density function
  • visual features
  • text content
  • html documents
  • data mining
  • natural language processing