CCpdf: Building a High Quality Corpus for Visually Rich Documents from Web Crawl Data.

Published in: CoRR (2023)

Keyphrases