BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset.
Md. Istiak Hossain Shihab, Md Rakibul Hasan, Mahfuzur Rahman Emon, Syed Mobassir Hossen, Md. Nazmuddoha Ansary, Intesur Ahmed, Fazle Rabbi Rakib, Shahriar Elahi Dhruvo, Souhardya Saha Dip, Akib Hasan Pavel, Marsia Haque Meghla, Md. Rezwanul Haque, Sayma Sultana Chowdhury, Farig Sadeque, Tahsin Reasat, Ahmed Imtiaz Humayun, Asif Shahriyar Sushmit
Published in: ICDAR (1) (2023)
Keyphrases
- multi domain
- cross domain
- search computing
- document images
- domain specific
- spoken dialogue systems
- heterogeneous networks
- document collections
- text documents
- news corpus
- information retrieval
- information retrieval systems
- document clustering
- cross language
- real world
- data sets
- role based access control
- network structure
- e learning
- indian languages