WikiWeb2M: A Page-Level Multimodal Wikipedia Dataset.
Andrea Burns
Krishna Srinivasan
Joshua Ainslie
Geoff Brown
Bryan A. Plummer
Kate Saenko
Jianmo Ni
Mandy Guo
Published in:
CoRR (2023)
Keyphrases
website
link structure
web pages
levels of abstraction
knowledge base
benchmark datasets
semantic information
named entities
anchor text
Wikipedia pages
neural network
machine learning
text classification
multimodal
document collections
entity ranking