American Stories: A Large-Scale Structured Text Dataset of Historical U.S. Newspapers.
Melissa Dell, Jacob Carlson, Tom Bryan, Emily Silcock, Abhishek Arora, Zejiang Shen, Luca D'Amico-Wong, Quan Le, Pablo Querubin, Leander Heldring
Published in: NeurIPS (2023)
Keyphrases
- news articles
- text documents
- news stories
- narrative structure
- real world
- unstructured text
- information retrieval
- database
- real life
- structured data
- small scale
- data sets
- historical manuscripts
- textual data
- text data
- free text
- united states
- web pages
- text retrieval
- benchmark datasets
- topic tracking
- machine learning
- historical data
- training dataset
- web documents
- web scale
- text information
- image classification