Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset.
Peter Henderson, Mark S. Krass, Lucia Zheng, Neel Guha, Christopher D. Manning, Dan Jurafsky, Daniel E. Ho
Published in: CoRR (2022)
Keyphrases
- open source
- data sets
- original data
- learning process
- database
- prior knowledge
- learning algorithm
- data processing
- data collection
- input data
- data points
- active learning
- data analysis
- case law
- data quality
- high quality
- statistical analysis
- missing data
- training data
- test data
- sensor data
- learning systems
- synthetic data
- image data
- raw data
- high dimensional data
- online learning
- training dataset
- learned models
- legal reasoning
- source code
- artificial intelligence and law