LEDGAR: A Large-Scale Multi-label Corpus for Text Classification of Legal Provisions in Contracts.
Don Tuggener, Pius von Däniken, Thomas Peetz, Mark Cieliebak
Published in: LREC (2020)
Keyphrases
- multi label
- text classification
- text categorization
- text data
- multi label classification
- image annotation
- image classification
- binary classification
- classifier training
- multi instance
- machine learning
- multi label learning
- text mining
- text documents
- feature selection
- naive bayes
- bag of words
- graph cuts
- hierarchical text categorization
- protein function prediction
- multiple labels
- class labels
- multi modal
- image processing
- label assignment