PerPaDa: A Persian Paraphrase Dataset based on Implicit Crowdsourcing Data Collection.
Salar Mohtaj, Fatemeh Tavakkoli, Habibollah Asghari
Published in: CoRR (2022)
Keyphrases
- data collection
- benchmark datasets
- data analysis
- data sets
- information systems
- real life
- collected data
- training dataset
- text retrieval
- mechanical turk
- database
- crowd sourced
- collecting data
- synthetic datasets
- text classification
- sensor networks
- wireless sensor networks
- recommender systems
- feature space
- clustering algorithm