Bitext Mining Using Distilled Sentence Representations for Low-Resource Languages.
Kevin Heffernan, Onur Çelebi, Holger Schwenk
Published in: CoRR (2022)
Keyphrases
- text summarization
- natural language
- target language
- resource allocation
- expressive power
- data mining
- frequent patterns
- language independent
- itemsets
- machine learning
- mining algorithm
- frequent itemsets
- data mining techniques
- knowledge discovery
- text mining
- cross-lingual
- intermediate representations
- Arabic language
- sentence level
- noun phrases
- resource constraints
- information extraction
- association rule mining
- co-occurrence
- pattern mining
- sequential patterns
- web mining
- data mining algorithms