Scaled Document Clustering and Word Cloud-Based Summarization on Hindi Corpus.
Prafulla B. Bafna, Jatinderkumar R. Saini
Published in: ICACIE (2) (2019)
Keyphrases
- document clustering
- statistical machine translation
- document understanding
- parallel corpus
- cross-lingual
- document corpus
- automatic summarization
- word pairs
- similar documents
- machine translation
- text mining
- sentence level
- tf-idf
- translation model
- document clusters
- clustering algorithm
- clustering method
- text documents
- document collections
- comparable corpora
- vector space model
- document representation
- noun phrases
- machine translation system
- multi-document summarization
- topic detection
- text corpora
- parallel corpora
- document similarity
- source language
- cosine similarity
- language independent
- k-means
- co-occurrence
- information extraction
- cluster analysis
- named entity recognition
- keywords
- semantic relations
- machine learning
- natural language processing
- information retrieval systems
- document level
- n-gram
- target language