Vectorisation, Okapi et calcul de similarité pour le TAL : pour oublier enfin le TF-IDF (Vectorization, Okapi and Computing Similarity for NLP: Say Goodbye to TF-IDF) [in French].
Vincent Claveau
Published in: JEP-TALN-RECITAL (2012)
Keyphrases
- tf-idf
- cosine similarity
- weighting scheme
- information retrieval
- text documents
- text categorization
- retrieval model
- term frequency
- vector space model
- document clustering
- term weighting
- text mining
- natural language processing
- ranking algorithm
- inverse document frequency
- information extraction
- divergence from randomness
- similarity measure
- text retrieval
- language modelling
- semantic similarity
- question answering
- distance measure
- information retrieval systems
- similarity function
- text classification
- semi-supervised
- knowledge representation
- IR models
- knowledge base