Extending Spark Analytics through Tika-Based Information Extraction and Retrieval.
Rishi Verma, Chris Mattmann
Published in: IRI (2015)
Keyphrases
- information extraction
- information retrieval
- natural language processing
- image retrieval
- image database
- machine learning
- named entity recognition
- multimedia databases
- precision and recall
- structured data
- question answering
- web documents
- query expansion
- data mining
- text mining
- relevance feedback
- open domain
- text retrieval
- free text
- relational learning
- structured documents
- retrieval accuracy
- data management
- retrieval model
- semi structured
- web mining
- big data
- word sense disambiguation
- named entities
- conditional random fields
- business intelligence
- content based retrieval
- text summarization
- wordnet
- cloud computing
- information retrieval systems