Tarantula - A Scalable and Extensible Web Spider.
Anshul Saxena, Keshav Dubey, Sanjay K. Dhurandher, Isaac Woungang
Published in: KMIS (2009)
Keyphrases
- web mining
- web scale
- web crawling
- website
- web applications
- web resources
- web pages
- web documents
- semantic web
- world wide
- web users
- database
- massive scale
- information sources
- web information retrieval
- data model
- linked data
- highly flexible
- data mining
- web information
- data extraction
- web content
- markup language
- web usage mining
- link analysis
- user experience
- object oriented
- data structure
- information retrieval
- neural network