Mining the Web to Create Minority Language Corpora.
Rayid Ghani, Rosie Jones, Dunja Mladenic
Published in: CIKM (2001)
Keyphrases
- web mining
- website
- clickstream data
- web logs
- unstructured information
- web applications
- natural language
- web usage
- text mining
- huge data
- mining algorithm
- natural language processing
- knowledge discovery
- parallel corpus
- log analysis
- web data
- specific domains
- sequential patterns
- web documents
- multilingual documents
- programming language
- web pages
- data mining
- pattern mining
- traversal patterns
- information sources
- data mining techniques
- web development
- linguistic resources
- web design
- social networks
- semantic web
- web resources
- linked data
- web content
- language learning
- keywords