Language ID in the Context of Harvesting Language Data off the Web.
Fei Xia, William D. Lewis, Hoifung Poon
Published in: EACL (2009)
Keyphrases
- data sets
- database
- data processing
- language learning
- data collection
- data structure
- high quality
- data quality
- raw data
- information sources
- image data
- data points
- data analysis
- data extraction
- input data
- web pages
- high dimensional data
- synthetic data
- databases
- log data
- context dependent
- web data
- linked open data
- computer systems
- training data
- programming language
- knowledge discovery
- natural language