Space characters in Chinese semi-structured texts.
Rongzhou Shen, Claire Grover, Ewan Klein
Published in: CIPS-SIGHAN (2010)
Keyphrases
- semi structured
- structured data
- information extraction
- data model
- web documents
- information integration
- data extraction
- chinese texts
- chinese characters
- wrapper generation
- structured knowledge
- semi structured data
- data collections
- free text
- web data sources
- xml databases
- text mining
- web data extraction
- html documents
- knowledge rich
- web sources
- web data
- content and structure
- named entities
- semi structured documents
- natural language
- knowledge representation