Information Extraction from HTML Product Catalogues: From Source Code and Images to RDF.
Martin Labský, Vojtech Svátek, Ondrej Sváb, Pavel Praks, Michal Krátký, Václav Snásel
Published in: Web Intelligence (2005)
Keyphrases
- source code
- information extraction
- open source
- open source software
- source files
- image data
- software systems
- software maintenance
- software engineers
- software projects
- free software
- natural language processing
- information retrieval
- mining software repositories
- software engineering
- data model
- maintenance activities
- execution traces
- machine learning
- raw data
- static analysis
- high level
- software evolution
- plagiarism detection
- impact analysis
- case study
- manual inspection
- website
- source code metrics