Assisting Developers to Read Code Help-Documents Efficiently through Discovering Document-section Relationships.
Lijie Wang, Leye Wang, Ge Li, Bing Xie
Published in: SEKE (2010)
Keyphrases
- document collections
- relevant documents
- source code
- document classification
- information retrieval systems
- document clustering
- web documents
- related documents
- document retrieval
- information retrieval
- text documents
- document processing
- electronic documents
- digital documents
- document representation
- cross document
- semi structured documents
- document structure
- document content
- document analysis
- multimedia documents
- document repository
- document similarity
- structured documents
- keywords
- vector space model
- retrieval systems
- document summarization
- document type
- statistical topic models
- retrieved documents
- xml format
- document ranking
- similar documents
- textual content
- document set
- scientific documents
- linux kernel
- text classifiers
- query terms
- term frequency
- topic hierarchy
- retrieval model
- unstructured documents
- open source
- document archives
- user queries
- index terms
- textual documents
- test collection
- text categorization
- latent topics
- latent semantic
- training documents
- semantic relationships
- xml trees
- language model
- printed documents
- tf idf
- pdf files
- digital libraries
- inverted index
- semantic information
- latent semantic analysis
- retrieval strategies
- document images
- information extraction
- text mining
- search engine
- document relevance
- text classification
- topic models
- multi document summarization
- document space
- software development
- scanned documents
- query expansion
- document level
- semantic similarity