A Core Calculus for Documents: Or, Lambda: The Ultimate Document.
Will Crichton, Shriram Krishnamurthi
Published in: Proc. ACM Program. Lang. (2024)
Keyphrases
- document collections
- relevant documents
- text documents
- document clustering
- document classification
- web documents
- information retrieval
- information retrieval systems
- digital documents
- document representation
- document similarity
- electronic documents
- semi-structured documents
- document type
- document retrieval
- document content
- document processing
- document structure
- vector space model
- structured documents
- retrieval systems
- retrieved documents
- document summarization
- similar documents
- document ranking
- document repository
- document analysis
- multimedia documents
- index terms
- document set
- document-centric
- keywords
- document archives
- textual content
- term frequency
- unstructured documents
- user queries
- xml format
- related documents
- pdf documents
- keyword extraction
- scientific documents
- retrieval strategies
- latent semantic analysis
- xml documents
- pdf files
- latent topics
- text collections
- text categorization
- printed documents
- query terms
- document space
- maximal marginal relevance
- training documents
- metadata
- textual documents
- document level
- text mining
- ranked list
- logical structure
- digital libraries
- relevance feedback
- retrieval model
- document relevance
- handwritten documents
- multi-document summarization
- information extraction
- document images
- test collection
- query expansion
- text classification
- query-biased
- text classifiers
- co-occurrence
- automatic text classification
- cross references
- scanned documents
- topic hierarchy
- semantic information
- vector space
- inverted index
- relevant content
- tf-idf