Smart Bilingual Focused Crawling of Parallel Documents.
Cristian García-Romero, Miquel Esplà-Gomis, Felipe Sánchez-Martínez
Published in: CoRR (2024)
Keyphrases
- focused crawling
- web documents
- focused crawler
- topic specific
- text content
- semantic information
- web pages
- keywords
- machine translation
- web sources
- semi structured
- web mining
- web queries
- web data
- text documents
- cross lingual
- search tools
- machine learning
- anchor text
- relevant documents
- information retrieval systems
- information extraction
- information retrieval