BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents.
Teakgyu Hong, Donghyun Kim, Mingi Ji, Wonseok Hwang, Daehyun Nam, Sungrae Park
Published in: AAAI (2022)
Keyphrases
- language model
- information retrieval
- information extraction
- text documents
- pre-trained
- free text
- document level
- document retrieval
- web documents
- language modeling
- text mining
- textual data
- ad hoc information retrieval
- vector space model
- query terms
- n-gram
- text retrieval
- natural language text
- document ranking
- retrieval model
- speech recognition
- multiword
- query expansion
- information retrieval systems
- probabilistic model
- relevance model
- document collections
- natural language processing
- question answering
- test collection
- relevant documents
- word clouds
- keywords
- training data
- translation model
- document analysis
- named entities
- query specific
- textual content
- document set
- text summarization
- training examples
- machine learning
- machine translation
- pseudo-relevance feedback
- term frequency
- search engine
- tf-idf
- text classification
- smoothing methods
- control signals
- retrieval systems