The first step is the hardest: Pitfalls of Representing and Tokenizing Temporal Data for Large Language Models.
Dimitris Spathis, Fahim Kawsar
Published in: CoRR (2023)
Keyphrases
- language model
- temporal data
- language modeling
- temporal databases
- n-gram
- temporal information
- retrieval model
- document retrieval
- language modelling
- statistical language models
- query expansion
- spatial data
- probabilistic model
- speech recognition
- information retrieval
- test collection
- document ranking
- data streams
- query terms
- temporal patterns
- language models for information retrieval
- co-occurrence
- relevant documents
- relevance model
- association rule mining
- image classification