Panda LLM: Training Data and Evaluation for Open-Sourced Chinese Instruction-Following Large Language Models.
Fangkai Jiao, Bosheng Ding, Tianze Luo, Zhanfeng Mo
Published in: CoRR (2023)
Keyphrases
- language model
- training data
- language modeling
- n gram
- information retrieval
- language modelling
- test collection
- document retrieval
- probabilistic model
- retrieval model
- statistical language models
- speech recognition
- decision trees
- ad hoc information retrieval
- language models for information retrieval
- document ranking
- word segmentation
- relevance model
- smoothing methods
- context sensitive
- query terms
- training set
- learning algorithm
- query expansion
- multimedia
- relevance assessments
- vector space model
- term dependencies
- evaluation metrics
- active learning
- okapi bm
- search engine
- spoken term detection