DyVal: Dynamic Evaluation of Large Language Models for Reasoning Tasks.
Kaijie Zhu, Jiaao Chen, Jindong Wang, Neil Zhenqiang Gong, Diyi Yang, Xing Xie
Published in: ICLR (2024)
Keyphrases
- language model
- reasoning tasks
- language modeling
- n gram
- description logics
- information retrieval
- context sensitive
- automated reasoning
- speech recognition
- language modelling
- document retrieval
- retrieval model
- logic programming
- probabilistic model
- test collection
- query expansion
- statistical language models
- temporal reasoning
- situation calculus
- answer set programming
- query terms
- smoothing methods
- language models for information retrieval
- artificial intelligence
- query specific
- expert systems
- word error rate
- high level
- okapi bm