A Systematic Evaluation of Large Language Models on Out-of-Distribution Logical Reasoning Tasks.
Qiming Bao, Gaël Gendron, Alex Yuxuan Peng, Wanjun Zhong, Neset Tan, Yang Chen, Michael Witbrock, Jiamou Liu
Published in: CoRR (2023)
Keyphrases
- language model
- reasoning tasks
- systematic evaluation
- language modeling
- description logics
- logic programming
- n-gram
- query expansion
- probabilistic model
- temporal reasoning
- automated reasoning
- automatic query expansion
- answer set programming
- comprehensive evaluation
- retrieval model
- biomedical text
- document retrieval
- context sensitive
- information retrieval
- pseudo relevance feedback
- experimental evaluation
- test collection
- situation calculus
- logic programs
- query terms
- relevance model
- probability distribution
- image classification
- artificial intelligence