RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning.
Junjie Ye
Yilong Wu
Songyang Gao
Caishuang Huang
Sixian Li
Guanyu Li
Xiaoran Fan
Qi Zhang
Tao Gui
Xuanjing Huang
Published in:
CoRR (2024)
Keyphrases
language model
probabilistic model
speech recognition
test collection
n-gram
context sensitive
machine learning
information retrieval
decision trees
hidden markov models
query terms
language modeling
vector space model
document length
statistical language models