RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning.

Junjie Ye, Yilong Wu, Songyang Gao, Caishuang Huang, Sixian Li, Guanyu Li, Xiaoran Fan, Qi Zhang, Tao Gui, Xuanjing Huang
Published in: CoRR (2024)