PPTC-R benchmark: Towards Evaluating the Robustness of Large Language Models for PowerPoint Task Completion.

Published in: CoRR (2024)

Keyphrases