Evaluating Large Language Models with Grid-Based Game Competitions: An Extensible LLM Benchmark and Leaderboard.

Oguzhan Topsakal, Colby Jacob Edell, Jackson Bailey Harper
Published in: CoRR (2024)