Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation.
Siyuan Wang
Zhuohan Long
Zhihao Fan
Zhongyu Wei
Xuanjing Huang
Published in:
CoRR (2024)
Keyphrases
dynamic environments
databases
real world
real time
data mining
machine learning
artificial intelligence
evaluation process
continuously changing
dynamically changing
evaluation model
evaluation methods
evaluation method
empirical evaluation
medical images
wide range
computer vision