Beyond Human Preferences: Exploring Reinforcement Learning Trajectory Evaluation and Improvement through LLMs.
Zichao Shen
Tianchen Zhu
Qingyun Sun
Shiqi Gao
Jianxin Li
Published in:
CoRR (2024)
Keyphrases
reinforcement learning
decision making
significant improvement
dynamic programming
markov decision processes
neural network
multi agent
human subjects
evaluation criteria
utility function
evaluation metrics
evaluation model