Reinforcement Learning with Human Feedback: Learning Dynamic Choices via Pessimism.
Zihao Li
Zhuoran Yang
Mengdi Wang
Published in:
CoRR (2023)
Keyphrases
reinforcement learning
learning process
learning algorithm
learning systems
learning analytics
optimal control
neural network
machine learning
online learning
dynamic environments
learning problems
partially observable
robot control